plotCurves {rockchalk} | R Documentation |
Assists creation of predicted value curves for regression models.
Description
Creates a predicted value plot that includes a separate predicted
value line for each value of a focal variable. The x axis variable
is specified by the plotx
argument. As of rockchalk 1.7.x,
the moderator argument, modx, is optional. Think of this a new
version of R's termplot
, but it allows for
interactions. And it handles some nonlinear transformations more
gracefully than termplot.
Usage
plotCurves(
model,
plotx,
nx = 40,
modx,
plotxRange = NULL,
n,
modxVals = NULL,
interval = c("none", "confidence", "prediction"),
plotPoints = TRUE,
plotLegend = TRUE,
legendTitle = NULL,
legendPct = TRUE,
col = c("black", "blue", "darkgreen", "red", "orange", "purple", "green3"),
llwd = 2,
opacity = 100,
envir = environment(formula(model)),
...
)
Arguments
model |
Required. Fitted regression object. Must have a predict method |
plotx |
Required. String with name of predictor for the x axis |
nx |
Number of values of plotx at which to calculate the predicted value. Default = 40. |
modx |
Optional. String for moderator variable name. May be either numeric or factor. |
plotxRange |
Optional. If not specified, the observed range of plotx will be used to determine the axis range. |
n |
Optional. Number of focal values of |
modxVals |
Optional. A vector of focal values for which
predicted values are to be plotted. May also be a character string
to select an algorithm ("quantile","std.dev." or "table"), or a
user-supplied function to select focal values (a new method
similar to |
interval |
Optional. Intervals provided by the
|
plotPoints |
Optional. TRUE or FALSE: Should the plot include the scatterplot points along with the lines. |
plotLegend |
Optional. TRUE or FALSE: Should the default legend be included? |
legendTitle |
Optional. You'll get an automatically generated title, such as "Moderator: modx", but if you don't like that, specify your own string here. |
legendPct |
Default = TRUE. Variable labels print with sample percentages. |
col |
I offer my preferred color vector as default.
Replace if you like. User may supply a vector of valid
color names, or |
llwd |
Optional. Line widths for predicted values. Can be single value or a vector, which will be recycled as necessary. |
opacity |
Optional, default = 100. A number between 1 and 255. 1 means "transparent" or invisible, 255 means very dark. the darkness of confidence interval regions |
envir |
environment to search for variables. |
... |
further arguments that are passed to plot or predict. The arguments that are monitored to be sent to predict are c("type", "se.fit", "dispersion", "interval", "level", "terms", "na.action"). |
Details
This is similar to plotSlopes
, but it accepts regressions
in which there are transformed variables, such as "log(x1)".
It creates a plot of the predicted dependent
variable against one of the numeric predictors, plotx
. It
draws a predicted value line for each value of modx
, a
moderator variable. The moderator may be a numeric or categorical
moderator variable.
The user may designate which particular values of the moderator
are used for calculating the predicted value lines. That is,
modxVals = c( 12,22,37)
would draw lines for values 12, 22,
and 37 of the moderator. User may instead supply a character
string to choose one of the built in algorithms. The default
algorithm is "quantile", which will select n
values that
are evenly spaced along the modx
axis. The algorithm
"std.dev" will select the mean of modx
(m) and then it will
select values that step away from the mean in standard deviation
sd units. For example, if n = 3
, the focal
values will m, m - sd, am + sd
.
Value
A plot is created as a side effect, a list is returned including 1) the call, 2) a newdata object that includes information on the curves that were plotted, 3) a vector modxVals, the values for which curves were drawn.
Author(s)
Paul E. Johnson pauljohn@ku.edu
Examples
library(rockchalk)
## Replicate some R classics. The budworm.lg data from predict.glm
## will work properly after re-formatting the information as a data.frame:
## example from Venables and Ripley (2002, pp. 190-2.)
df <- data.frame(ldose = rep(0:5, 2),
sex = factor(rep(c("M", "F"), c(6, 6))),
SF.numdead = c(1, 4, 9, 13, 18, 20, 0, 2, 6, 10, 12, 16))
df$SF.numalive = 20 - df$SF.numdead
budworm.lg <- glm(cbind(SF.numdead, SF.numalive) ~ sex*ldose, data = df,
family = binomial)
plotCurves(budworm.lg, plotx = "ldose", modx = "sex", interval = "confidence",
ylim = c(0, 1))
## See infert
model2 <- glm(case ~ age + parity + education + spontaneous + induced,
data = infert, family = binomial())
## Curvature so slight we can barely see it
model2pc1 <- plotCurves(model2, plotx = "age", modx = "education",
interval = "confidence", ylim = c(0, 1))
model2pc2 <- plotCurves(model2, plotx = "age", modx = "education",
modxVals = levels(infert$education)[1],
interval = "confidence", ylim = c(0, 1))
model2pc2 <- plotCurves(model2, plotx = "age", modx = "education",
modxVals = levels(infert$education)[c(2,3)],
interval = "confidence", ylim = c(0, 1))
model2pc2 <- plotCurves(model2, plotx = "age", modx = "education",
modxVals = levels(infert$education)[c(2,3)],
ylim = c(0, 1), type = "response")
## Manufacture some data
set.seed(12345)
N <- 500
dat <- genCorrelatedData2(N = 500, means = c(5, 0, 0, 0), sds = rep(1, 4),
rho = 0.2, beta = rep(1, 5), stde = 20)
dat$xcat1 <- gl(2, N/2, labels = c("Monster", "Human"))
dat$xcat2 <- cut(rnorm(N), breaks = c(-Inf, 0, 0.4, 0.9, 1, Inf),
labels = c("R", "M", "D", "P", "G"))
###The design matrix for categorical variables, xcat numeric
dat$xcat1n <- with(dat, contrasts(xcat1)[xcat1, ])
dat$xcat2n <- with(dat, contrasts(xcat2)[xcat2, ])
stde <- 2
dat$y <- with(dat, 0.03 + 11.5 * log(x1) * contrasts(dat$xcat1)[dat$xcat1] +
0.1 * x2 + 0.04 * x2^2 + stde*rnorm(N))
stde <- 1
dat$y2 <- with(dat, 0.03 + 0.1 * x1 + 0.1 * x2 + 0.25 * x1 * x2 + 0.4 * x3 -
0.1 * x4 + stde * rnorm(N))
stde <- 8
dat$y3 <- with(dat, 3 + 0.5 * x1 + 1.2 * (as.numeric(xcat1)-1) +
-0.8 * (as.numeric(xcat1)-1) * x1 + stde * rnorm(N))
stde <- 8
dat$y4 <- with(dat, 3 + 0.5 * x1 +
contrasts(dat$xcat2)[dat$xcat2, ] %*% c(0.1, -0.2, 0.3, 0.05) +
stde * rnorm(N))
## Curvature with interaction
m1 <- lm(y ~ log(x1)*xcat1 + x2 + I(x2^2), data=dat)
summary(m1)
## First, with no moderator
plotCurves(m1, plotx = "x1")
plotCurves(m1, plotx = "x1", modx = "xcat1")
## ## Verify that plot by comparing against a manually contructed alternative
## par(mfrow=c(1,2))
## plotCurves(m1, plotx = "x1", modx = "xcat1")
## newdat <- with(dat, expand.grid(x1 = plotSeq(x1, 30), xcat1 = levels(xcat1)))
## newdat$x2 <- with(dat, mean(x2, na.rm = TRUE))
## newdat$m1p <- predict(m1, newdata = newdat)
## plot( y ~ x1, data = dat, type = "n", ylim = magRange(dat$y, c(1, 1.2)))
## points( y ~ x1, data = dat, col = dat$xcat1, cex = 0.4, lwd = 0.5)
## by(newdat, newdat$xcat1, function(dd) {lines(dd$x1, dd$m1p)})
## legend("topleft", legend=levels(dat$xcat1), col = as.numeric(dat$xcat1), lty = 1)
## par(mfrow = c(1,1))
## ##Close enough!
plotCurves(m1, plotx = "x2", modx = "x1")
plotCurves(m1, plotx = "x2", modx = "xcat1")
plotCurves(m1, plotx = "x2", modx = "xcat1", interval = "conf")
m2 <- lm(y ~ log(x1)*xcat1 + xcat1*(x2 + I(x2^2)), data = dat)
summary(m2)
plotCurves(m2, plotx = "x2", modx = "xcat1")
plotCurves(m2, plotx ="x2", modx = "x1")
m3a <- lm(y ~ poly(x2, 2) + xcat1, data = dat)
plotCurves(m3a, plotx = "x2")
plotCurves(m3a, plotx = "x2", modx = "xcat1")
#OK
m4 <- lm(log(y+10) ~ poly(x2, 2)*xcat1 + x1, data = dat)
summary(m4)
plotCurves(m4, plotx = "x2")
plotCurves(m4, plotx ="x2", modx = "xcat1")
plotCurves(m4, plotx = "x2", modx = "x1")
plotCurves(m4, plotx = "x2", modx = "xcat1")
plotCurves(m4, plotx = "x2", modx = "xcat1", modxVals = c("Monster"))
##ordinary interaction
m5 <- lm(y2 ~ x1*x2 + x3 +x4, data = dat)
summary(m5)
plotCurves(m5, plotx = "x1", modx = "x2")
plotCurves(m5, plotx = "x1", modx = "x2", modxVals = c( -2, -1, 0, 1, 2))
plotCurves(m5, plotx = "x1", modx = "x2", modxVals = c(-2))
plotCurves(m5, plotx = "x1", modx = "x2", modxVals = "std.dev.")
plotCurves(m5, plotx = "x1", modx = "x2", modxVals = "quantile")
plotCurves(m5, plotx = "x3", modx = "x2")
if(require(carData)){
mc1 <- lm(statusquo ~ income * sex, data = Chile)
summary(mc1)
plotCurves(mc1, plotx = "income")
plotCurves(mc1, modx = "sex", plotx = "income")
plotCurves(mc1, modx = "sex", plotx = "income", modxVals = "M")
mc2 <- lm(statusquo ~ region * income, data = Chile)
summary(mc2)
plotCurves(mc2, modx = "region", plotx = "income")
plotCurves(mc2, modx = "region", plotx = "income",
modxVals = levels(Chile$region)[c(1,4)])
plotCurves(mc2, modx = "region", plotx = "income", modxVals = c("S","M","SA"))
plotCurves(mc2, modx = "region", plotx = "income", modxVals = c("S","M","SA"),
interval = "conf")
plotCurves(mc2, modx = "region", plotx = "income", plotPoints = FALSE)
mc3 <- lm(statusquo ~ region * income + sex + age, data = Chile)
summary(mc3)
plotCurves(mc3, modx = "region", plotx = "income")
mc4 <- lm(statusquo ~ income * (age + I(age^2)) + education + sex + age, data = Chile)
summary(mc4)
plotCurves(mc4, plotx = "age")
plotCurves(mc4, plotx = "age", interval = "conf")
plotCurves(mc4, plotx = "age", modx = "income")
plotCurves(mc4, plotx = "age", modx = "income", plotPoints = FALSE)
plotCurves(mc4, plotx = "income", modx = "age")
plotCurves(mc4, plotx = "income", modx = "age", n = 8)
plotCurves(mc4, plotx = "income", modx = "age", modxVals = "std.dev.")
plotCurves(mc4, modx = "income", plotx = "age", plotPoints = FALSE)
}