tvcm {vcrpart} | R Documentation |
Tree-based varying coefficient regression models
Description
tvcm is the general implementation for tree-based varying coefficient regression. It may be used to combine the two different algorithms tvcolmm and tvcglm.
Usage
tvcm(formula, data, fit, family,
     weights, subset, offset, na.action = na.omit,
     control = tvcm_control(), fitargs, ...)
Arguments
formula |
a symbolic description of the model to fit, e.g.,
where |
fit |
a character string or a function that specifies the fitting
function, e.g., |
family |
the model family, e.g., an object of class
|
data |
a data frame containing the variables in the model. |
weights |
an optional numeric vector of weights to be used in the fitting process. |
subset |
an optional logical or integer vector specifying a
subset of |
offset |
this can be used to specify an a priori known component to be included in the linear predictor during fitting. |
na.action |
a function that indicates what should happen if data
contain |
control |
a list with control parameters as returned by
|
fitargs |
additional arguments passed to the fitting function
|
... |
additional arguments passed to the fitting function
|
Details
TVCM partitioning works as follows: In each iteration we fit the current model and select a binary split for one of the current terminal nodes. The selection requires four decisions: the vc term, the node, the variable and the cutpoint in the selected variable. The algorithm starts with M_k = 1 node for each of the K vc terms and iterates until the criteria defined by control are reached, see tvcm_control. For the specific criteria for the split selection, see tvcolmm and tvcglm.
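The four decisions of a single iteration can be sketched in base R. This is only an illustration under simplifying assumptions, not the criterion used by tvcm: it scores candidate splits by the reduction in the residual sum of squares of a linear model, uses decile cutpoints, and runs on simulated data. The names by, node, design, rss and best are hypothetical helpers introduced for this sketch. For the split criteria actually used, see tvcolmm and tvcglm.

```r
## Simulated data: the slope of x changes with moderator z1 (assumption)
set.seed(1)
n  <- 400
z1 <- runif(n)                        # moderator that truly matters
z2 <- runif(n)                        # noise moderator
x  <- rnorm(n)
y  <- ifelse(z1 > 0.5, 2, 0) * x + rnorm(n, sd = 0.5)

Z    <- data.frame(z1 = z1, z2 = z2)               # candidate split variables
by   <- list(intercept = rep(1, n), x = x)         # K = 2 vc terms
node <- list(intercept = rep(1L, n), x = rep(1L, n))  # M_k = 1 node per term

## Design matrix: one coefficient per (vc term, node) pair
design <- function(node) {
  blocks <- lapply(names(by), function(k) {
    ids <- sort(unique(node[[k]]))
    outer(node[[k]], ids, "==") * by[[k]]
  })
  do.call(cbind, blocks)
}

## Stand-in loss: residual sum of squares of the current model
rss <- function(node) sum(lm.fit(design(node), y)$residuals^2)

## Search over vc term, node, variable and cutpoint
rss0 <- rss(node)
best <- list(gain = 0)
for (k in names(node)) {
  for (m in unique(node[[k]])) {
    in_node <- node[[k]] == m
    for (v in names(Z)) {
      cuts <- quantile(Z[[v]][in_node], probs = seq(0.1, 0.9, by = 0.1))
      for (cp in cuts) {
        cand <- node
        cand[[k]][in_node & Z[[v]] > cp] <- max(node[[k]]) + 1L
        gain <- rss0 - rss(cand)
        if (gain > best$gain)
          best <- list(term = k, node = m, variable = v,
                       cutpoint = cp, gain = gain)
      }
    }
  }
}
str(best)
```

In a full run the selected split would be applied to node and the search repeated until a stopping criterion is met. The sketch stops after one iteration to keep the four nested decisions visible.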
Alternative tree-based algorithms to tvcm are the MOB (Zeileis et al., 2008) and the PartReg (Wang and Hastie, 2014) algorithms. The MOB algorithm is implemented by the mob function in the packages party and partykit. For smoothing splines and kernel regression approaches to varying coefficients, see the packages mgcv, svcm, mboost or np.
The tvcm function builds on the software infrastructure of the partykit package. The authors are grateful for this code.
Value
An object of class tvcm. The tvcm class itself is based on the party class of the partykit package. The most important slots are:
node |
an object of class |
data |
a |
fitted |
an optional |
info |
additional information including |
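The slots listed above can be inspected on a fitted object. The following is a minimal sketch that assumes the vcrpart package is installed and reuses the poverty model from the Examples section of this page. It assumes the slots are reachable with the $ operator, as for ordinary list components. The object name model is chosen here for illustration.

```r
library(vcrpart)

data(poverty)
poverty$EduHigh <- 1 * (poverty$Edu == "high")

## Same call as in Example 1 of this page, without verbose output
model <- tvcm(Poor ~ -1 + vc(CivStat) + vc(CivStat, by = EduHigh) + NChild,
              family = binomial(), data = poverty, subset = 1:200,
              control = tvcm_control(papply = "lapply",
                folds = folds_control(K = 1, type = "subsampling", seed = 7)))

## Class hierarchy: 'tvcm' is based on 'party'
inherits(model, "tvcm")
inherits(model, "party")

## Inspect the slots named in the Value section
class(model$node)
class(model$data)
names(model$fitted)
names(model$info)
```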
Author(s)
Reto Burgin
References
Zeileis, A., T. Hothorn, and K. Hornik (2008). Model-Based Recursive Partitioning. Journal of Computational and Graphical Statistics, 17(2), 492–514.
Wang, J. C. and T. Hastie (2014). Boosted Varying-Coefficient Regression Models for Product Demand Prediction. Journal of Computational and Graphical Statistics, 23(2), 361–382.
Hothorn, T. and A. Zeileis (2014). partykit: A Modular Toolkit for Recursive Partytioning in R. In Working Papers in Economics and Statistics, Research Platform Empirical and Experimental Economics, Number 2014-10. Universitaet Innsbruck.
Burgin, R. and G. Ritschard (2015). Tree-Based Varying Coefficient Regression for Longitudinal Ordinal Responses. Computational Statistics & Data Analysis, 86, 65–80.
Burgin, R. A. (2015b). Tree-based methods for moderated regression with application to longitudinal data. PhD thesis. University of Geneva.
Burgin, R. and G. Ritschard (2017). Coefficient-Wise Tree-Based Varying Coefficient Regression with vcrpart. Journal of Statistical Software, 80(6), 1–33.
See Also
tvcolmm, tvcglm, tvcm_control, tvcm-methods, tvcm-plot, tvcm-assessment
Examples
## ------------------------------------------------------------------- #
## Example 1: Moderated effect of education on poverty
##
## See the help of 'tvcglm'.
## ------------------------------------------------------------------- #
data(poverty)
poverty$EduHigh <- 1 * (poverty$Edu == "high")
## fit the model
model.Pov <-
  tvcm(Poor ~ -1 + vc(CivStat) + vc(CivStat, by = EduHigh) + NChild,
       family = binomial(), data = poverty, subset = 1:200,
       control = tvcm_control(verbose = TRUE, papply = "lapply",
         folds = folds_control(K = 1, type = "subsampling", seed = 7)))
## diagnosis
plot(model.Pov, "cv")
plot(model.Pov, "coef")
summary(model.Pov)
splitpath(model.Pov, steps = 1:3)
prunepath(model.Pov, steps = 1)
## ------------------------------------------------------------------- #
## Example 2: Moderated effect of unemployment
##
## See the help of 'tvcolmm'.
## ------------------------------------------------------------------- #
data(unemp)
## fit the model
model.UE <-
  tvcm(GHQL ~ -1 +
         vc(AGE, FISIT, GENDER, UEREGION, by = UNEMP, intercept = TRUE) +
         re(1|PID),
       data = unemp, control = tvcm_control(sctest = TRUE),
       family = cumulative())
## diagnosis (no cross-validation was performed since 'sctest = TRUE')
plot(model.UE, "coef")
summary(model.UE)
splitpath(model.UE, steps = 1, details = TRUE)