TSVC {TSVC} | R Documentation |
Tree-Structured Modelling of Varying Coefficients
Description
A function to fit tree-structured varying coefficient (TSVC) models. By recursive splitting, the method simultaneously detects covariates with varying coefficients and the effect modifiers that induce them, if any are present. The basic method is described in Berger, Tutz and Schmid (2019).
Usage
TSVC(
formula,
data,
family = gaussian,
alpha = 0.05,
nperm = 1000,
nodesize_min = 5,
bucket_min = 1,
depth_max = NULL,
splits_max = NULL,
perm_test = TRUE,
test_linear = TRUE,
effmod = NULL,
notmod = NULL,
only_effmod = NULL,
smooth = NULL,
split_intercept = FALSE,
sb_slope = NULL,
trace = FALSE,
...
)
## S3 method for class 'TSVC'
print(x, ...)
Arguments
formula |
object of class |
data |
data frame of class |
family |
a description of the error distribution and link function to be used in the model (as for |
alpha |
significance level |
nperm |
number of permutations used for the permutation tests. |
nodesize_min |
minimum number of observations that must exist in a node in order for a split to be attempted. |
bucket_min |
the minimum number of observations in any terminal node. |
depth_max |
maximum depth of any node in each tree, with the root node counted as depth 0. If |
splits_max |
maximum number of splits performed. If |
perm_test |
if |
test_linear |
should linear effects that were not modified during iteration be tested for significance? |
effmod |
optional vector of covariates that serve as effect modifier. If |
notmod |
optional list of class |
only_effmod |
optional vector of covariates that serve as effect modifier, only. If |
smooth |
optional vector of covariates with a smooth effect on the response. The (smooth) effects of these variables are not allowed to be modified. |
split_intercept |
if |
sb_slope |
optional vector of covariates that are allowed to be modified by themselves. Such an effect corresponds to a structural break in the slope. |
trace |
if |
... |
further arguments passed to or from other methods. |
x |
object of class |
Details
A typical formula has the form response ~ covariates, where response is the name of the response variable and covariates is a series of variables that are incorporated in the model.
With p covariates, TSVC expects a formula of the form y ~ x_1+...+x_p. If no further specifications are made (effmod=NULL, notmod=NULL, only_effmod=NULL) it is assumed that each covariate x_j, j = {1,...,p} can be modified by all the other variables x_m, m = {1,...,p} \ j.
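To make the idea of a covariate being "modified" by another variable concrete, the following is a minimal sketch in base R. It is not the TSVC implementation: it simulates data in which the coefficient of x1 changes at a known value of x2 and then fits a single, hand-picked split. The variable names, the simulated data and the value of threshold are assumptions made purely for illustration.

```r
# Illustrative sketch only; this is not how TSVC() searches for splits.
set.seed(1)
n  <- 500
x1 <- rnorm(n)
x2 <- runif(n)

# Simulated truth: the effect of x1 is 1 if x2 <= 0.5 and 3 if x2 > 0.5,
# so x2 acts as an effect modifier of x1.
y <- ifelse(x2 <= 0.5, 1, 3) * x1 + rnorm(n, sd = 0.5)

# One candidate split of the coefficient of x1 in x2
threshold <- 0.5                      # hypothetical split point
left  <- as.numeric(x2 <= threshold)  # indicator of the left node
right <- 1 - left                     # indicator of the right node

# Separate slope of x1 in each node, common (non-varying) intercept
fit_split <- lm(y ~ I(x1 * left) + I(x1 * right))
beta <- unname(coef(fit_split))[2:3]  # estimated slopes: left node, right node
print(beta)
```

In a fitted TSVC model the split variable and split point are selected by the algorithm rather than fixed in advance as they are here.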
Remark: Significance of each split is verified by permutation tests. The result of the permutation tests can strongly depend on the number of permutations nperm.
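One reason the number of permutations matters can be seen with a generic permutation test. The sketch below uses base R only and an absolute correlation as test statistic, which is an assumption for illustration and not the statistic used by TSVC. The helper perm_pvalue is hypothetical. With this common p-value definition, the smallest attainable value is 1/(nperm + 1), and repeated runs with few permutations tend to scatter more than runs with many.

```r
# Generic permutation test; illustrative only, not the test used inside TSVC().
perm_pvalue <- function(x, y, nperm) {
  observed <- abs(cor(x, y))
  # Recompute the statistic after breaking the link between x and y
  permuted <- replicate(nperm, abs(cor(x, sample(y))))
  (1 + sum(permuted >= observed)) / (nperm + 1)
}

set.seed(42)
x <- rnorm(60)
y <- 0.3 * x + rnorm(60)

# Repeat the same test 20 times for a small and a large number of permutations
p_small <- replicate(20, perm_pvalue(x, y, nperm = 50))
p_large <- replicate(20, perm_pvalue(x, y, nperm = 1000))

range(p_small)  # spread of p-values with 50 permutations
range(p_large)  # spread of p-values with 1000 permutations
```

If a p-value lies close to alpha, comparing the two ranges gives a rough impression of how much a decision could change between runs.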
Note: The algorithm currently does not support splitting of/by factor variables. If a factor variable is included in the formula of the model, the variable will not serve as effect modifier and its effect will not be modified.
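The Examples section recodes two-level factors to numeric 0/1 variables before fitting. A minimal sketch of that recoding on a small made-up data frame is shown below. The data frame d and the column foreign_num are hypothetical, and whether a recoded variable is a sensible effect modifier is left to the analyst.

```r
# Toy data frame with a two-level factor (levels sorted alphabetically: "no", "yes")
d <- data.frame(foreign = factor(c("no", "yes", "yes", "no")))

# as.numeric() returns the internal level codes 1 and 2; subtracting 1 gives 0/1
d$foreign_num <- as.numeric(d$foreign) - 1

str(d)
```

This only works as intended for factors with two levels. A factor with more levels would be mapped to arbitrary integer codes.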
Value
Object of class "TSVC". An object of class "TSVC" is a list containing the following components:
splits |
matrix with detailed information about all executed splits during the fitting process. |
coefficients |
list of estimated coefficients for covariates with and without varying coefficients (including a non-varying intercept). |
pvalues |
p-values of each permutation test during the fitting process. |
pvalues_linear |
p-values of the permutation tests on the linear effects in the last step of the algorithm. |
devs |
maximal value statistics |
crit |
critical values of each permutation test during the fitting process. |
y |
response vector. |
X |
matrix of all the variables (covariates and effect modifiers) for model fitting. |
sb |
variables for which a structural break in the slope was allowed. |
model |
internally fitted model in the last iteration of class |
all_models |
Author(s)
Moritz Berger <Moritz.Berger@imbie.uni-bonn.de>
https://www.imbie.uni-bonn.de/people/dr-moritz-berger/
References
Berger, M., G. Tutz and M. Schmid (2019). Tree-Structured Modelling of Varying Coefficients. Statistics and Computing 29, 217-229, https://doi.org/10.1007/s11222-018-9804-8.
Hastie, T. and R. Tibshirani (1993). Varying-coefficient models. Journal of the Royal Statistical Society B 55, 757-796.
Hothorn, T., K. Hornik and A. Zeileis (2006). Unbiased recursive partitioning: A conditional inference framework. Journal of Computational and Graphical Statistics 15(3), 651-674.
See Also
plot.TSVC, predict.TSVC, summary.TSVC
Examples
# Swiss Labour Market
library(AER)
data("SwissLabor")
# recode factors
sl <- SwissLabor
sl$participation <- as.numeric(sl$participation)-1
sl$foreign <- as.numeric(sl$foreign)-1
## Not run:
fit1 <- TSVC(participation~income+age, data=sl, family=binomial(link="logit"),
             nperm=300, trace=TRUE)
print(fit1)
class(fit1$model) # glm
# In fit2, variable 'foreign' does not serve as effect modifier
# and the effect of 'foreign' is not modified by the other variables.
# That means 'foreign' is assumed to have only a simple linear effect on the response.
fit2 <- TSVC(participation~income+age+foreign, data=sl, family=binomial(link="logit"),
             nperm=300, trace=TRUE, effmod=c("income","age"),
             notmod=list(c("foreign","income"),c("foreign","age")))
print(fit2)
# In fit3, variable 'age' serves only as an effect modifier. That means the effect of 'age'
# is not included in the predictor of the model.
fit3 <- TSVC(participation~income+age+foreign, data=sl, family=binomial(link="logit"),
             nperm=300, trace=TRUE, only_effmod="age")
print(fit3)
# In fit4, the intercept is allowed to be modified by 'age' and 'income'.
# The two covariates, however, are not allowed to modify each other.
fit4 <- TSVC(participation~income+age, data=sl, family=binomial(link="logit"),
             nperm=300, trace=TRUE, split_intercept=TRUE,
             notmod=list(c("income","age"), c("age", "income")))
print(fit4)
# In fit5, variable 'age' has a smooth effect on the response.
# Hence, the (smooth) effect of 'age' will not be modified by the other variables.
fit5 <- TSVC(participation~income+age+foreign, data=sl, family=binomial(link="logit"),
             nperm=300, trace=TRUE, smooth="age")
print(fit5)
class(fit5$model) # gam
# In fit6, the intercept is allowed to be modified by 'age' and 'income', but the two variables are
# not included in the predictor of the model. Here, no permutation tests are performed, but the
# tree is pruned by a minimum node size constraint.
fit6 <- TSVC(participation~income+age, data=sl, family=binomial(link="logit"),
             perm_test=FALSE, nodesize_min=100, bucket_min=100, trace=TRUE,
             split_intercept=TRUE,
             effmod=c("income","age"), only_effmod = c("income", "age"))
print(fit6)
## End(Not run)