| trafotree {trtf} | R Documentation | 
Transformation Trees
Description
Partitioned transformation models
Usage
trafotree(object, parm = 1:length(coef(object)), reparm = NULL,
          intercept = c("none", "shift", "scale", "shift-scale"),
          min_update = length(coef(object)) * 2,
          mltargs = list(), ...)
## S3 method for class 'trafotree'
predict(object, newdata, K = 20, q = NULL,
    type = c("node", "coef", "trafo", "distribution", "survivor", "density",
             "logdensity", "hazard", "loghazard", "cumhazard", "quantile"),
    perm = NULL, ...)
## S3 method for class 'trafotree'
logLik(object, newdata, weights = NULL, perm = NULL, ...)
Arguments
| object | an object of class  | 
| parm | parameters of  | 
| reparm | optional matrix of contrasts for reparameterisation of the scores.
 | 
| intercept | add optional intercept parameters (constraint to zero) to
the model. It may make sense to restrict attention to
scores corresponding to those intercept parameters, the additional argument
 | 
| min_update | number of observations necessary to refit the model in a node. If less observations are available, the parameters from the parent node will be reused. | 
| mltargs | arguments to  | 
| newdata | an optional data frame of observations. | 
| K | number of grid points to generate (in the absence of  | 
| q | quantiles at which to evaluate the model. | 
| type | type of prediction or plot to generate. | 
| weights | an optional vector of weights. | 
| perm | a vector of integers specifying the variables to be permuted
prior before splitting (i.e., for computing permutation
variable importances). The default  | 
| ... | arguments to  | 
Details
Conditional inference trees are used for partitioning likelihood-based transformation
models as described in Hothorn and Zeileis (2017). The method can be seen
in action in Hothorn (2018) and the corresponding code is available as
demo("BMI"). demo("applications") performs transformation
tree analyses for some standard benchmarking problems.
Value
An object of class trafotree with corresponding plot, logLik and
predict methods.
References
Torsten Hothorn and Achim Zeileis (2021). Predictive Distribution Modelling Using Transformation Forests. Journal of Computational and Graphical Statistics, doi:10.1080/10618600.2021.1872581.
Torsten Hothorn (2018). Top-Down Transformation Choice. Statistical Modelling, 3-4, 274-298. doi:10.1177/1471082X17748081
Natalia Korepanova, Heidi Seibold, Verena Steffen and Torsten Hothorn (2019). Survival Forests under Test: Impact of the Proportional Hazards Assumption on Prognostic and Predictive Forests for ALS Survival. doi:10.1177/0962280219862586.
Examples
### Example: Stratified Medicine Using Partitioned Cox-Models
### A combination of <DOI:10.1515/ijb-2015-0032> and <arXiv:1701.02110>
### based on infrastructure in the mlt R add-on package described in
### https://cran.r-project.org/web/packages/mlt.docreg/vignettes/mlt.pdf
library("trtf")
library("survival")
### German Breast Cancer Study Group 2 data set
data("GBSG2", package = "TH.data")
GBSG2$y <- with(GBSG2, Surv(time, cens))
### set-up Cox model with overall treatment effect in hormonal therapy
cmod <- Coxph(y ~ horTh, data = GBSG2, support = c(100, 2000), order = 5)
### overall log-hazard ratio
coef(cmod)
### roughly the same as 
coef(coxph(y ~ horTh, data = GBSG2))
### partition the model, ie both the baseline hazard function AND the
### treatment effect
(part_cmod <- trafotree(cmod, formula = y ~ horTh | age + menostat + tsize + 
    tgrade + pnodes + progrec + estrec, data = GBSG2))
### compare the log-likelihoods
logLik(cmod)
logLik(part_cmod)
### stronger effects in nodes 2 and 4 and no effect in node 5
coef(part_cmod)[, "horThyes"]
### plot the conditional survivor functions; blue is untreated
### and green is hormonal therapy
nd <- data.frame(horTh = sort(unique(GBSG2$horTh)))
plot(part_cmod, newdata = nd, 
     tp_args = list(type = "survivor", col = c("cadetblue3", "chartreuse4")))
### same model, but with explicit intercept term and max-type statistic
### for _variable_ selection
(part_cmod_max <- trafotree(cmod, formula = y ~ horTh | age + menostat + tsize + 
    tgrade + pnodes + progrec + estrec, data = GBSG2, intercept = "shift",
    control = ctree_control(teststat = "max")))
logLik(part_cmod_max)
coef(part_cmod_max)[, "horThyes"]
### the trees (and log-likelihoods are the same) but the
### p-values are sometimes much smaller in the latter tree
cbind(format.pval(info_node(node_party(part_cmod))$criterion["p.value",]),
      format.pval(info_node(node_party(part_cmod_max))$criterion["p.value",]))