SurvCART {LongCART}R Documentation

Survival CART with time to event response via binary partitioning

Description

Recursive partitioning for linear mixed effects model with survival data per SurvCART algorithm based on baseline partitioning variables (Kundu, 2020).

Usage

SurvCART(data, patid, timevar, censorvar, gvars, tgvars, 
         time.dist="exponential", cens.dist="NA", event.ind=1, 
         alpha=0.05, minsplit=40, minbucket=20, quantile=0.50, print=FALSE)

Arguments

data

name of the dataset. It must contain variable specified for patid (indicating subject id), all the variables specified in the formula and the baseline partitioning variables.

patid

name of the subject id variable.

timevar

name of the variable with follow-up times.

censorvar

name of the variable with censoring status.

gvars

list of partitioning variables of interest. Value of these variables should not change over time. Regarding categorical variables, only numerically coded categorical variables should be specified. For nominal categorical variables or factors, please first create corresponding dummy variable(s) and then pass through gvars.

tgvars

types (categorical or continuous) of partitioning variables specified in gvar. For each of continuous partitioning variables, specify 1 and for each of the categorical partitioning variables, specify 0. Length of tgvars should match to the length of gvars

time.dist

name of time-to-event distribution. It can be one of the following distributions: "exponential", "weibull", "lognormal" or "normal".

cens.dist

name of censoring distribution. It can be one of the following distributions: "exponential", "weibull", "lognormal", "normal" or "NA". If specified "NA", then parameter instability test corresponding to censoring distribution will not be performed.

event.ind

value of the censoring variable indicating event.

alpha

alpha (i.e., nominal type I error) level for parameter instability test

minsplit

the minimum number of observations that must exist in a node in order for a split to be attempted.

minbucket

the minimum number of observations in any terminal node.

quantile

The quantile to be displayed in the visualization of tree through plot.SurvCART() or plot().

print

if TRUE, then summary such as number of subjects at risk, number of events, median event time and median censoring time model will be printed for each node.

Details

Construct survival tree based on heterogeneity in time-to-event and censoring distributions.

Exponential distribution: f(t)=lambda*exp(-lambda*t)

Weibull distribution: f(t)=alpha*lambda*t^(alpha-1)*exp(-lambda*t^alpha)

Lognormal distribution: f(t)=(1/t)*(1/sqrt(2*pi*sigma^2))*exp[-(1/2)*(log(t)-mu)/sigma^2]

Normal distribution: f(t)=(1/sqrt(2*pi*sigma^2))*exp[-(1/2)*(t-mu)/sigma^2]

Value

Treeout

contains summary information of tree fitting for each terminal nodes and non-terminal nodes. Columns of Treeout include "ID", the (unique) node numbers that follow a binary ordering indexed by node depth, n, the number of subjectsreaching the node, D, the number of events reaching the node, median.T, the median survival time at the node, median.C, the median censoring time at the node, var, splitting variable, index, the cut-off value of splitting variable for binary partitioning, p (Instability), the p-value for parameter instability test for the splitting variable, loglik, the log-likelihood at the node, AIC, the AIC at the node, improve, the improvement in deviance given by this split, and Terminal, indicator (True or False) of terminal node.

logLik.tree

log-likelihood of the tree-structured model, based on Cox model including sub-groups as covariates

logLik.root

log-likelihood at the root node (i.e., without tree structure), based on Cox model without any covariate

AIC.tree

AIC of the tree-structured model, based on Cox model including sub-groups as covariates

AIC.root

AIC at the root node (i.e., without tree structure), based on Cox model without any covariate

nodelab

List of subgroups or terminal nodes with their description

varnam

List of splitting variables

ds

the dataset originally supplied

event.ind

value of the censoring variable indicating event.

timevar

name of the variable with follow-up times

censorvar

name of the variable with censoring status

frame

rpart compatible object

splits

rpart compatible object

cptable

rpart compatible object

functions

rpart compatible object

Author(s)

Madan Gopal Kundu madan_g.kundu@yahoo.com

References

Kundu, M. G., and Ghosh, S. (2021). Survival trees based on heterogeneity in time-to-event and censoring distributions using parameter instability test. Statistical Analysis and Data Mining: The ASA Data Science Journal, 14(5), 466-483.

See Also

plot, KMPlot, text, StabCat.surv, StabCont.surv

Examples



#--- Get the data
data(GBSG2)

#numeric coding of character variables
GBSG2$horTh1<- as.numeric(GBSG2$horTh)
GBSG2$tgrade1<- as.numeric(GBSG2$tgrade)
GBSG2$menostat1<- as.numeric(GBSG2$menostat)

#Add subject id
GBSG2$subjid<- 1:nrow(GBSG2)

#--- Run SurvCART() with time-to-event distribution: exponential, censoring distribution: None  
out<- SurvCART(data=GBSG2, patid="subjid", censorvar="cens", timevar="time", 
        gvars=c('horTh1', 'age', 'menostat1', 'tsize', 'tgrade1', 'pnodes', 'progrec', 'estrec'),  
        tgvars=c(0,1,0,1,0,1, 1,1),          
        event.ind=1,  alpha=0.05, minsplit=80, minbucket=40, print=TRUE)

#--- Plot tree
par(xpd = TRUE)
plot(out, compress = TRUE)
text(out, use.n = TRUE)

#Plot KM plot for sub-groups identified by tree
KMPlot(out, xscale=365.25, type=1)
KMPlot(out, xscale=365.25, type=2, overlay=FALSE, mfrow=c(2,2), xlab="Year", ylab="Survival prob.")


#--- Run SurvCART() with time-to-event distribution: weibull censoring distribution: None  
out2<- SurvCART(data=GBSG2, patid="subjid", censorvar="cens", timevar="time",  
        gvars=c('horTh1', 'age', 'menostat1', 'tsize', 'tgrade1', 'pnodes', 'progrec', 'estrec'),  
        tgvars=c(0,1,0,1,0,1, 1,1),          
        time.dist="weibull", event.ind=1, alpha=0.05, minsplit=80, minbucket=40, print=TRUE)


#--- Run SurvCART() with time-to-event distribution: weibull censoring distribution: exponential
out<- SurvCART(data=GBSG2, patid="subjid", censorvar="cens", timevar="time",  
        gvars=c('horTh1', 'age', 'menostat1', 'tsize', 'tgrade1', 'pnodes', 'progrec', 'estrec'),  
        tgvars=c(0,1,0,1,0,1, 1,1),          
        time.dist="weibull", cens.dist="exponential", event.ind=1, 
        alpha=0.05, minsplit=80, minbucket=40, print=TRUE)


[Package LongCART version 3.2 Index]