psica {psica} | R Documentation |
Create a tree that discovers groups having similar treatment (intervention) effects.
Description
The PSICA method first builds regression trees for each treatment group and then obtains the distribution of the effect size at given levels of the independent variables, either by bootstrap or by the bias-corrected infinitesimal jackknife. These distributions are used to compute, for a given set of input values, the probability that each treatment is better (has a greater effect size) than the other treatments. The probabilities are then summarised in a decision tree built with a special loss function. Each terminal node of the resulting tree shows the probabilities that one treatment is better than the others, together with a label listing the possible best treatments.
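As a rough illustration of the second step, the sketch below estimates the probability that each treatment is best from a set of draws of the effect size at a single input value. It uses base R only and is not the psica implementation: the treatment names `A`, `B`, `C`, the simulated draws and the variable names are assumptions made for the example.

```r
# Hypothetical illustration in base R; not the code used inside psica.
set.seed(1)

# Simulated draws of the predicted effect at one input value:
# one column per treatment, one row per bootstrap replicate (assumed values).
n_boot <- 2000
draws <- cbind(
  A = rnorm(n_boot, mean = 3.0, sd = 0.4),
  B = rnorm(n_boot, mean = 2.6, sd = 0.4),
  C = rnorm(n_boot, mean = 1.0, sd = 0.4)
)

# For each replicate, find the treatment with the largest effect ...
best <- colnames(draws)[max.col(draws, ties.method = "first")]

# ... and estimate P(treatment is best) as the share of replicates it wins.
p_best <- sapply(colnames(draws), function(k) mean(best == k))
print(round(p_best, 3))
```

The resulting vector sums to one across treatments; in PSICA, probabilities of this kind are what the final decision tree summarises.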
Usage
psica(formula, data, intervention, method = "normal",
forestControl = list(minsplit = 10, mincriterion = 0.95, nBoots = 500,
nTrees = 200, mtry = 5), treeControl = rpart::rpart.control(minsplit =
20, minbucket = 10, cp = 0.003), confidence = 0.95, prune = TRUE,
...)
Arguments
formula |
Formula that specifies the dependent variable (effect) and the independent variables (separated by '+'). The treatment variable should not appear in the formula |
data |
Data frame containing dependent and independent variables and the categorical treatment variable |
intervention |
The name of the treatment variable |
method |
Choose "boot" to compute probabilities by bootstrapping random forests, or "normal" to compute probabilities by approximating the random forest variance with the bias-corrected infinitesimal jackknife. |
forestControl |
Parameters for forest growing: a list with elements minsplit, mincriterion, nBoots, nTrees and mtry (defaults shown under Usage)
|
treeControl |
Parameters for decision tree growing, see rpart.control() |
confidence |
Parameter that defines the cut-off probability in the loss function and determines which treatments are included in the labels of the PSICA tree. More specifically, labels in the terminal nodes show all treatments except the useless ones, i.e. the treatments whose combined probability of being the best is smaller than 1-confidence. |
prune |
Should the final tree be pruned, or is a (possibly) overfitted tree desired? |
... |
Further arguments passed to rpart. |
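The labelling rule described for the confidence argument can be sketched as follows. This is one possible reading of that rule, not code from the package: the helper `label_treatments` and the example probabilities are hypothetical. Treatments are dropped starting from the least probable, for as long as their combined probability of being the best stays below 1-confidence.

```r
# Hypothetical helper illustrating the labelling rule; not part of psica.
label_treatments <- function(p_best, confidence = 0.95) {
  p_sorted <- sort(p_best)                      # least likely treatments first
  useless <- cumsum(p_sorted) < 1 - confidence  # jointly below 1 - confidence
  rev(names(p_sorted)[!useless])                # keep the rest, most likely first
}

# Assumed probabilities of being the best treatment in one terminal node
p <- c(A = 0.70, B = 0.27, C = 0.03)

label_treatments(p, confidence = 0.95)  # C is dropped; A and B remain
label_treatments(p, confidence = 0.50)  # B and C are dropped; only A remains
```

A higher confidence value keeps more treatments in the label, because fewer of them can be discarded before the combined probability reaches 1-confidence.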
Value
An object of class psicaTree
References
Sysoev O, Bartoszek K, Ekström E, Ekholm Selling K (2019). “PSICA: Decision trees for probabilistic subgroup identification with categorical treatments.” Statistics in Medicine, 38(22), 4436-4452. doi: 10.1002/sim.8308, https://onlinelibrary.wiley.com/doi/pdf/10.1002/sim.8308, https://onlinelibrary.wiley.com/doi/abs/10.1002/sim.8308.
Examples
library(psica)

# Simulate two treatment groups with different response curves
n <- 100
X1 <- sort(runif(n))
f1 <- function(x) {
  2 * tanh(4 * x - 2) + 3
}
X2 <- sort(runif(n))
f2 <- function(x) {
  2 * tanh(2 * x - 1) + 2.3
}

# Plot the true curves and the noisy observations
plot(X1, f1(X1), ylim = c(0, 5), type = "l")
lines(X2, f2(X2))
Y1 <- f1(X1) + rnorm(n, 0, 0.8)
Y2 <- f2(X2) + rnorm(n, 0, 0.8)
points(X1, Y1, col = "blue")
points(X2, Y2, col = "red")

# Stack both groups and fit the PSICA tree
data <- data.frame(X = c(X1, X2), Y = c(Y1, Y2),
                   interv = c(rep("treat", n), rep("control", n)))
pt <- psica(Y ~ X, data = data, method = "normal", intervention = "interv",
            forestControl = list(nBoots = 200, mtry = 1))
print(pt)
plot(pt)