best.tree.CV {GPLTR} | R Documentation |
Pruning the Maximal Tree
Description
This function prunes back the maximal tree using a K-fold cross-validation
procedure.
Usage
best.tree.CV(xtree, xdata, Y.name, X.names, G.names, family = "binomial",
args.rpart = list(cp = 0, minbucket = 20, maxdepth = 10), epsi = 0.001,
iterMax = 5, iterMin = 3, ncv = 10, verbose = TRUE)
Arguments
xtree |
a tree to prune |
xdata |
the dataset used to build the tree |
Y.name |
the name of the dependent variable |
X.names |
the names of independent variables to consider in the linear part of the |
G.names |
the names of independent variables to consider in the tree part of the hybrid |
family |
the |
args.rpart |
a list of options that control details of the rpart algorithm. |
epsi |
a threshold value used to check the convergence of the algorithm |
iterMax |
the maximum number of iterations to consider |
iterMin |
the minimum number of iterations to consider |
ncv |
The number of folds to consider for the |
verbose |
Logical; TRUE for printing progress during the computation (helpful for debugging) |
Value
A list of five elements:
best_index |
The size of the tree selected by the cross-validation procedure |
tree |
The selected tree by |
fit_glm |
The fitted gpltr models selected with |
CV_ERRORS |
A list of two elements containing the cross-validation error of the selected tree by the |
Timediff |
The execution time of the |
Author(s)
Cyprien Mbogning
References
Mbogning, C., Perdry, H., Toussile, W., Broet, P.: A novel tree-based procedure for deciphering the genomic spectrum of clinical disease entities. Journal of Clinical Bioinformatics 4:6, (2014)
See Also
Examples
## Not run:
## load the data set
data(data_pltr)
## set the parameters
args.rpart <- list(minbucket = 40, maxdepth = 10, cp = 0)
family <- "binomial"
Y.name <- "Y"
X.names <- "G1"
G.names <- paste("G", 2:15, sep="")
## build a maximal tree
fit_pltr <- pltr.glm(data_pltr, Y.name, X.names, G.names, args.rpart = args.rpart,
                     family = family, iterMax = 5, iterMin = 3)
## prune back the maximal tree by a cross-validation procedure
tree_selected <- best.tree.CV(fit_pltr$tree, data_pltr, Y.name, X.names, G.names,
                              family = family, args.rpart = args.rpart, epsi = 0.001,
                              iterMax = 5, iterMin = 3, ncv = 10)
plot(tree_selected$tree, main = 'CV TREE')
text(tree_selected$tree, minlength = 0L, xpd = TRUE, cex = .6)
## End(Not run)
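The K-fold cross-validation idea behind the `ncv` argument can be illustrated with a minimal base-R sketch. It is not the GPLTR implementation: the toy data frame `toy`, the fold vector `fold`, and the use of held-out binomial deviance as the error measure are all assumptions made for illustration, and a plain `glm` stands in for the hybrid model.

```r
## Illustrative sketch only; NOT the code used inside best.tree.CV.
## `toy`, `fold`, `cv_dev` and `cv_error` are hypothetical names.
set.seed(1)
n   <- 200
ncv <- 10                                  # number of folds, playing the role of `ncv`

## hypothetical toy data with a binary response
toy   <- data.frame(x = rnorm(n))
toy$Y <- rbinom(n, 1, plogis(toy$x))

## randomly assign each row to one of ncv folds of (nearly) equal size
fold <- sample(rep(seq_len(ncv), length.out = n))

## for each fold: fit on the remaining folds, score deviance on the held-out fold
cv_dev <- vapply(seq_len(ncv), function(k) {
  train <- toy[fold != k, ]
  test  <- toy[fold == k, ]
  fit   <- glm(Y ~ x, family = binomial, data = train)
  p     <- predict(fit, newdata = test, type = "response")
  -2 * sum(test$Y * log(p) + (1 - test$Y) * log(1 - p))
}, numeric(1))

## held-out deviance averaged over all observations
cv_error <- sum(cv_dev) / n
```

In a pruning setting, a quantity like `cv_error` would be computed once per candidate tree size, and the size with the smallest value retained; how `best.tree.CV` measures and aggregates the error is not specified on this page.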