RFcluster {gamclass} | R Documentation |
Random forests estimate of predictive accuracy for clustered data
Description
This function adapts random forests to work (albeit clumsily and inefficiently) with clustered categorical outcome data. For example, there may be multiple observations on individuals (clusters). Predictions are made for the OOB (out of bag) clusters.
Usage
RFcluster(formula, id, data, nfold = 15,
ntree=500, progress=TRUE, printit = TRUE, seed = 29)
Arguments
formula |
Model formula |
id |
numeric, identifies clusters |
data |
data frame that supplies the data |
nfold |
numeric, number of folds |
ntree |
numeric, number of trees (number of bootstrap samples) |
progress |
Print information on progress of calculations |
printit |
Print summary information on accuracy |
seed |
Set seed, if required, so that results are exactly reproducible |
Details
Bootstrap samples are taken of observations in the in-bag clusters. Predictions are made for all observations in the OOB clusters.
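The cluster-level split described above can be sketched in base R. This is an illustration under stated assumptions, not the code used inside RFcluster(): the data frame dat, its columns, the fold count, and the object names are all hypothetical, and no forest is fitted. It only shows how whole clusters might be allocated to folds so that bootstrap rows come from in-bag clusters and OOB rows come from held-out clusters.

```r
# Illustrative sketch only (base R); not the code used inside RFcluster().
# The data frame `dat`, its columns, and `nfold` are hypothetical.
set.seed(29)
dat <- data.frame(id = rep(1:6, each = 4),          # 6 clusters, 4 rows each
                  y  = factor(rep(c("a", "b"), 12)),
                  x  = rnorm(24))
nfold <- 3

# Allocate whole clusters (not individual rows) to folds
clusters    <- unique(dat$id)
clusterFold <- sample(rep(seq_len(nfold), length.out = length(clusters)))
rowFold     <- clusterFold[match(dat$id, clusters)]

splits <- lapply(seq_len(nfold), function(k) {
  oobRows   <- which(rowFold == k)   # all rows of the held-out clusters
  inbagRows <- which(rowFold != k)   # rows of the remaining clusters
  # Bootstrap sample of observations drawn from the in-bag clusters
  bootRows  <- inbagRows[sample.int(length(inbagRows), replace = TRUE)]
  list(boot = bootRows, oob = oobRows)
})

# Count clusters that appear in both the bootstrap sample and the OOB set
overlap <- sapply(splits, function(s)
  length(intersect(dat$id[s$boot], dat$id[s$oob])))
print(overlap)
```

Because fold membership is decided per cluster, every entry of overlap should be zero: no cluster contributes rows to both the bootstrap sample and the OOB set of the same fold.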
Value
class |
Predicted values from cross-validation |
OOBaccuracy |
Cross-validation estimate of accuracy |
confusion |
Confusion matrix |
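As a rough guide to how an accuracy estimate relates to a confusion matrix, the following base R sketch cross-tabulates observed and predicted classes and takes the proportion on the diagonal. The vectors observed and predicted are made-up values, and the exact layout of the matrix returned by RFcluster() (counts or proportions, row and column order) is not stated here, so treat this as an assumption-laden illustration.

```r
# Illustrative sketch only (base R); the vectors below are made up.
lev       <- c("a", "b", "c")
observed  <- factor(c("a", "a", "b", "b", "c", "c"), levels = lev)
predicted <- factor(c("a", "b", "b", "b", "c", "a"), levels = lev)

# Cross-tabulate observed against predicted classes
confusion <- table(observed = observed, predicted = predicted)

# Overall accuracy: proportion of observations on the diagonal
accuracy <- sum(diag(confusion)) / sum(confusion)

print(confusion)
print(accuracy)
```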
Author(s)
John Maindonald
References
https://maths-people.anu.edu.au/~johnm/nzsr/taws.html
Examples
## Not run:
library(mlbench)        # supplies the Vowel data
library(randomForest)
data(Vowel)
## V1 is used as the cluster identifier
RFcluster(formula = Class ~ ., id = V1, data = Vowel, nfold = 15,
          ntree = 500, progress = TRUE, printit = TRUE, seed = 29)
## End(Not run)