abcrf {abcrf} | R Documentation |
Create an ABC-RF object: a classification random forest built from a reference table for ABC model choice
Description
abcrf constructs a classification random forest from a reference table for ABC model choice. The reference table (i.e. the dataset processed by this package) contains one column holding the index of the models to be compared and further columns holding the values of the simulated summary statistics.
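As an illustration of the layout described above, the following sketch builds a toy reference table in base R. The column names S1 and S2, the three model labels, and the simulated values are all hypothetical; a real reference table would come from simulating each model under its prior.

```r
# Toy reference table: hypothetical values, not produced by a real simulator.
set.seed(1)
n_sim <- 300

# Model index column: which model generated each simulated dataset.
modindex <- factor(sample(c("1", "2", "3"), n_sim, replace = TRUE))

# Two made-up summary statistics whose distribution depends on the model.
shift <- as.integer(as.character(modindex))
sumsta <- data.frame(
  S1 = rnorm(n_sim, mean = shift),
  S2 = rnorm(n_sim, mean = -shift, sd = 2)
)

# One row per simulation: model index first, summary statistics after.
ref_table <- data.frame(modindex, sumsta)
str(ref_table)
```

A table of this shape matches the formula interface, where the model index sits to the left of ~ and the summary statistics to the right.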
Usage
## S3 method for class 'formula'
abcrf(formula, data, group=list(), lda=TRUE, ntree=500, sampsize=min(1e5, nrow(data)),
paral=FALSE, ncores= if(paral) max(detectCores()-1,1) else 1, ...)
Arguments
formula |
a formula: left of ~, variable representing the model index; right of ~, summary statistics of the reference table. |
data |
a data frame containing the reference table. |
group |
a list containing groups (at least 2) of model(s) on which the model choice will be performed. The groups need not form a partition: one or more models can be left out of the list. By default, no grouping is done. |
lda |
should LDA scores be added to the list of summary statistics? |
ntree |
number of trees to grow in the forest, by default 500 trees. |
sampsize |
size of the sample from the reference table to grow a tree of the classification forest, by default the minimum between the number of elements of the reference table and 100,000. |
paral |
a boolean that indicates if the calculations of the classification random forest (forest used to assign a model to the observed dataset) should be parallelized. |
ncores |
the number of CPU cores to use. If paral=TRUE, the default is the number of detected CPU cores minus 1. If ncores is not specified and |
... |
additional arguments to be passed on to |
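The default expressions for sampsize and ncores in the Usage line can be evaluated on their own to see what they yield. This is a minimal sketch: ref_table is a hypothetical stand-in for the data argument, and default_ncores is an illustrative helper that copies the Usage expression, not a function of the package.

```r
library(parallel)  # provides detectCores()

# Hypothetical reference table with 250 simulations; only its row count matters here.
ref_table <- data.frame(
  modindex = factor(rep(c("1", "2"), length.out = 250)),
  S1 = rnorm(250)
)

# Default from the Usage line: each tree uses at most 100,000 rows.
sampsize <- min(1e5, nrow(ref_table))

# Default core count from the Usage line, as a function of paral.
default_ncores <- function(paral) if (paral) max(detectCores() - 1, 1) else 1

sampsize               # 250, since the table has fewer than 100,000 rows
default_ncores(FALSE)  # 1
default_ncores(TRUE)   # machine dependent
```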
Value
An object of class abcrf, which is a list with the following components:
call |
the original call to |
lda |
a boolean indicating if LDA scores have been added to the list of summary statistics, |
formula |
the formula used to construct the classification random forest, |
group |
a list containing the groups of model(s) used. This list is empty if no grouping has been performed, |
model.rf |
an object of class |
model.lda |
an object of class |
prior.err |
prior error rates of model selection on the reference table, estimated with the "out-of-bag" error of the forest. |
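The components listed above can be read from the returned list with the usual $ operator. The sketch below assumes the abcrf package is installed and reuses the snp data from the Examples section; the object names ref and fit are illustrative.

```r
# Runs only when the abcrf package is installed; otherwise the block is skipped.
if (requireNamespace("abcrf", quietly = TRUE)) {
  library(abcrf)
  data(snp)

  # Same subset of the reference table as in the Examples section.
  modindex <- snp$modindex[1:500]
  sumsta <- snp$sumsta[1:500, ]
  ref <- data.frame(modindex, sumsta)

  fit <- abcrf(modindex ~ ., data = ref, ntree = 100)

  class(fit)     # the object class documented under Value
  fit$lda        # whether LDA scores were added to the summary statistics
  fit$group      # empty list here, because no grouping was requested
  fit$prior.err  # out-of-bag estimate of the prior error rate
}
```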
References
Pudlo P., Marin J.-M., Estoup A., Cornuet J.-M., Gautier M. and Robert C. P. (2016) Reliable ABC model choice via random forests. Bioinformatics. doi:10.1093/bioinformatics/btv684
Estoup A., Raynal L., Verdu P. and Marin J.-M. (2018) Model choice using Approximate Bayesian Computation and Random Forests: analyses based on model grouping to make inferences about the genetic history of Pygmy human populations. Journal de la Société Française de Statistique. http://journal-sfds.fr/article/view/709
See Also
plot.abcrf, predict.abcrf, err.abcrf, ranger
Examples
data(snp)
modindex <- snp$modindex[1:500]
sumsta <- snp$sumsta[1:500,]
data1 <- data.frame(modindex, sumsta)
model.rf1 <- abcrf(modindex~., data = data1, ntree=100)
model.rf1
model.rf2 <- abcrf(modindex~., data = data1, group = list(c("1","2"),"3"), ntree=100)
model.rf2
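To make the effect of group = list(c("1","2"),"3") more concrete, the sketch below shows in base R what pooling models into groups means for the class labels. It is an illustration only, not the package's implementation: the helper relabel_by_group and the labels "g1" and "g2" are hypothetical, and model "4" is added to show a model left out of every group.

```r
# Hypothetical helper, not part of abcrf: map each model index to a group label.
# Models that appear in no group get NA, i.e. they are left out of the choice.
relabel_by_group <- function(modindex, group) {
  labels <- rep(NA_character_, length(modindex))
  for (g in seq_along(group)) {
    labels[as.character(modindex) %in% group[[g]]] <- paste0("g", g)
  }
  factor(labels)
}

modindex <- factor(c("1", "2", "3", "3", "1", "4"))
grouped <- relabel_by_group(modindex, list(c("1", "2"), "3"))

# Models 1 and 2 share one label, model 3 has its own, model 4 is excluded.
table(grouped, useNA = "ifany")
```

With such a grouping, the model choice is made between groups of models rather than between individual models.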