abcrf {abcrf}    R Documentation

Create an ABC-RF object: a classification random forest built from a reference table for ABC model choice


Description

abcrf constructs a classification random forest from a reference table in order to perform ABC model choice. The reference table (i.e. the dataset analysed with this package) contains one column with the index of the models to be compared and further columns with the values of the simulated summary statistics.


Usage

## S3 method for class 'formula'
abcrf(formula, data, group=list(), lda=TRUE, ntree=500, sampsize=min(1e5, nrow(data)),
      paral=FALSE, ncores=if(paral) max(detectCores()-1,1) else 1, ...)



Arguments

formula
    a formula: left of ~, the variable representing the model index; right of ~, the summary statistics of the reference table.

data
    a data frame containing the reference table.

group
    a list containing at least two groups of model(s) on which the model choice will be performed. The groups need not form a partition: one or more models can be left out of the list. By default no grouping is done.

lda
    should LDA scores be added to the list of summary statistics?

ntree
    number of trees to grow in the forest, by default 500.

sampsize
    size of the sample drawn from the reference table to grow each tree of the classification forest; by default the minimum of the number of rows of the reference table and 100,000.

paral
    a boolean indicating whether the computation of the classification random forest (the forest used to assign a model to the observed dataset) should be parallelized.

ncores
    the number of CPU cores to use. If paral=TRUE, the default is the number of detected CPU cores minus 1; otherwise it is 1. If ncores is not specified and detectCores fails to detect the number of CPU cores, 1 core is used.

...
    additional arguments passed on to ranger, which is used to construct the classification random forest that predicts the selected model.
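
The defaults of sampsize and ncores shown in the usage line can be sketched in base R as follows. This is only an illustration: default_settings is a hypothetical helper, not part of abcrf, and the explicit NA check is an assumption about how the documented one-core fallback could be obtained.

```r
library(parallel)  # ships with R; provides detectCores()

# Hypothetical helper mirroring the documented defaults; not part of abcrf.
default_settings <- function(data, paral = FALSE) {
  detected <- detectCores()
  # detectCores() can return NA when detection fails; fall back to 1 core,
  # as the ncores description states (assumed mechanism).
  if (is.na(detected)) detected <- 1L
  list(
    sampsize = min(1e5, nrow(data)),                   # capped at 100,000 rows
    ncores   = if (paral) max(detected - 1, 1) else 1  # leave one core free
  )
}

# Toy reference table with 12 simulations (invented for illustration).
toy <- data.frame(modindex = factor(rep(1:3, each = 4)), s1 = rnorm(12))
str(default_settings(toy))                # sampsize is 12, ncores is 1
str(default_settings(toy, paral = TRUE))  # ncores depends on the machine
```

Passing ncores explicitly overrides this default, which may be preferable on shared machines.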


Value

An object of class abcrf, which is a list with the following components:

call
    the original call to abcrf,

lda
    a boolean indicating whether LDA scores have been added to the list of summary statistics,

formula
    the formula used to construct the classification random forest,

group
    a list containing the groups of model(s) used. This list is empty if no grouping has been performed,

model.rf
    an object of class ranger (the forest is built with ranger) containing the forest trained on the reference table,

model.lda
    an object of class lda containing the Linear Discriminant Analysis based on the reference table,

prior.err
    prior error rates of model selection on the reference table, estimated with the "out-of-bag" error of the forest.
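
As a rough illustration of what an out-of-bag error rate measures: each simulation of the reference table is classified using only the trees that did not see it during training, and the error rate is the fraction of simulations assigned to the wrong model. The vectors below are invented toy values, and the sketch does not reproduce how abcrf computes the quantity internally.

```r
# Toy model indices and out-of-bag predictions, invented for illustration.
truth    <- factor(c("1", "1", "2", "2", "3", "3"))
oob_pred <- factor(c("1", "2", "2", "2", "3", "1"), levels = levels(truth))

# Confusion matrix: rows are true models, columns are predicted models.
table(truth, oob_pred)

# Fraction of simulations assigned to the wrong model.
prior_err <- mean(oob_pred != truth)
prior_err
```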


References

Pudlo P., Marin J.-M., Estoup A., Cornuet J.-M., Gautier M. and Robert C. P. (2016) Reliable ABC model choice via random forests. Bioinformatics.

Estoup A., Raynal L., Verdu P. and Marin J.-M. (2018) Model choice using Approximate Bayesian Computation and Random Forests: analyses based on model grouping to make inferences about the genetic history of Pygmy human populations. Journal de la Société Française de Statistique.

See Also

plot.abcrf, predict.abcrf, err.abcrf, ranger


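Examples

data(snp)
modindex <- snp$modindex[1:500]
sumsta <- snp$sumsta[1:500,]
data1 <- data.frame(modindex, sumsta)
# classification forest on the three models, without grouping
model.rf1 <- abcrf(modindex~., data = data1, ntree=100)
# same reference table, with models 1 and 2 grouped and compared against model 3
model.rf2 <- abcrf(modindex~., data = data1, group = list(c("1","2"),"3"), ntree=100)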
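
The effect of a group list such as list(c("1","2"),"3") on the model index can be pictured with the base R sketch below. It is a hedged illustration only: relabel_groups is a hypothetical helper, not part of abcrf, and both the group labels "g1", "g2" and the dropping of ungrouped simulations are assumptions about how grouping could be carried out.

```r
# Hypothetical helper, not part of abcrf: relabel a model index by group
# and drop simulations whose model belongs to no group.
relabel_groups <- function(modindex, group) {
  modindex <- as.character(modindex)
  grouped <- rep(NA_character_, length(modindex))
  for (k in seq_along(group)) {
    # every model listed in the k-th element receives the same label
    grouped[modindex %in% group[[k]]] <- paste0("g", k)
  }
  factor(grouped[!is.na(grouped)])
}

modindex <- c("1", "2", "3", "1", "3", "2")
relabel_groups(modindex, list(c("1", "2"), "3"))  # two classes: g1 and g2
relabel_groups(modindex, list("1", "3"))          # simulations of model "2" are dropped
```

With such a relabelling the forest would be trained to discriminate between groups rather than between individual models.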

[Package abcrf version 1.8.1 Index]