evaluates {sdm}    R Documentation
evaluate for accuracy
Description
Evaluates the performance (accuracy) of a model given the observed and the predicted values.
Usage
evaluates(x,p,...)
getEvaluation(x,id,wtest,stat,opt,...)
getReplication(x,id,replication,species,run,index,test)
Arguments
x
a numeric vector of observed values, or a …
p
a numeric vector of predicted values, or a …
id
a single numeric value indicating the modelID
wtest
which test data should be used: "training", "test.dep", or "test.indep"?
stat
the statistic(s) that should be extracted from the evaluation results
opt
a numeric value indicating which threshold optimisation criterion should be considered when a threshold-based statistic is selected in stat
species
optional; a character vector specifying the name(s) of the species for which the replication is returned (default is NULL)
replication
a character string specifying the name of the replication method
run
a single numeric value specifying the replication ID
index
logical (default: FALSE); specifies whether the index of the drawn records or the species data themselves should be returned
test
logical (default: TRUE); specifies whether the test partition (TRUE) or the training partition (FALSE) should be returned
...
additional arguments (see Details)
Details
Evaluates the performance (accuracy) of a model given the observed and the predicted values. As an additional argument, the distribution of the data can be specified (through distribution), which can be either 'binomial', 'gaussian', 'laplase', or 'poisson'. If not specified, it is guessed by the function.
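For illustration, a minimal sketch (not taken from the package's own examples; the observed and predicted vectors below are made-up values) showing how the distribution can be set explicitly rather than guessed:
obs <- c(0, 1, 1, 0, 1, 0, 1, 1)                          # hypothetical presence/absence observations
prd <- c(0.12, 0.81, 0.66, 0.23, 0.91, 0.40, 0.73, 0.58)  # hypothetical predicted probabilities
e1 <- evaluates(obs, prd, distribution = 'binomial')      # force the binomial distribution
e1@statistics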
getEvaluation can be used to get the evaluation results from a fitted model (an sdmModels object, the output of the sdm function). Each model in an sdmModels object has a modelID, which can be specified in the id argument. If id is not specified, or more than one modelID is specified, a data.frame is generated that contains the statistics specified in stat. For a single model (if id has length 1), stat can be 1 (threshold-independent statistics), 2 (threshold-based statistics), or NULL (both groups). If more than one model is specified (id is either NULL or has a length greater than 1), stat can be the name of one or more statistics such as 'AUC', 'COR', 'Deviance', 'obs.prevalence', 'threshold', 'sensitivity', 'specificity', 'TSS', 'MCC', 'Kappa', 'NMI', 'phi', 'ppv', 'npv', 'ccr', 'prevalence'.
If any of the threshold-based statistics is selected, opt can also be specified to choose one of the criteria for optimising the threshold. The possible values are 1 to 15, corresponding to the "sp=se", "max(se+sp)", "min(cost)", "minROCdist", "max(kappa)", "max(ppv+npv)", "ppv=npv", "max(NMI)", "max(ccr)", "prevalence", "max(MCC)", "P10", "P5", "P1", and "P0" criteria, respectively. P10, P5, and P1 refer to the 10th, 5th, and 1st percentiles of the suitability values at presence records in the evaluation dataset, which are used as the threshold. By choosing P0, the minimum suitability value across presence records is selected as the threshold.
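As a brief sketch (not part of the package examples; it assumes a fitted sdmModels object m such as the one used in the Examples below):
getEvaluation(m, id = 1, wtest = 'test.dep', stat = 2, opt = 2)  # threshold-based stats; threshold optimised by max(se+sp)
getEvaluation(m, stat = c('TSS', 'threshold'), opt = 11)         # TSS and its threshold (max(MCC) criterion) for all models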
getReplication returns the portion of records randomly drawn through data partitioning using one of the replication methods (e.g., 'cv', 'boot', 'sub').
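For instance, assuming an sdmModels object m fitted with cross-validation replication (a hypothetical setup; the Examples below use subsampling instead), the test and training records of the first run could be retrieved as:
head(getReplication(m, replication = 'cv', run = 1))                # test records of the first cv fold
head(getReplication(m, replication = 'cv', run = 1, test = FALSE))  # corresponding training records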
Value
an object of class sdmEvaluate from the evaluates function
a list or data.frame from the getEvaluation function
Author(s)
Babak Naimi naimi.b@gmail.com
https://www.biogeoinformatics.org/
References
Naimi, B., Araujo, M.B. (2016) sdm: a reproducible and extensible R platform for species distribution modelling, Ecography, DOI: 10.1111/ecog.01881
Examples
## Not run:
file <- system.file("external/model.sdm", package="sdm")
m <- read.sdm(file) # an sdmModels object (fitted using the sdm function)
getModelInfo(m)
# there are 4 models in the sdmModels object
# so let's take a look at all the results for the model with modelID 1
# evaluation using training data (both threshold-independent and threshold-based groups):
getEvaluation(m,id=1,wtest='training')
getEvaluation(m,id=1,wtest='training',stat=1) # stat=1 (threshold-independent)
getEvaluation(m,id=1,wtest='test.dep',stat=2) # stat=2 (threshold-based)
getEvaluation(m,id=1:3,wtest='test.dep',stat=c('AUC','TSS'),opt=2)
getEvaluation(m,opt=1) # all models
getEvaluation(m,stat=c('TSS','Kappa','AUC'),opt=1) # all models
############
#example for evaluation:
evaluates(x=c(1,1,0,1,0,0,0,1,1,1,0),
p=c(0.69,0.04,0.05,0.95,0.04,0.65,0.09,0.61,0.75,0.84,0.15))
##############
# Example for getReplication:
# (the csv path below is assumed: a presence-absence dataset shipped with the sdm package)
file <- system.file("external/pa_df.csv", package="sdm")
df <- read.csv(file) # load a csv file of species records
head(df)
d <- sdmData(sp~b15+NDVI,train=df) # sdmdata object
d
#----
# fit SDMs using 2 methods and a subsampling replication method with 2 replications:
m <- sdm(sp~b15+NDVI,data=d,methods=c('glmpoly','gbm'), replication='sub', test=30, n=2)
m
# randomly drawn species records for test data in the second replication (run) of subsampling:
getReplication(m, replication='sub',run=2)
getReplication(m, replication='sub',run=2,test=FALSE) # drawn records in the training partition
ind <- getReplication(m, replication='sub',run=2,index=TRUE) # index of the selected test records
head(ind)
.df <- as.data.frame(m@data) # convert sdmdata object in the model to data.frame
head(.df)
.df <- .df[.df$rID %in% ind, ] # the full test dataset drawn (second replication)
pr <- predict(m,.df) # predictions of all the methods for the test dataset
head(pr)
e <- evaluates(.df$sp, pr[,1]) # evaluates for the first method using the selected test data
e@statistics # threshold-independent statistics
e@threshold_based # threshold-based statistics under the different optimisation criteria
## End(Not run)