evaluates {sdm}    R Documentation
evaluate for accuracy
Description
Evaluates the performance (accuracy) of a model given the observed and the predicted values.
Usage
evaluates(x,p,...)
getEvaluation(x,id,wtest,stat,opt,...)
getReplication(x,id,replication,species,run,index,test)
Arguments
x
a numeric vector of observed values, or a …
p
a numeric vector of predicted values, or a …
id
a single numeric value indicating the modelID
wtest
which test data should be used: "training", "test.dep", or "test.indep"?
stat
the statistic(s) that should be extracted from the evaluation results
opt
a numeric value indicating which threshold optimisation criterion should be considered when a threshold-based statistic is selected in stat
species
optional; a character vector specifying the name(s) of the species for which the replication is returned (default is NULL)
replication
a character string specifying the name of the replication method
run
a single numeric value specifying the replication ID
index
logical (default: FALSE); specifies whether the index of the drawn records or the species data themselves should be returned
test
logical (default: TRUE); specifies whether the test partition (TRUE) or the training partition (FALSE) should be returned
...
additional arguments (see Details)
Details
Evaluates the performance (accuracy) of a model given the observed and the predicted values. As an additional argument, the distribution of the data can be specified (through distribution), which can be either 'binomial', 'gaussian', 'laplase', or 'poisson'. If not specified, it is guessed by the function.
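For illustration, a minimal sketch (not taken from the package's own examples; the observed and predicted vectors below are made-up values) showing how the distribution can be set explicitly rather than guessed:
obs <- c(0, 1, 1, 0, 1, 0, 1, 1)                          # hypothetical presence/absence observations
prd <- c(0.12, 0.81, 0.66, 0.23, 0.91, 0.40, 0.73, 0.58)  # hypothetical predicted probabilities
e1 <- evaluates(obs, prd, distribution = 'binomial')      # force the binomial distribution
e1@statistics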
getEvaluation can be used to get the evaluation results from a fitted model (an sdmModels object, the output of the sdm function). Each model in an sdmModels object has a modelID, which can be specified in the id argument. If id is not specified, or more than one modelID is specified, a data.frame is generated that contains the statistics specified in stat. For a single model (if id has length 1), stat can be 1 (threshold-independent statistics), 2 (threshold-based statistics), or NULL (both groups). If more than one model is specified (id is either NULL or has a length greater than 1), stat can be the name of one or more statistics such as 'AUC', 'COR', 'Deviance', 'obs.prevalence', 'threshold', 'sensitivity', 'specificity', 'TSS', 'MCC', 'Kappa', 'NMI', 'phi', 'ppv', 'npv', 'ccr', 'prevalence'.
If any of the threshold-based statistics is selected, opt can also be specified to choose one of the criteria for optimising the threshold. The possible values are 1 to 15, corresponding to the "sp=se", "max(se+sp)", "min(cost)", "minROCdist", "max(kappa)", "max(ppv+npv)", "ppv=npv", "max(NMI)", "max(ccr)", "prevalence", "max(MCC)", "P10", "P5", "P1", and "P0" criteria, respectively. P10, P5, and P1 refer to the 10th, 5th, and 1st percentiles of the suitability values at presence records in the evaluation dataset, which are used as the threshold. By choosing P0, the minimum suitability value across presence records is selected as the threshold.
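As a brief sketch (not part of the package examples; it assumes a fitted sdmModels object m such as the one used in the Examples below):
getEvaluation(m, id = 1, wtest = 'test.dep', stat = 2, opt = 2)  # threshold-based stats; threshold optimised by max(se+sp)
getEvaluation(m, stat = c('TSS', 'threshold'), opt = 11)         # TSS and its threshold (max(MCC) criterion) for all models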
getReplication returns the portion of records randomly drawn through data partitioning using one of the replication methods (e.g., 'cv', 'boot', 'sub').
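For instance, assuming an sdmModels object m fitted with cross-validation replication (a hypothetical setup; the Examples below use subsampling instead), the test and training records of the first run could be retrieved as:
head(getReplication(m, replication = 'cv', run = 1))                # test records of the first cv fold
head(getReplication(m, replication = 'cv', run = 1, test = FALSE))  # corresponding training records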
Value
an object of class sdmEvaluate from the evaluates function
a list or data.frame from the getEvaluation function
Author(s)
Babak Naimi naimi.b@gmail.com
https://www.biogeoinformatics.org/
References
Naimi, B., Araujo, M.B. (2016) sdm: a reproducible and extensible R platform for species distribution modelling, Ecography, DOI: 10.1111/ecog.01881
Examples
## Not run:
file <- system.file("external/model.sdm", package="sdm")
m <- read.sdm(file) # an sdmModels object (fitted using the sdm function)
getModelInfo(m)
# there are 4 models in the sdmModels object
# so let's take a look at all the results for the model with modelID 1
# evaluation using training data (both threshold-independent and threshold-based groups):
getEvaluation(m,id=1,wtest='training')
getEvaluation(m,id=1,wtest='training',stat=1) # stat=1 (threshold-independent)
getEvaluation(m,id=1,wtest='test.dep',stat=2) # stat=2 (threshold-based)
getEvaluation(m,id=1:3,wtest='test.dep',stat=c('AUC','TSS'),opt=2)
getEvaluation(m,opt=1) # all models
getEvaluation(m,stat=c('TSS','Kappa','AUC'),opt=1) # all models
############
#example for evaluation:
evaluates(x=c(1,1,0,1,0,0,0,1,1,1,0),
p=c(0.69,0.04,0.05,0.95,0.04,0.65,0.09,0.61,0.75,0.84,0.15))
##############
# Example for getReplication:
# (the csv path below is assumed: a presence-absence dataset shipped with the sdm package)
file <- system.file("external/pa_df.csv", package="sdm")
df <- read.csv(file) # load a csv file of species records
head(df)
d <- sdmData(sp~b15+NDVI,train=df) # sdmdata object
d
#----
# fit SDMs using 2 methods and a subsampling replication method with 2 replications:
m <- sdm(sp~b15+NDVI,data=d,methods=c('glmpoly','gbm'), replication='sub', test=30, n=2)
m
# randomly drawn species records for test data in the second replication (run) of subsampling:
getReplication(m, replication='sub',run=2)
getReplication(m, replication='sub',run=2,test=FALSE) # drawn records in the training partition
ind <- getReplication(m, replication='sub',run=2,index=TRUE) # index of the selected test records
head(ind)
.df <- as.data.frame(m@data) # convert sdmdata object in the model to data.frame
head(.df)
.df <- .df[.df$rID %in% ind, ] # the full test dataset drawn (second replication)
pr <- predict(m,.df) # predictions of all the methods for the test dataset
head(pr)
e <- evaluates(.df$sp, pr[,1]) # evaluates for the first method using the selected test data
e@statistics # threshold-independent statistics
e@threshold_based # threshold-based statistics under the different optimisation criteria
## End(Not run)