outliers.detect {gecko} | R Documentation |
Detect outliers in a set of geographical coordinates
Description
This function generates pseudo-absences from an input data.frame containing latitude and longitude coordinates by using environmental data. It then uses both presences and pseudo-absences to train an SVM model that flags possible outliers for a given species.
Usage
outliers.detect(
longlat,
training = NULL,
hi_res = TRUE,
crop = FALSE,
threshold = 0.05,
method = "all"
)
Arguments
longlat |
data.frame. With two columns containing longitude and latitude, describing the locations of a species, which may contain outliers. |
training |
data.frame. With the same formatting as |
hi_res |
logical. Specifies whether 1 km resolution environmental data should be used.
If |
crop |
logical. Indicates whether environmental data should be cropped to
an extent similar to what is given in |
threshold |
numeric. Value indicating the threshold for classifying
outliers in methods |
method |
A string specifying the outlier detection method. |
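A quick sanity check on the input can catch malformed coordinates before a long run. The following is a minimal sketch in base R; `looks_like_longlat` is a hypothetical helper, not part of gecko, and it assumes the first column holds longitude and the second latitude, both in decimal degrees.

```r
# Hypothetical helper, not part of gecko.
# Assumption: column 1 is longitude, column 2 is latitude (decimal degrees).
looks_like_longlat <- function(longlat) {
  # Must be a data.frame with exactly two columns.
  if (!is.data.frame(longlat) || ncol(longlat) != 2) {
    return(FALSE)
  }
  lon <- longlat[[1]]
  lat <- longlat[[2]]
  # Both columns must be numeric and free of missing values.
  if (!is.numeric(lon) || !is.numeric(lat) || anyNA(lon) || anyNA(lat)) {
    return(FALSE)
  }
  # Longitude must lie in [-180, 180] and latitude in [-90, 90].
  all(abs(lon) <= 180) && all(abs(lat) <= 90)
}

good <- data.frame(X = c(-17.08, -17.06), Y = c(32.74, 32.75))
bad  <- data.frame(X = c(32.74, 32.75), Y = c(120, 130))

looks_like_longlat(good)
looks_like_longlat(bad)
```

A range check of this kind cannot detect swapped columns when both values happen to fall within valid latitude bounds, so it complements rather than replaces a visual check of the records.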
Details
The environmental data used is WorldClim, which requires a long download; see gecko::gecko.setDir().
This function is heavily based on the methods described in Liu et al. (2017).
There the authors describe SVM_pdSDM, a pseudo-SDM method similar to a
two-class presence-only SVM that is capable of using pseudo-absence points,
implemented with the ksvm function in the R package kernlab.
It is suggested that, for each set of "n" occurrence records, "2 * n" pseudo-absence points are generated.
When using it, keep in mind works that highlight its limitations, such as
Meynard et al. (2019). See the References section.
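The 2:1 pseudo-absence ratio and the two-class SVM described above can be illustrated with a toy sketch. This is not the gecko implementation: the pseudo-absences here are drawn uniformly from a padded bounding box (an assumption made for brevity, whereas the function itself relies on environmental data), the `pad` value is arbitrary, and the SVM step only runs if kernlab is installed.

```r
# Illustrative sketch only; not the code used by outliers.detect().
set.seed(42)

# n presence records (toy coordinate values).
presences <- data.frame(
  x = runif(20, -17.10, -17.05),
  y = runif(20, 32.73, 32.76)
)
n <- nrow(presences)

# Assumption: draw 2 * n pseudo-absences uniformly from a box padded
# around the presences. gecko derives them from environmental data instead.
pad <- 0.5
pseudo_absences <- data.frame(
  x = runif(2 * n, min(presences$x) - pad, max(presences$x) + pad),
  y = runif(2 * n, min(presences$y) - pad, max(presences$y) + pad)
)

# Label presences 1 and pseudo-absences 0, then stack them for training.
train <- rbind(
  cbind(presences, class = 1),
  cbind(pseudo_absences, class = 0)
)
train$class <- factor(train$class, levels = c(0, 1))

# Fit a two-class SVM with kernlab::ksvm when kernlab is available.
if (requireNamespace("kernlab", quietly = TRUE)) {
  model <- kernlab::ksvm(
    x = as.matrix(train[, c("x", "y")]),
    y = train$class,
    type = "C-svc",
    kernel = "rbfdot"
  )
  predicted <- kernlab::predict(model, as.matrix(presences))
  # In this sketch, presences predicted as class "0" are candidate outliers.
  flagged <- predicted == "0"
  print(table(flagged))
}
```

With real data the predictors would be environmental variables extracted at each point rather than raw coordinates; coordinates are used here only to keep the example self-contained.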
Value
A list if method = "all", containing whether each point was classified as TRUE or FALSE, along with the confusion matrix for the training data. If method = "geo" or method = "env", a data.frame is returned.
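Because the return type depends on method, calling code may want to branch on it. The sketch below uses a hypothetical helper, `result_kind`, and mock objects whose element and column names are invented for illustration; the actual structure of the returned list and data.frame is not specified here.

```r
# Hypothetical helper, not part of gecko.
result_kind <- function(res) {
  # Test for data.frame first: in R every data.frame is also a list,
  # so is.list() alone cannot tell the two return types apart.
  if (is.data.frame(res)) {
    "data.frame"
  } else if (is.list(res)) {
    "list"
  } else {
    stop("unexpected result type: ", class(res)[1])
  }
}

# Mock objects standing in for real return values (structure assumed).
mock_all <- list(
  flags = data.frame(outlier = c(TRUE, FALSE)),
  confusion = matrix(0, nrow = 2, ncol = 2)
)
mock_geo <- data.frame(
  x = c(-17.08, -17.06),
  y = c(32.74, 32.75),
  outlier = c(FALSE, TRUE)
)

result_kind(mock_all)
result_kind(mock_geo)
```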
References
Liu, C., White, M. and Newell, G. (2017) ‘Detecting outliers in species distribution data’, Journal of Biogeography, 45(1), pp. 164–176. doi:10.1111/jbi.13122.
Meynard, C.N., Kaplan, D.M. and Leroy, B. (2019) ‘Detecting outliers in species distribution data: Some caveats and clarifications on a virtual species study’, Journal of Biogeography, 46(9), pp. 2141–2144. doi:10.1111/jbi.13626.
Examples
## Not run:
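# Occurrence records to screen for outliers
new_occurrences <- gecko.data("records")
# Training records: 10 random points in a small X/Y box
old_occurrences <- data.frame(X = runif(10, -17.1, -17.05), Y = runif(10, 32.73, 32.76))
outliers.detect(new_occurrences, old_occurrences)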
## End(Not run)