gen_glob_outl {SpatialBSS} | R Documentation |
Contamination with Global Outliers
Description
Generates synthetic global outliers and contaminates a given p-variate random field with them.
Usage
gen_glob_outl(x, alpha = 0.05, h = 10, random_sign = FALSE)
Arguments
x |
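a numeric matrix of dimension c(n, p), where the p columns are the components of the random field and the n rows are the observations. |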
alpha |
a numerical value between 0 and 1 giving the proportion of observations to contaminate. |
h |
a numerical constant to determine how large the contaminated outliers are, see details. |
random_sign |
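logical. If TRUE, the sign of each component of the contamination vector is selected at random. Default is FALSE. See Details. |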
Details
gen_glob_outl generates outliers for a given field by randomly selecting round(alpha * n) observations $x_i$ to be the outliers and contaminating them by setting $x_i^{out} = (c^i)' x_i$, where the elements $c^i_j$ of the vector $c^i$ are determined by the parameter random_sign. If random_sign = TRUE, $c^i_j$ is either $h$ or $-h$ with $P(c^i_j = h) = P(c^i_j = -h) = 0.5$. If random_sign = FALSE, $c^i_j = h$ for all $j = 1, \ldots, p$ and $i = 1, \ldots, n$. The parameter alpha determines the contamination rate $\alpha$ and the parameter h determines the size of the outliers.
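The selection and sign rules described here can be sketched in base R. This is an illustration only, not the package's implementation: the helper name sketch_glob_outl and the column name outlier are hypothetical, and the sketch assumes the contamination vector scales each selected row elementwise.

```r
# Hypothetical helper illustrating the sampling scheme described in Details.
# Assumption: the contamination vector scales each selected row elementwise.
sketch_glob_outl <- function(x, alpha = 0.05, h = 10, random_sign = FALSE) {
  x <- as.matrix(x)
  n <- nrow(x)
  p <- ncol(x)

  # Pick round(alpha * n) rows at random to become outliers.
  n_out <- round(alpha * n)
  idx <- sample.int(n, n_out)

  # One contamination vector per selected row: +h or -h with probability 0.5
  # each if random_sign = TRUE, otherwise h in every component.
  if (random_sign) {
    signs <- sample(c(-1, 1), n_out * p, replace = TRUE)
  } else {
    signs <- rep(1, n_out * p)
  }
  c_mat <- matrix(h * signs, nrow = n_out, ncol = p)

  x[idx, ] <- x[idx, , drop = FALSE] * c_mat
  data.frame(x, outlier = seq_len(n) %in% idx)
}

set.seed(1)
toy <- matrix(rnorm(200 * 3), ncol = 3)
toy_cont <- sketch_glob_outl(toy, alpha = 0.1, h = 15, random_sign = TRUE)
sum(toy_cont$outlier)  # number of contaminated rows, round(0.1 * 200)
```

With random_sign = TRUE the contaminated rows keep their magnitudes scaled by h but may flip sign in individual components, while rows that were not selected are left unchanged.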
Value
gen_glob_outl returns a data.frame containing the contaminated fields as the first p columns. Column p + 1 contains a logical indicator of whether the observation is an outlier.
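The returned data.frame can be split by column position. The following is a minimal sketch, assuming that SpatialBSS is installed and that the indicator sits in the column after the p field columns; the white-noise matrix only stands in for a real random field, and the variable names are illustrative.

```r
if (requireNamespace("SpatialBSS", quietly = TRUE)) {
  set.seed(123)
  # Toy 3-variate "field": white noise stands in for a simulated random field.
  x <- matrix(rnorm(500 * 3), ncol = 3)
  x_cont <- SpatialBSS::gen_glob_outl(x, alpha = 0.1, h = 15)

  p <- ncol(x)
  field_part <- as.matrix(x_cont[, seq_len(p)])  # contaminated field values
  is_outlier <- x_cont[[p + 1]]                  # logical outlier indicator

  # Number of flagged rows; per Details this should be round(alpha * n).
  print(sum(is_outlier))
}
```

Indexing by position rather than by name avoids relying on the column names of the returned data.frame.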
See Also
Examples
# simulate coordinates
coords <- runif(1000 * 2) * 20
dim(coords) <- c(1000, 2)
coords_df <- as.data.frame(coords)
names(coords_df) <- c("x", "y")
# simulate random field
if (!requireNamespace('gstat', quietly = TRUE)) {
  message('Please install the package gstat to run the example code.')
} else {
  library(gstat)
  model_1 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0,
                   model = vgm(psill = 0.025, range = 1, model = 'Exp'), nmax = 20)
  model_2 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0,
                   model = vgm(psill = 0.025, range = 1, kappa = 2, model = 'Mat'),
                   nmax = 20)
  model_3 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0,
                   model = vgm(psill = 0.025, range = 1, model = 'Gau'), nmax = 20)
  field_1 <- predict(model_1, newdata = coords_df, nsim = 1)$sim1
  field_2 <- predict(model_2, newdata = coords_df, nsim = 1)$sim1
  field_3 <- predict(model_3, newdata = coords_df, nsim = 1)$sim1
  field <- cbind(field_1, field_2, field_3)

  # Contaminate 10 % of the observations with global outliers of size h = 15.
  field_cont <- gen_glob_outl(field, alpha = 0.1, h = 15)

  # Contaminate 5 % of the observations with global outliers of size h = 10 and random sign.
  field_cont2 <- gen_glob_outl(field, alpha = 0.05, h = 10, random_sign = TRUE)
}