lookout {weird} | R Documentation |
Lookout probabilities
Description
Compute leave-one-out log score probabilities using a Generalized Pareto distribution. Each probability measures how consistent an observation is with the bulk of the data, so a low probability indicates a likely anomaly.
Usage
lookout(
object = NULL,
density_scores = NULL,
loo_scores = density_scores,
threshold_probability = 0.95
)
Arguments
object | A model object or a numerical data set.
density_scores | Numerical vector of log scores.
loo_scores | Optional numerical vector of leave-one-out log scores.
threshold_probability | Probability threshold when computing the peaks-over-threshold (POT) model for the log scores.
Details
This function can work with several object types. If object is not NULL, then the object is passed to the density_scores() function to compute density scores (and possibly LOO density scores). Otherwise, the density scores are taken from the density_scores argument, and the LOO density scores are taken from the loo_scores argument. Then the Generalized Pareto distribution is fitted to the scores, to obtain the probability of each observation.
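The steps above can be sketched in base R. This is an illustration only, not the package's implementation: the bandwidth choice (bw.nrd0), the sign convention (scores are negative log densities, so larger means more surprising), the optim-based maximum likelihood fit, and the helper names gpd_nll and gpd_survival are all assumptions made for this example.

```r
# Illustrative sketch in base R; NOT the implementation used by the weird package.
set.seed(1)
y <- c(5, rnorm(199))   # the first observation is a planted anomaly
n <- length(y)

# Gaussian kernel density estimate at each observation.
# The bandwidth rule is an assumption made for this sketch.
h <- bw.nrd0(y)
K <- dnorm(outer(y, y, "-") / h) / h
scores <- -log(rowMeans(K))                         # density scores
loo_scores <- -log((rowSums(K) - diag(K)) / (n - 1)) # leave-one-out scores

# Peaks over threshold: keep the scores above the chosen quantile.
threshold_probability <- 0.95
threshold <- quantile(scores, probs = threshold_probability, names = FALSE)
excess <- scores[scores > threshold] - threshold

# Negative log-likelihood of the Generalized Pareto distribution (hypothetical helper).
# par[1] is log(scale), par[2] is the shape.
gpd_nll <- function(par, excess) {
  scale <- exp(par[1])
  shape <- par[2]
  w <- 1 + shape * excess / scale
  if (any(w <= 0)) {
    return(1e10) # outside the support: large penalty
  }
  if (abs(shape) < 1e-8) {
    length(excess) * log(scale) + sum(excess) / scale
  } else {
    length(excess) * log(scale) + (1 / shape + 1) * sum(log(w))
  }
}
fit <- optim(c(log(mean(excess)), 0.1), gpd_nll, excess = excess)
scale <- exp(fit$par[1])
shape <- fit$par[2]

# Upper tail probability P(X > x) under the fitted GPD (hypothetical helper).
# Scores at or below the threshold get probability 1.
gpd_survival <- function(x, loc, scale, shape) {
  z <- pmax(x - loc, 0) / scale
  if (abs(shape) < 1e-8) {
    exp(-z)
  } else {
    pmax(1 + shape * z, 0)^(-1 / shape)
  }
}

# Fit on the density scores, evaluate on the leave-one-out scores.
prob <- gpd_survival(loo_scores, threshold, scale, shape)
head(order(prob)) # indices of the most anomalous observations first
```

The model is fitted to the ordinary scores but evaluated at the leave-one-out scores, so an isolated point cannot mask itself by contributing to its own density estimate.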
Value
A numerical vector containing the lookout probabilities
Author(s)
Rob J Hyndman
References
Sevvandi Kandanaarachchi & Rob J Hyndman (2022) "Leave-one-out kernel density estimates for outlier detection", J Computational & Graphical Statistics, 31(2), 586-599. https://robjhyndman.com/publications/lookout/
Examples
library(weird)
library(dplyr) # provides tibble(), filter(), mutate() and arrange()

# Univariate data
tibble(
  y = c(5, rnorm(49)),
  lookout = lookout(y)
)
# Bivariate data with score calculation done outside the function
tibble(
  x = rnorm(50),
  y = c(5, rnorm(49)),
  fscores = density_scores(y),
  loo_fscores = density_scores(y, loo = TRUE),
  lookout = lookout(density_scores = fscores, loo_scores = loo_fscores)
)
# Using a regression model
of <- oldfaithful |> filter(duration < 7200, waiting < 7200)
fit_of <- lm(waiting ~ duration, data = of)
of |>
  mutate(lookout_prob = lookout(fit_of)) |>
  arrange(lookout_prob)