outliergram {roahd} | R Documentation |
Outliergram for univariate functional data sets
Description
This function computes the outliergram of a univariate functional data set, optionally adjusting the true positive rate of detected outliers under the assumption of Gaussianity.
Usage
outliergram(
  fData,
  MBD_data = NULL,
  MEI_data = NULL,
  p_check = 0.05,
  Fvalue = 1.5,
  adjust = FALSE,
  display = TRUE,
  xlab = NULL,
  ylab = NULL,
  main = NULL,
  ...
)
Arguments
fData |
the univariate functional dataset whose outliergram has to be determined. |
MBD_data |
a vector containing the MBD for each element of the dataset. If missing, MBDs are computed. |
MEI_data |
a vector containing the MEI for each element of the dataset. If not provided, MEIs are computed. |
p_check |
fraction of observations with either low or high MEI to be checked for outliers in the secondary step (shift towards the center of the dataset); the default 0.05 corresponds to 5%. |
Fvalue |
the value of the parameter F used to flag outliers in the outliergram. Default is 1.5. |
adjust |
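either FALSE, in which case Fvalue is used as given, or a list of parameters controlling the adjustment of F (for example N_trials, trial_size, TPR and VERBOSE; see Adjustment and Examples). |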
display |
either a logical value indicating whether you want the outliergram to be displayed, or the number of the graphical device where you want the outliergram to be displayed. |
xlab |
a list of two labels to use on the x axis when displaying the functional dataset and the outliergram. |
ylab |
a list of two labels to use on the y axis when displaying the functional dataset and the outliergram. |
main |
a list of two titles to be used on the plot of the functional dataset and the outliergram. |
... |
additional graphical parameters to be used only in the plot of the functional dataset. |
Value
Even when used graphically to plot the outliergram, the function returns a list containing:
Fvalue: the value of the parameter F used;
d: the vector of values of the parameter d for each observation (distance to the parabolic border of the outliergram);
ID_outliers: the vector of observation IDs corresponding to outliers.
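As a minimal sketch of how the returned list might be inspected, the snippet below builds a small synthetic dataset with the same roahd generators used in the Examples section and reads back the three components. The seed, the sizes and the variable names (n_obs, n_points, res) are illustrative assumptions, not part of the package.

```r
library(roahd)

set.seed(1)
n_obs <- 50      # illustrative number of curves
n_points <- 50   # illustrative grid size

grid <- seq(0, 1, length.out = n_points)
Cov <- exp_cov_function(grid, alpha = 0.2, beta = 0.8)
fD <- fData(
  grid,
  generate_gauss_fdata(N = n_obs, centerline = sin(2 * pi * grid), Cov = Cov)
)

# display = FALSE skips the plot; the list is returned either way
res <- outliergram(fD, display = FALSE)

res$Fvalue        # value of F actually used
head(res$d)       # distances to the parabolic border, one per observation
res$ID_outliers   # observations flagged as outliers (possibly none)
```

Keeping the result in a variable makes it possible to reuse the flagged IDs later, for instance to subset or re-plot the dataset.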
Adjustment
When the adjustment option is selected, the value of F is optimized for the univariate functional dataset provided with fData. In practice, a synthetic population of size adjust$trial_size, with the same covariance (robustly estimated from the data) and centerline as fData, is simulated without outliers adjust$N_trials times, and each time an optimized value F_i is computed so that a given proportion (adjust$TPR) of observations is flagged as outliers. The final value of F for the outliergram is the average of F_1, F_2, \ldots, F_{N_{trials}}. In each trial the optimization problem is solved using stats::uniroot (Brent's method).
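The averaging scheme described above can be illustrated with a toy sketch in base R. This is not roahd's internal code: scalar scores drawn with rexp stand in for the distances d of a simulated functional population, and the boxplot-style fence Q3 + F * IQR, the search interval c(0, 20) and all names (excess_rate, f_trials, f_adjusted) are assumptions made for illustration only.

```r
# Toy illustration only: scalar scores replace the simulated functional data.
set.seed(42)
n_trials   <- 10     # plays the role of adjust$N_trials
trial_size <- 1000   # plays the role of adjust$trial_size
tpr        <- 0.01   # plays the role of adjust$TPR

# Fraction of scores above the assumed fence Q3 + F * IQR, minus the target rate.
excess_rate <- function(f_value, scores, target) {
  q <- quantile(scores, c(0.25, 0.75), names = FALSE)
  fence <- q[2] + f_value * (q[2] - q[1])
  mean(scores > fence) - target
}

# One optimized F_i per simulated, outlier-free trial.
f_trials <- vapply(seq_len(n_trials), function(i) {
  scores <- rexp(trial_size)  # hypothetical stand-in for the distances d
  uniroot(
    function(f_value) excess_rate(f_value, scores, tpr),
    interval = c(0, 20)
  )$root
}, numeric(1))

# The final value is the average of the per-trial values.
f_adjusted <- mean(f_trials)
f_adjusted
```

The objective is a step function of F, so uniroot narrows down the point where the flagged fraction crosses the target rather than an exact zero; averaging over trials smooths out that granularity.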
References
Arribas-Gil, A., and Romo, J. (2014). Shape outlier detection and visualization for functional data: the outliergram, Biostatistics, 15(4), 603-619.
See Also
Examples
set.seed(1618)
N <- 200
P <- 200
N_extra <- 4
grid <- seq(0, 1, length.out = P)
Cov <- exp_cov_function(grid, alpha = 0.2, beta = 0.8)
Data <- generate_gauss_fdata(
  N = N,
  centerline = sin(4 * pi * grid),
  Cov = Cov
)
Data_extra <- array(0, dim = c(N_extra, P))
Data_extra[1, ] <- generate_gauss_fdata(
  N = 1,
  centerline = sin(4 * pi * grid + pi / 2),
  Cov = Cov
)
Data_extra[2, ] <- generate_gauss_fdata(
  N = 1,
  centerline = sin(4 * pi * grid - pi / 2),
  Cov = Cov
)
Data_extra[3, ] <- generate_gauss_fdata(
  N = 1,
  centerline = sin(4 * pi * grid + pi / 3),
  Cov = Cov
)
Data_extra[4, ] <- generate_gauss_fdata(
  N = 1,
  centerline = sin(4 * pi * grid - pi / 3),
  Cov = Cov
)
Data <- rbind(Data, Data_extra)
fD <- fData(grid, Data)
# Outliergram with default Fvalue = 1.5
outliergram(fD, display = TRUE)
# Outliergram with Fvalue enforced to 2.5
outliergram(fD, Fvalue = 2.5, display = TRUE)
# Outliergram with estimated Fvalue to ensure TPR of 1%
outliergram(
  fData = fD,
  adjust = list(
    N_trials = 10,
    trial_size = 5 * nrow(Data),
    TPR = 0.01,
    VERBOSE = FALSE
  ),
  display = TRUE
)