KRDetect.outliers.changepoint {envoutliers}  R Documentation 
Identification of outliers in environmental data using a method based on kernel smoothing, changepoint analysis of the smoothing residuals and subsequent analysis of the residuals on homogeneous segments (Campulova et al., 2018).
KRDetect.outliers.changepoint(x, perform.smoothing = TRUE, perform.cp.analysis = TRUE, bandwidth.type = "local", bandwidth.value = NULL, kernel.order = 2, cp.analysis.type = "parametric", pen.value = "5*log(n)", alpha.edivisive = 0.3, min.segment.length = 30, segment.length.for.merge = 15, method = "auto", prefer.grubbs = TRUE, alpha.default = NULL, L.default = NULL)
x 
data values. Supported data types

perform.smoothing 
a logical value specifying if data smoothing is performed. If 
perform.cp.analysis 
a logical value specifying if changepoint analysis is performed. If 
bandwidth.type 
a character string specifying the type of bandwidth. Possible options are

bandwidth.value 
a local bandwidth array (for 
kernel.order 
a nonnegative integer giving the order of the optimal kernel (Gasser et al., 1985) used for smoothing. Possible options are

cp.analysis.type 
a character string specifying the type of changepoint analysis. Possible options are

pen.value 
a character string giving the formula for the manual penalty used in the PELT algorithm.
Only required for 
alpha.edivisive 
a numeric value giving the moment index used for determining the distance between and within segments in nonparametric changepoint model. Default is 
min.segment.length 
a numeric value giving the minimal required number of observations on segments obtained from changepoint analysis.
If a segment contains less than 
segment.length.for.merge 
a numeric value giving minimal required number of observations on segments for performing the homogeneity test within changepoint split control.
A segment with less data than 
method 
a character string specifying the method for identification of outlier residuals. Possible options are

prefer.grubbs 
a logical value specifying whether the Grubbs test for identification of outlier residuals is preferred to quantiles of the normal distribution.

alpha.default 
a numeric value from interval (0,1) of alpha parameter determining the criterion for (residual) outlier detection:
the limits for outlier residuals on individual segments are set as +/- (alpha/2-quantile of the normal distribution with parameters corresponding to the residuals on the studied segment) * (sample standard deviation of the residuals on the corresponding segment).
If 
L.default 
a numeric value of L parameter determining the criterion for outlier (residual) detection:
the limits for outlier residuals on individual segments are set as +/- L * (sample standard deviation of the residuals on the corresponding segment).
If 
This function identifies outliers in time series using a procedure based on kernel smoothing, changepoint analysis of the smoothing residuals and subsequent analysis of the residuals on homogeneous segments (Campulova et al., 2018). Outlier residuals can be detected by three different approaches (Grubbs test, quantiles of the normal distribution, Chebyshev inequality), which are either selected automatically based on the data structure or specified by the user. The choice of the parameters alpha (quantiles of the normal distribution) and L (Chebyshev inequality), which define the criterion for outlier detection, is crucial for the method. These values can be specified by the user or estimated automatically using data-driven algorithms (Campulova et al., 2018).
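The two limit-based criteria described above can be illustrated with a minimal base R sketch for a single segment. The helper flag_outlier_residuals() is hypothetical and not part of envoutliers. It assumes residuals centred at zero and reads the alpha criterion as the standard normal quantile qnorm(1 - alpha/2) multiplied by the sample standard deviation. The package's own rule, and its data-driven estimation of alpha and L, may differ.

```r
# Hypothetical helper (not part of envoutliers): flag residuals on one segment
# that fall outside symmetric limits +/- k * sd(res).
# Assumption: k = qnorm(1 - alpha/2) for the normal-quantile criterion,
#             k = L for the Chebyshev-type criterion.
flag_outlier_residuals <- function(res, alpha = NULL, L = NULL) {
  if (is.null(alpha) == is.null(L)) {
    stop("specify exactly one of 'alpha' and 'L'")
  }
  s <- sd(res)                                   # sample standard deviation
  k <- if (!is.null(alpha)) qnorm(1 - alpha / 2) else L
  abs(res) > k * s                               # TRUE marks an outlier residual
}

# Deterministic toy residuals: 100 values of -1/+1 and one extreme value
res <- c(rep(c(-1, 1), 50), 10)

which(flag_outlier_residuals(res, alpha = 0.05))  # normal-quantile limits
which(flag_outlier_residuals(res, L = 5))         # Chebyshev-type limits
```

Both calls flag only the last observation in this toy example. A smaller alpha or a larger L widens the limits and makes the criterion more conservative.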
A "KRDetect" object which contains a list with elements:
method.type 
a character string giving the type of method used for outlier identification 
x 
a numeric vector of observations 
index 
a numeric vector of index design points assigned to individual observations 
smoothed 
a numeric vector of estimates of the kernel regression function (smoothed data) 
changepoints 
an integer membership vector for individual segments 
normality.results 
a data.frame of normality results of residuals on individual segments 
detection.method 
a character string giving the type of method used for identification of outlier residuals 
alpha 
a numeric vector of alpha parameters used for outlier identification on individual segments 
L 
a numeric vector of L parameters used for outlier identification on individual segments 
outlier 
a logical vector specifying the identified outliers, 
Campulova M, Michalek J, Mikuska P, Bokal D (2018). Nonparametric algorithm for identification of outliers in environmental data. Journal of Chemometrics, 32, 453–463.
Gasser T, Kneip A, Kohler W (1991). A flexible and fast method for automatic smoothing. Journal of the American Statistical Association, 86, 643–652.
Herrmann E (1997). Local bandwidth choice in kernel regression estimation. Journal of Computational and Graphical Statistics, 6(1), 35–54.
Eva Herrmann; Packaged for R and enhanced by Martin Maechler (2016). lokern: Kernel Regression Smoothing with Local or Global Plug-in Bandwidth. R package version 1.1-8. https://CRAN.R-project.org/package=lokern.
Killick R, Fearnhead P, Eckley IA (2012). Optimal detection of changepoints with a linear computational cost. Journal of the American Statistical Association, 107(500), 1590–1598.
Killick R, Haynes K, Eckley IA (2016). changepoint: An R package for changepoint analysis. R package version 2.2.2. https://CRAN.R-project.org/package=changepoint.
Matteson D, James N (2014). A Nonparametric Approach for Multiple Change Point Analysis of Multivariate Data. Journal of the American Statistical Association, 109(505), 334–345.
James NA, Matteson DS (2014). ecp: An R Package for Nonparametric Multiple Change Point Analysis of Multivariate Data. Journal of Statistical Software, 62(7), 1–25. URL http://www.jstatsoft.org/v62/i07/.
Brys G, Hubert M, Struyf A (2008). Goodness-of-fit tests based on a robust measure of skewness. Computational Statistics, 23(3), 429–442.
Todorov V, Filzmoser P (2009). An Object-Oriented Framework for Robust Multivariate Analysis. Journal of Statistical Software, 32(3), 1–47. URL http://www.jstatsoft.org/v32/i03/.
Box G, Cox D (1964). An analysis of transformations. Journal of the Royal Statistical Society: Series B, 26, 211–234.
Venables WN, Ripley BD (2002). Modern Applied Statistics with S. New York, fourth edition. ISBN 0-387-95457-0, URL http://www.stats.ox.ac.uk/pub/MASS4.
Grubbs F (1950). Sample criteria for testing outlying observations. The Annals of Mathematical Statistics, 21(1), 27–58.
Fox J (2016). Applied regression analysis and generalized linear models. 3rd edition. Los Angeles: SAGE. ISBN 9781452205663.
data("mydata", package = "openair")
x = mydata$o3[format(mydata$date, "%m %Y") == "12 2002"]
result = KRDetect.outliers.changepoint(x)
summary(result)
plot(result)
plot(result, show.segments = FALSE)
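The flagged observations can be collected from the returned object with a short sketch like the one below. The helper outlier_table() is hypothetical and not part of envoutliers. It relies only on the elements x, index and outlier documented above, and assumes they can be accessed with `$` as in an ordinary list. It is demonstrated on a mock list so that it runs without the package installed.

```r
# Hypothetical helper (not part of envoutliers): tabulate flagged observations.
# Assumption: 'result' is a list-like object with the documented elements
# 'x', 'index' and 'outlier'.
outlier_table <- function(result) {
  flagged <- which(result$outlier)
  data.frame(index = result$index[flagged],
             value = result$x[flagged])
}

# Mock object mimicking the documented structure (for illustration only)
mock <- list(x = c(1, 2, 50, 3),
             index = 1:4,
             outlier = c(FALSE, FALSE, TRUE, FALSE))
tab <- outlier_table(mock)
tab
```

With the object from the example above, the call would be outlier_table(result), provided the returned object behaves like a plain list.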