| KSWIN {datadriftR} | R Documentation |
KSWIN (Kolmogorov-Smirnov WINdowing) for Change Detection
Description
Implements KSWIN, a change detector that applies the two-sample Kolmogorov-Smirnov (KS) test to a sliding window of streaming data. The test is non-parametric: it compares two samples taken from the window and checks whether they come from the same distribution.
Details
KSWIN is effective for detecting changes in the underlying distribution of data streams. It is particularly useful in scenarios where data properties may evolve over time, allowing for early detection of changes that might affect subsequent data processing.
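The windowed comparison can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper name kswin_check, the choice to sample the reference values without replacement, and the plain p-value threshold are all assumptions made for illustration. Only stats::ks.test is a real function here.

```r
# Hypothetical helper (not part of datadriftR): one KSWIN-style check on a full window.
kswin_check <- function(window, stat_size = 30, alpha = 0.005) {
  n <- length(window)
  stopifnot(n >= 2 * stat_size)
  recent <- window[(n - stat_size + 1):n]   # newest stat_size values
  older <- window[seq_len(n - stat_size)]   # everything that came before them
  reference <- sample(older, stat_size)     # assumed: sampled without replacement
  test <- suppressWarnings(stats::ks.test(reference, recent))
  list(p_value = test$p.value, change_detected = test$p.value < alpha)
}

set.seed(42)
stable <- rnorm(100)                          # one distribution throughout
shifted <- c(rnorm(70), rnorm(30, mean = 3))  # newest 30 values shifted upward

res_stable <- kswin_check(stable)
res_shifted <- kswin_check(shifted)
print(res_stable$p_value)
print(res_shifted$change_detected)
```

With a shift this large, the second check should flag a change. On the stable window the p-value varies with the random reference sample, which is why alpha is usually kept small.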
Public fields
alpha: Significance level for the KS test.
window_size: Total size of the data window used for testing.
stat_size: Number of data points sampled from the window for the KS test.
window: Current data window used for change detection.
change_detected: Boolean flag indicating whether a change has been detected.
p_value: P-value of the most recent KS test.
Methods
Public methods
Method new()
Initializes the KSWIN detector with specific settings.
Usage
KSWIN$new(alpha = 0.005, window_size = 100, stat_size = 30, data = NULL)
Arguments
alpha: The significance level for the KS test.
window_size: The size of the data window for change detection.
stat_size: The number of samples in the statistical test window.
data: Initial data to populate the window, if provided.
Method reset()
Resets the internal state of the detector to its initial conditions.
Usage
KSWIN$reset()
Method add_element()
Adds a new element to the data window and updates the detection status based on the KS test.
Usage
KSWIN$add_element(x)
Arguments
x: The new data value to add to the window.
Method detected_change()
Checks if a change has been detected based on the most recent KS test.
Usage
KSWIN$detected_change()
Returns
Boolean indicating whether a change was detected.
Method clone()
The objects of this class are cloneable with this method.
Usage
KSWIN$clone(deep = FALSE)
Arguments
deep: Whether to make a deep clone.
References
Christoph Raab, Moritz Heusinger, Frank-Michael Schleif, Reactive Soft Prototype Computing for Concept Drift Streams, Neurocomputing, 2020.
Implementation: https://github.com/scikit-multiflow/scikit-multiflow/blob/a7e316d1cc79988a6df40da35312e00f6c4eabb2/src/skmultiflow/drift_detection/kswin.py
Examples
set.seed(123) # Setting a seed for reproducibility
data_part1 <- sample(c(0, 1), size = 100, replace = TRUE, prob = c(0.7, 0.3))
# Introduce a change in data distribution
data_part2 <- sample(c(0, 1), size = 100, replace = TRUE, prob = c(0.3, 0.7))
# Combine the two parts
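data_stream <- c(data_part1, data_part2)
# Feed the stream to a detector with default settings and report detections
kswin <- KSWIN$new()
for (i in seq_along(data_stream)) {
  kswin$add_element(data_stream[i])
  if (kswin$detected_change()) message("Change detected at index ", i)
}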