HDDM_A {datadriftR}    R Documentation

HDDM_A: Drift Detection Method based on Adaptive Windows

Description

This class implements the HDDM_A drift detection method, which detects changes in the mean of a data stream using adaptive windows and Hoeffding's bounds (Frías-Blanco et al., 2014). It processes the stream online, one value at a time, and can detect increases or decreases in the process mean without assuming a particular data distribution.

Details

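HDDM_A adapts to changes in the data stream by adjusting its internal windows to track two reference points: one where the process mean was at its minimum and one where it was at its maximum. It signals a warning or a drift when the current mean departs from these reference points by more than the configured confidence level allows.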
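As a rough illustration of why such a detector becomes more sensitive as samples accumulate, the sketch below computes a one-sided Hoeffding bound for the mean of n values in [0, 1]. The function name hoeffding_bound is hypothetical and not part of datadriftR, and the package's internal formula may differ in detail.

```r
# Hypothetical helper (not part of datadriftR): one-sided Hoeffding bound
# for the mean of n independent values that lie in [0, 1].
hoeffding_bound <- function(n, confidence) {
  sqrt(log(1 / confidence) / (2 * n))
}

# The bound shrinks as n grows, so smaller shifts in the mean
# become distinguishable from noise.
n_values <- c(10, 100, 1000)
bounds <- hoeffding_bound(n_values, confidence = 0.001)
print(data.frame(n = n_values, bound = round(bounds, 4)))
```

A smaller confidence value gives a larger bound, which is why lowering drift_confidence makes detection more conservative.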

Public fields

drift_confidence

Confidence level for drift detection (default 0.001). Smaller values make detection more conservative.

warning_confidence

Confidence level for warning detection (default 0.005).

two_side_option

Logical flag selecting two-sided (TRUE, monitor both increases and decreases in the mean) or one-sided (FALSE) monitoring.

total_n

Total number of samples seen.

total_c

Total cumulative sum of the samples.

n_max

Maximum window end for sample count.

c_max

Maximum window end for cumulative sum.

n_min

Minimum window start for sample count.

c_min

Minimum window start for cumulative sum.

n_estimation

Number of samples since the last detected change.

c_estimation

Cumulative sum since the last detected change.

change_detected

Logical indicating whether a change (drift) has been detected.

warning_detected

Logical indicating whether a warning has been detected.

estimation

Current estimated mean of the stream.

delay

Current delay since the last update.

Methods

Public methods


Method new()

Initializes the HDDM_A detector with specific settings.

Usage
HDDM_A$new(
  drift_confidence = 0.001,
  warning_confidence = 0.005,
  two_side_option = TRUE
)
Arguments
drift_confidence

Confidence level for drift detection. Default is 0.001.

warning_confidence

Confidence level for issuing warnings. Default is 0.005.

two_side_option

Logical. Whether to monitor both increases and decreases in the mean. Default is TRUE.
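A minimal sketch of constructing a detector with non-default settings follows. The specific values are illustrative assumptions, not recommendations; the call is guarded so the snippet also runs where datadriftR is not installed.

```r
# Illustrative settings (assumed values, chosen only for this example).
# Mirroring the defaults (0.001 vs 0.005), the warning level is kept
# looser than the drift level.
settings <- list(
  drift_confidence = 0.0001,
  warning_confidence = 0.001,
  two_side_option = FALSE
)
stopifnot(settings$drift_confidence < settings$warning_confidence)

if (requireNamespace("datadriftR", quietly = TRUE)) {
  # Equivalent to HDDM_A$new(drift_confidence = 0.0001, ...)
  detector <- do.call(datadriftR::HDDM_A$new, settings)
  print(detector$drift_confidence)
}
```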


Method add_element()

Adds an element to the data stream and updates the detection status.

Usage
HDDM_A$add_element(prediction)
Arguments
prediction

Numeric, the new data value to add.


Method mean_incr()

Checks whether the mean has increased significantly relative to the recorded minimum.

Usage
HDDM_A$mean_incr(c_min, n_min, total_c, total_n, confidence)
Arguments
c_min

Minimum cumulative sum.

n_min

Minimum count of samples.

total_c

Total cumulative sum.

total_n

Total number of samples.

confidence

Confidence threshold for detection.
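The following sketch shows one way a test of this shape can be written, modeled on the Hoeffding-style comparison described in the referenced paper. The function name mean_increased and the exact form of the bound are assumptions for illustration; they are not taken from the datadriftR source.

```r
# Hypothetical stand-alone test (not the package's method): compares the
# overall mean with the mean at the minimum reference point and reports
# TRUE when the gap exceeds a Hoeffding-style bound.
mean_increased <- function(c_min, n_min, total_c, total_n, confidence) {
  # No samples after the reference point: nothing to compare.
  if (n_min == total_n) {
    return(FALSE)
  }
  m <- (total_n - n_min) / (n_min * total_n)
  bound <- sqrt(m / 2 * log(2 / confidence))
  (total_c / total_n - c_min / n_min) >= bound
}

# Mean rises from 0.30 (first 100 samples) to 0.50 overall.
shifted <- mean_increased(c_min = 30, n_min = 100,
                          total_c = 100, total_n = 200, confidence = 0.001)

# Mean barely moves: 0.30 versus 0.31 overall.
stable <- mean_increased(c_min = 30, n_min = 100,
                         total_c = 62, total_n = 200, confidence = 0.001)

print(c(shifted = shifted, stable = stable))
```

A test for a decrease can mirror this one by comparing against the maximum reference point and reversing the sign of the difference.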


Method mean_decr()

Checks whether the mean has decreased significantly relative to the recorded maximum.

Usage
HDDM_A$mean_decr(c_max, n_max, total_c, total_n)
Arguments
c_max

Maximum cumulative sum.

n_max

Maximum count of samples.

total_c

Total cumulative sum.

total_n

Total number of samples.


Method reset()

Resets all internal counters and accumulators to their initial state.

Usage
HDDM_A$reset()

Method update_estimations()

Updates estimations of the mean after detecting changes.

Usage
HDDM_A$update_estimations()

Method clone()

The objects of this class are cloneable with this method.

Usage
HDDM_A$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.
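As a hedged usage sketch, clone() can be combined with reset() to keep a snapshot of a detector's state before clearing it. The short input stream is made up for illustration, and the snippet is guarded so it only touches the detector when datadriftR is installed.

```r
# Made-up binary stream, used only for this illustration.
stream <- c(0, 1, 0, 0, 1)

if (requireNamespace("datadriftR", quietly = TRUE)) {
  detector <- datadriftR::HDDM_A$new()
  for (x in stream) {
    detector$add_element(x)
  }

  # Copy the detector, then reset the original.
  snapshot <- detector$clone()
  detector$reset()

  # Inspect the sample counters of the copy and of the reset original.
  print(snapshot$total_n)
  print(detector$total_n)
}
```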

References

Frías-Blanco I, del Campo-Ávila J, Ramos-Jimenez G, et al. Online and non-parametric drift detection methods based on Hoeffding’s bounds. IEEE Transactions on Knowledge and Data Engineering, 2014, 27(3): 810-823.

Bifet A, Holmes G, Kirkby R, Pfahringer B. MOA: Massive Online Analysis. Journal of Machine Learning Research, 2010, 11: 1601-1604.

Implementation: https://github.com/scikit-multiflow/scikit-multiflow/blob/a7e316d1cc79988a6df40da35312e00f6c4eabb2/src/skmultiflow/drift_detection/hddm_a.py

Examples

library(datadriftR)

set.seed(123)  # Setting a seed for reproducibility
data_part1 <- sample(c(0, 1), size = 100, replace = TRUE, prob = c(0.7, 0.3))

# Introduce a change in data distribution
data_part2 <- sample(c(0, 1), size = 100, replace = TRUE, prob = c(0.3, 0.7))

# Combine the two parts
data_stream <- c(data_part1, data_part2)

# Initialize the hddm_a object
hddm_a_instance <- HDDM_A$new()

# Iterate through the data stream
for(i in seq_along(data_stream)) {
  hddm_a_instance$add_element(data_stream[i])
  if(hddm_a_instance$warning_detected) {
    message(paste("Warning detected at index:", i))
  }
  if(hddm_a_instance$change_detected) {
    message(paste("Concept drift detected at index:", i))
  }
}

[Package datadriftR version 0.0.1 Index]