NearZeroVariance {LOGANTree}    R Documentation

Flag the features that have (near) zero variance

Description

Flags the features that have zero or near-zero variance and reports each feature's frequency ratio and percentage of unique values.

Usage

NearZeroVariance(data)

Arguments

data

A dataset containing the study’s features.

Value

This function returns a data frame with one row per feature, containing the feature name, its frequency ratio, its percentage of unique values, and logical values indicating whether the feature has zero variance or near-zero variance.

feature : name of the feature.

flag.zv (Flag Zero Variance) : TRUE/FALSE, flagging whether the feature has zero variance.

fr (Frequency Ratio) : the frequency of the most common value divided by the frequency of the second most common value.

puv (Percentage of Unique Values) : the number of unique values divided by the total number of samples.

flag.nzv (Flag Near Zero Variance) : TRUE/FALSE, flagging whether the feature has near-zero variance.
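The two statistics above can be sketched in base R as follows. This is an illustration of the definitions only, not the LOGANTree implementation: the helper name nzv_sketch and the cut-offs fr_cut and puv_cut are assumptions introduced here (the defaults follow the common rule of thumb of a frequency ratio of at least 20 and at most 10% unique values), and the package may use different thresholds or handle missing values differently.

```r
# Hypothetical helper illustrating the documented statistics; NOT the package's code.
# fr_cut and puv_cut are assumed thresholds, not values taken from LOGANTree.
nzv_sketch <- function(data, fr_cut = 20, puv_cut = 0.10) {
  stats <- lapply(data, function(x) {
    # Frequencies of each distinct value, largest first (NA values are not counted).
    counts <- sort(table(x), decreasing = TRUE)
    n_unique <- length(counts)
    # With fewer than two distinct values there is no second frequency, so fr is NA.
    fr <- if (n_unique > 1) counts[[1]] / counts[[2]] else NA_real_
    # Number of unique values divided by the total number of samples.
    puv <- n_unique / length(x)
    c(fr = fr, puv = puv, n_unique = n_unique)
  })
  stats <- do.call(rbind, stats)
  flag_zv <- stats[, "n_unique"] <= 1
  data.frame(
    feature  = names(data),
    flag.zv  = flag_zv,
    fr       = stats[, "fr"],
    puv      = stats[, "puv"],
    # Assumption: zero-variance features are reported only through flag.zv.
    flag.nzv = !flag_zv & stats[, "fr"] >= fr_cut & stats[, "puv"] <= puv_cut,
    row.names = NULL
  )
}

# Toy data: one constant column, one dominated by a single value, one fully varied.
toy <- data.frame(
  constant = rep(1, 100),
  rare     = c(rep(0, 98), 1, 1),
  varied   = 1:100
)
res <- nzv_sketch(toy)
print(res)
```

In the toy data, the rare column has a frequency ratio of 98 / 2 = 49 and only 2 unique values in 100 samples, so it is the only one flagged as near-zero variance under the assumed cut-offs.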

References

Boehmke, B., & Greenwell, B. M. (2019). Hands-on machine learning with R. CRC Press, pp. 52-55. https://doi.org/10.1201/9780367816377

Examples

library(LOGANTree)
NearZeroVariance(training)
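A typical next step is to drop the flagged features before modelling. The sketch below shows one way to do that using the documented column names. To stay self-contained it uses a hand-made stand-in (nzv_result) instead of real output from NearZeroVariance(), and a made-up dataset (dataset); both objects and all of their values are illustrative assumptions.

```r
# Stand-in for the data frame returned by NearZeroVariance();
# the values are made up for illustration only.
nzv_result <- data.frame(
  feature  = c("f1", "f2", "f3"),
  flag.zv  = c(TRUE, FALSE, FALSE),
  fr       = c(NA, 49, 1),
  puv      = c(0.01, 0.02, 1),
  flag.nzv = c(FALSE, TRUE, FALSE)
)

# Made-up dataset whose columns match the feature names above.
dataset <- data.frame(
  f1 = rep(1, 100),
  f2 = c(rep(0, 98), 1, 1),
  f3 = 1:100
)

# Collect the features flagged as zero or near-zero variance.
to_drop <- nzv_result$feature[nzv_result$flag.zv | nzv_result$flag.nzv]

# Keep only the unflagged columns; drop = FALSE preserves the data frame structure.
reduced <- dataset[, !(names(dataset) %in% to_drop), drop = FALSE]
print(names(reduced))
```

Whether near-zero-variance features should actually be removed depends on the model and the study; the flags only identify candidates.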

[Package LOGANTree version 0.1.1 Index]