remove_empty_features {SmartMeterAnalytics} | R Documentation
Removes variables with no necessary information from a data.frame
Description
Removes from a list of variable names those variables that contain only, or a large portion of, NA values, or that have zero bandwidth (if they are numeric), and returns the remaining variable names.
Usage
remove_empty_features(
all.features,
dataset,
percentage_NA_allowed = NA,
bandwidth = (.Machine$double.eps^0.5),
verbose = FALSE
)
Arguments
all.features |
a character vector with all column names of |
dataset |
the dataset as a data.frame |
percentage_NA_allowed |
the percentage of missing values per vector that is allowed without removing the feature. All features whose share of NA values is higher than this level are excluded. |
bandwidth |
The length of the interval that the values of a variable must exceed for the variable not to be removed. By default, .Machine$double.eps^0.5 (the square root of the machine epsilon, see Usage) |
verbose |
boolean; whether debug messages should be printed when a variable is removed from the list (uses the futile.logger package) |
Details
The function checks all given column names for the portion of NA values.
If the share of NA or Inf values exceeds percentage_NA_allowed, the column
name is removed from the variable set. In addition, all numeric variables
are checked for their bandwidth, and those with almost zero bandwidth are
removed as well.
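The two checks described above can be sketched in base R as follows. This is an illustrative re-implementation, not the package source: the name sketch_remove_empty is hypothetical, "bandwidth" is assumed to mean max minus min of the finite values, percentage_NA_allowed is assumed to be a fraction between 0 and 1, and the default NA is assumed to mean that only columns without any usable value are dropped by the missing-value check.

```r
# Illustrative sketch only, NOT the SmartMeterAnalytics implementation.
# Assumptions: bandwidth = max - min of the finite values;
# percentage_NA_allowed is a fraction in [0, 1]; NA disables the threshold.
sketch_remove_empty <- function(all.features, dataset,
                                percentage_NA_allowed = NA,
                                bandwidth = .Machine$double.eps^0.5) {
  keep <- vapply(all.features, function(feature) {
    x <- dataset[[feature]]
    # Count NA/NaN as missing; for numeric columns also count +/-Inf.
    missing <- is.na(x)
    if (is.numeric(x)) missing <- missing | is.infinite(x)
    share_missing <- mean(missing)
    # Columns with no usable value at all are always dropped.
    if (length(x) == 0 || share_missing == 1) return(FALSE)
    # Optional threshold on the share of missing values.
    if (!is.na(percentage_NA_allowed) &&
        share_missing > percentage_NA_allowed) return(FALSE)
    # Numeric columns must span more than `bandwidth`.
    if (is.numeric(x) && diff(range(x[!missing])) <= bandwidth) return(FALSE)
    TRUE
  }, logical(1))
  all.features[keep]
}

# Toy data; the column names are made up for illustration.
toy <- data.frame(
  consumption = c(1.2, 3.4, 2.2, 5.0, 4.1),
  all_missing = rep(NA_real_, 5),
  constant    = rep(7, 5),
  mostly_na   = c(NA, NA, NA, 2.0, 4.1),
  label       = letters[1:5]
)

kept_default <- sketch_remove_empty(names(toy), toy)
# "consumption" "mostly_na" "label"
kept_strict <- sketch_remove_empty(names(toy), toy, percentage_NA_allowed = 0.5)
# "consumption" "label"
```

In this sketch the non-numeric column label is only subject to the missing-value check, which matches the statement that the bandwidth check applies to numeric variables.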
Value
a vector of the variable names that are not considered empty
Author(s)
Konstantin Hopf konstantin.hopf@uni-bamberg.de
See Also
naInf_omit, replaceNAsFeatures
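Examples
A minimal call, using only the arguments listed under Usage and leaving the other arguments at their defaults. The data frame and its column names are invented for illustration, and the call is guarded so that it only runs when the package is installed; no output is shown because the result depends on the package version.

```r
# Toy data; the column names are hypothetical.
example_data <- data.frame(
  consumption = c(1.2, 3.4, 2.2, 5.0),
  all_missing = rep(NA_real_, 4),
  constant    = rep(7, 4)
)

# Only call the function when SmartMeterAnalytics is available.
if (requireNamespace("SmartMeterAnalytics", quietly = TRUE)) {
  kept <- SmartMeterAnalytics::remove_empty_features(
    all.features = names(example_data),
    dataset      = example_data
  )
  print(kept)
}
```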