fast_filter_variables {dataPreparation} | R Documentation |
Filtering useless variables
Description
Delete columns of your data_set that are constant, duplicated, or (depending on level) redundant with another column.
Usage
fast_filter_variables(
data_set,
level = 3,
keep_cols = NULL,
verbose = TRUE,
...
)
Arguments
data_set |
Matrix, data.frame or data.table |
level |
Which columns to filter: 1 = constant columns; 2 = constant and duplicated columns; 3 = constant, duplicated, and bijection columns; 4 = constant, duplicated, bijection, and included columns (numeric, defaults to 3) |
keep_cols |
Columns that must not be dropped (list of character, defaults to NULL) |
verbose |
Should the function log its progress? (logical, or 1 or 2; defaults to TRUE) |
... |
Optional parameters to be passed to the function when it is called from another function |
Details
Set verbose to 2 to get full details from the functions called internally; otherwise they do not log. (verbose = 1 is equivalent to verbose = TRUE.)
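The four values of level can be illustrated with a few base-R checks. This is only a sketch of the concepts: the helper names (is_constant, is_double, is_bijection, is_included) and the toy data are hypothetical, they are not the functions dataPreparation uses internally, and the reading of "included" as "carries no information beyond another column" is an assumption.

```r
# Illustrative base-R checks; NOT the implementation used by dataPreparation.

# Level 1: a column is constant when it holds a single distinct value.
is_constant <- function(x) length(unique(x)) <= 1

# Level 2: two columns are doubles when they hold the same values row by row.
is_double <- function(x, y) identical(x, y)

# Level 3: two columns are bijections when each value of one maps to exactly
# one value of the other, in both directions.
is_bijection <- function(x, y) {
  n_pairs <- nrow(unique(data.frame(x = x, y = y)))
  n_pairs == length(unique(x)) && n_pairs == length(unique(y))
}

# Level 4 (assumed meaning): y is included in x when y can be derived from x,
# i.e. each value of x maps to a single value of y.
is_included <- function(y, x) {
  nrow(unique(data.frame(x = x, y = y))) == length(unique(x))
}

# Hypothetical toy data with one column of each kind.
toy <- data.frame(
  constant = c(1, 1, 1, 1),
  amount   = c(1.2, 2.7, 1.2, 3.4)
)
toy$amount_copy  <- toy$amount               # double of amount
toy$amount_label <- paste0("v", toy$amount)  # bijection of amount
toy$amount_round <- round(toy$amount)        # included in amount

is_constant(toy$constant)
is_double(toy$amount, toy$amount_copy)
is_bijection(toy$amount, toy$amount_label)
is_included(toy$amount_round, toy$amount)
is_bijection(toy$amount, toy$amount_round)
```

In this toy data the rounded column is included in amount but is not a bijection of it, because two distinct amounts round to the same value. Under the assumed definitions, that is the kind of column only level = 4 would target.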
Value
The same data_set with fewer columns. Columns that are constant, duplicated, or bijections of another column have been deleted.
Examples
# First let's build a data.frame with a constant column, a duplicated column, and a bijection column
## Not run:
df <- data.frame(col1 = 1, col2 = rnorm(1e6), col3 = sample(c(1, 2), 1e6, replace = TRUE))
df$col4 <- df$col2
df$col5[df$col3 == 1] = "a"
df$col5[df$col3 == 2] = "b" # Same information as col3, but with "a" for 1 and "b" for 2
head(df)
# Let's filter columns:
df <- fast_filter_variables(df)
head(df)
## End(Not run)
# Not run on CRAN; you can run this example yourself
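A smaller example may help show how level and keep_cols interact. This is a sketch that assumes the dataPreparation package is installed. The column names are hypothetical, and keep_cols is given as a plain character vector.

```r
library(dataPreparation)

# Hypothetical toy data: one constant column and one duplicated column.
df_small <- data.frame(
  id       = c("a", "b", "a", "c"),
  constant = 1,
  x        = c(0.5, 1.2, 3.1, 4.8)
)
df_small$x_copy <- df_small$x

# level = 2 filters constant and duplicated columns only;
# keep_cols protects "constant" from being dropped.
filtered <- fast_filter_variables(
  df_small,
  level = 2,
  keep_cols = "constant",
  verbose = FALSE
)
names(filtered)
```

With these settings the constant column should be kept because it is listed in keep_cols, while one of the two identical x columns should be removed.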