sparsify {mltools} | R Documentation
Sparsify
Description
Convert a data.table object into a sparse matrix (with the same number of rows).
Usage
sparsify(dt, sparsifyNAs = FALSE, naCols = "none")
Arguments
dt | A data.table object.
sparsifyNAs | Should NAs be converted to 0s and sparsified? Defaults to FALSE.
naCols | Accepts "none" (the default), "identify", or "efficient", as used in the Examples.
Details
Converts a data.table object to a sparse matrix (class "dgCMatrix"). Requires the Matrix package. All sparsified data is assumed to take on the value 0/FALSE.
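To make "sparsified" concrete, the following minimal sketch uses only the Matrix package, not sparsify() itself. It illustrates how a "dgCMatrix" stores values, under the assumption that a plain numeric matrix stands in for a data.table column set; the names dense, dense0, sp and sp0 are hypothetical.

```r
library(Matrix)

# Dense numeric matrix: column 1 holds an NA, column 2 is mostly zeros
dense <- matrix(c(1, NA, 3, 0,
                  0, 2, 0, 0), nrow = 4, ncol = 2)

# Matrix() with sparse = TRUE returns a compressed, column-oriented sparse matrix
sp <- Matrix(dense, sparse = TRUE)
class(sp)

# Zeros are implicit (not stored); the NA is stored explicitly because NA is not 0
stored <- sp@x

# Replacing NAs with 0 beforehand lets them drop out of storage as well
dense0 <- dense
dense0[is.na(dense0)] <- 0
sp0 <- Matrix(dense0, sparse = TRUE)
stored0 <- sp0@x
```

Comparing length(stored) with length(stored0) shows why treating NAs as 0s reduces the number of explicitly stored entries.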
Data Type | Description & NA handling
numeric | If sparsifyNAs = FALSE, only 0s will be sparsified. If sparsifyNAs = TRUE, 0s and NAs will be sparsified.
factor (unordered) | Each level will generate a sparsified binary column. Column names are feature_level, e.g. "color_red", "color_blue".
factor (ordered) | Levels are converted to numeric values from 1 to NLevels. If sparsifyNAs = FALSE, NAs will remain as NAs. If sparsifyNAs = TRUE, NAs will be sparsified.
logical | TRUE and FALSE values will be converted to 1s and 0s. If sparsifyNAs = FALSE, only FALSEs will be sparsified. If sparsifyNAs = TRUE, FALSEs and NAs will be sparsified.
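The factor rows of the table can be illustrated with a short sketch built on base R and Matrix::sparseMatrix. This is not the mltools implementation, only an approximation of the encoding described above; the variables color, size and onehot are hypothetical, and the unordered factor deliberately contains no NAs.

```r
library(Matrix)

# Unordered factor: one binary column per level, named feature_level
color <- factor(c("red", "blue", "red", "green"))
onehot <- sparseMatrix(
  i = seq_along(color),            # one row per observation
  j = as.integer(color),           # column index = level index
  x = rep(1, length(color)),       # a 1 marks the observed level
  dims = c(length(color), nlevels(color)),
  dimnames = list(NULL, paste0("color_", levels(color)))
)
colnames(onehot)

# Ordered factor: levels map to the integers 1..NLevels, NAs stay NA
size <- factor(c("a", "b", NA, "b"), levels = c("a", "b", "c"), ordered = TRUE)
size_num <- as.integer(size)
```

Each row of onehot contains exactly one stored 1, so the remaining level columns cost no storage.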
Examples
library(data.table)
library(Matrix)
library(mltools)
dt <- data.table(
  intCol=c(1L, NA_integer_, 3L, 0L),
  realCol=c(NA, 2, NA, NA),
  logCol=c(TRUE, FALSE, TRUE, FALSE),
  ofCol=factor(c("a", "b", NA, "b"), levels=c("a", "b", "c"), ordered=TRUE),
  ufCol=factor(c("a", NA, "c", "b"), ordered=FALSE)
)
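# Default settings: sparsifyNAs = FALSE, naCols = "none"
sparsify(dt)
# Convert NAs to 0s so they are sparsified too
sparsify(dt, sparsifyNAs=TRUE)
# Non-default naCols settings on a single column that contains NAs
sparsify(dt[, list(realCol)], naCols="identify")
sparsify(dt[, list(realCol)], naCols="efficient")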