encode_median {categoryEncodings} | R Documentation
Encode a given factor variable using median encoding
Description
Transforms the original design matrix using a median encoding.
Usage
encode_median(X, fact, keep_factor = FALSE, encoding_only = FALSE)
Arguments
X | The data.frame or data.table to transform.
fact | The factor variable to encode by: either a positive integer giving the column number, or the column name.
keep_factor | Whether to keep the original factor column. Defaults to FALSE.
encoding_only | Whether to return only the new columns instead of the full transformed dataset. Defaults to FALSE, which returns the full dataset.
Details
To the author's best knowledge, this encoding has little theoretical grounding. Feel free to try it and publish the results if they turn out to be interesting on a particular problem.
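As a rough illustration of what a median encoding could look like, the sketch below uses base R only. It replaces the factor column with the per-level medians of each numeric column. This is an assumption about the general idea and not the package's actual implementation. The helper name `median_encode_sketch` and the `_median` column suffix are hypothetical.

```r
# Hypothetical helper (not part of categoryEncodings): attach, for every row,
# the median of each numeric column within that row's factor level.
median_encode_sketch <- function(X, fact) {
  X <- as.data.frame(X)
  # Numeric columns to summarise, excluding the factor column itself
  num_cols <- setdiff(names(X)[vapply(X, is.numeric, logical(1))], fact)
  # One row per factor level, one median per numeric column
  meds <- aggregate(X[num_cols], by = list(level = X[[fact]]), FUN = median)
  names(meds)[-1] <- paste0(num_cols, "_median")
  # Look up each row's level and attach its medians, preserving row order
  enc <- meds[match(X[[fact]], meds$level), -1, drop = FALSE]
  rownames(enc) <- NULL
  # Drop the factor column and append the encoded columns
  cbind(X[setdiff(names(X), fact)], enc)
}

toy <- data.frame(x = c(1, 3, 5, 10, 20), g = c("a", "a", "a", "b", "b"))
encoded <- median_encode_sketch(toy, "g")
```

In this toy data, rows in level "a" get the median of 1, 3, 5 and rows in level "b" get the median of 10, 20. The real function may differ in which columns it summarises and how it names them.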
Value
A new data.table containing the new columns and, if keep_factor = TRUE, the original factor column.
Examples
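# Five numeric covariates plus one column of 10 randomly chosen letter levels
design_mat <- cbind(
  data.frame(matrix(rnorm(5 * 100), ncol = 5)),
  sample(sample(letters, 10), 100, replace = TRUE)
)
colnames(design_mat)[6] <- "factor_var"

# Encode by the factor column and drop it from the result
encode_median(X = design_mat, fact = "factor_var", keep_factor = FALSE)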
[Package categoryEncodings version 1.4.3 Index]