sparseR_prep {sparseR} | R Documentation |
Preprocess & create a model matrix with interactions + polynomials
Description
Preprocess & create a model matrix with interactions + polynomials
Usage
sparseR_prep(
  formula,
  data,
  k = 1,
  poly = 1,
  pre_proc_opts = c("knnImpute", "scale", "center", "otherbin", "none"),
  ia_formula = NULL,
  filter = c("nzv", "zv"),
  extra_opts = list(),
  family = "gaussian"
)
Arguments
formula | A formula of the main effects + outcome of the model |
data | A required data frame or tibble containing the variables in formula |
k | Maximum order of interactions to numeric variables |
poly | The maximum order of polynomials to consider |
pre_proc_opts | A character vector specifying methods for preprocessing (see details) |
ia_formula | Formula to be passed to step_interact (for interactions, see details) |
filter | Which methods should be used to filter out variables with (near) zero variance? (see details) |
extra_opts | Extra options to be used for preprocessing |
family | Family passed from sparseR |
Details
The pre_proc_opts argument acts as a wrapper for the corresponding procedures in the recipes package. The currently supported options that can be passed to pre_proc_opts are:
knnImpute: Should k-nearest-neighbors imputation be performed (if necessary)?
scale: Should variables be scaled prior to creating interactions (does not scale factor variables or dummy variables)?
center: Should variables be centered (will not center factor variables or dummy variables)?
otherbin:
ia_formula will by default interact all variables with each other up to order k. If specified, ia_formula is passed as the terms argument to recipes::step_interact, so the help documentation for that function can be consulted for further guidance on specifying particular interactions.
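As a rough illustration of what a terms formula for recipes::step_interact looks like, the sketch below calls step_interact directly on the built-in iris data rather than going through sparseR_prep. The choice of iris and of the Sepal.Length:Petal.Length interaction is an assumption made for the example, and the variable names available inside sparseR_prep at the interaction step may differ from the raw column names used here.

```r
# Minimal sketch: a one-sided formula passed as `terms` to step_interact.
# The iris data and the chosen variable pair are illustrative assumptions.
library(recipes)

rec <- recipe(Sepal.Width ~ ., data = iris)

# Request a single, specific two-way interaction between numeric predictors.
rec <- step_interact(rec, terms = ~ Sepal.Length:Petal.Length)

# Estimate the recipe and apply it to the same data.
prepped <- prep(rec, training = iris)
baked <- bake(prepped, new_data = iris)

# The interaction column is named with the default separator "_x_".
names(baked)
```

A formula of this one-sided form is what ia_formula is expected to carry when specific interactions are wanted instead of all interactions up to order k.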
The methods specified in filter are important: filtering is necessary to remove extraneous polynomials and interactions that do not make sense. This is the case, for instance, for polynomials of dummy variables, or for interactions between dummy variables that relate to the same categorical variable.
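The two cases named above can be checked with base R alone. This is only a small sketch on a hypothetical toy factor (the data frame toy and its column group are made up for illustration). It does not reproduce the filtering that sparseR_prep performs, only the reason such terms carry no information.

```r
# Hypothetical toy data: one three-level categorical variable.
toy <- data.frame(group = factor(c("a", "b", "c", "a", "b", "c")))

# Dummy-code the factor with base R and drop the intercept column.
dummies <- model.matrix(~ group, data = toy)[, -1, drop = FALSE]

# Case 1: squaring a 0/1 dummy variable returns the same column,
# so a polynomial term of a dummy variable is redundant.
squared <- dummies[, "groupb"]^2
identical(squared, dummies[, "groupb"])

# Case 2: two dummies from the same factor are never 1 in the same row,
# so their interaction is constant at zero (zero variance).
same_factor_ia <- dummies[, "groupb"] * dummies[, "groupc"]
var(same_factor_ia)
```

Columns like same_factor_ia are the kind a zero-variance filter is meant to catch.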
Value
An object of class recipe; see recipes::recipe()
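Examples
The call below is a minimal sketch of how the function might be used, based only on the signature shown under Usage. The iris data, the formula, and the choice of poly = 2 are assumptions for illustration, and all other arguments are left at their defaults.

```r
# Minimal usage sketch; requires the sparseR package to be installed.
library(sparseR)

# Main effects for all predictors of Sepal.Width, pairwise interactions
# (k = 1) and polynomial terms up to order 2 (poly = 2).
prep_obj <- sparseR_prep(Sepal.Width ~ ., data = iris, k = 1, poly = 2)

# The documented return value is an object of class "recipe".
inherits(prep_obj, "recipe")

# Printing the object summarizes the preprocessing steps it contains.
print(prep_obj)
```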