splitfactor {cobalt} | R Documentation |
Split and Unsplit Factors into Dummy Variables
Description
splitfactor() splits factor variables into dummy (0/1) variables. This can be useful when functions do not process factor variables well or require numeric matrices to operate. unsplitfactor() combines dummy variables into factor variables, undoing the operation of splitfactor().
Usage
splitfactor(
  data,
  var.name,
  drop.level = NULL,
  drop.first = TRUE,
  drop.singleton = FALSE,
  drop.na = TRUE,
  sep = "_",
  replace = TRUE,
  split.with = NULL,
  check = TRUE
)
unsplitfactor(
  data,
  var.name,
  dropped.level = NULL,
  dropped.na = TRUE,
  sep = "_",
  replace = TRUE
)
Arguments
data |
A |
var.name |
For |
drop.level |
The name of a level of |
drop.first |
Whether to drop the first dummy created for each factor. If |
drop.singleton |
Whether to drop a factor variable if it only has one level. |
drop.na |
If |
sep |
A character separating the stem from the value of the variable for each dummy. For example, for |
replace |
Whether to replace the original variable(s) with the new variable(s) ( |
split.with |
A list of vectors or factors with lengths equal to the number of columns of |
check |
Whether to make sure the variables specified in |
dropped.level |
The value of each original factor variable whose dummy was dropped when the variable was split. If left empty and a dummy was dropped, the resulting factor will have the value |
dropped.na |
If |
Details
If there are NAs in the variable to be split, the new variables created by splitfactor() will have NA where the original variable is NA.
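The NA behavior described above can be checked on a small data set. This is a minimal sketch, assuming the cobalt package is installed; the data frame d and its columns g and x are made up for illustration and are not part of the package.

```r
library(cobalt)

# Toy data (hypothetical): a 3-level factor with one missing value.
d <- data.frame(g = factor(c("a", "b", NA, "c")), x = 1:4)

# Keep every level's dummy so the NA row is easy to inspect.
d.split <- splitfactor(d, "g", drop.first = FALSE)

# Columns created from "g" (names start with the stem and the separator).
g.dummies <- d.split[startsWith(names(d.split), "g_")]

# Row 3 had NA in "g", so each dummy should be NA in that row.
is.na(g.dummies[3, ])
```

The example relies on the default drop.na = TRUE; other settings of drop.na are not shown here.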
When using unsplitfactor() on a data.frame that was generated with splitfactor(), the arguments dropped.na and sep are unnecessary.
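As a rough illustration of this point, the sketch below splits with a non-default separator and then recombines without passing sep or dropped.na. It assumes the cobalt package is installed and that the split data frame is passed to unsplitfactor() unmodified; the object name lalonde.back is chosen here for illustration.

```r
library(cobalt)
data("lalonde", package = "cobalt")

# Split with a non-default separator.
lalonde.split <- splitfactor(lalonde, "race", sep = ".")

# Neither sep nor dropped.na is passed; only the dropped level is
# supplied so that the reference category can be restored.
lalonde.back <- unsplitfactor(lalonde.split, "race",
                              dropped.level = "black")

names(lalonde.back)
```

If the dummies were created some other way than with splitfactor(), sep and dropped.na would likely need to be specified explicitly.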
If split.with is supplied, its elements will be split in the same way data is. For example, if data contained a 4-level factor that was to be split, the entries of split.with at the same index as the factor would be duplicated so that the resulting entries have the same length as the number of columns of data after being split. The resulting values are stored in the "split.with" attribute of the output object. See Examples.
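The duplication of split.with entries can be seen on a two-column data set. This is a minimal sketch, assuming the cobalt package is installed; the data frame d and the label vector are hypothetical and exist only for this illustration.

```r
library(cobalt)

# Toy data (hypothetical): one numeric column and one 3-level factor.
d <- data.frame(x = c(1.5, 2.5, 3.5),
                g = factor(c("a", "b", "c")))

# One label per column of d; "G" sits at the index of the factor.
labels <- list(c("X", "G"))

# With var.name omitted, the factor column in d is split.
d.split <- splitfactor(d, split.with = labels, drop.first = FALSE)

# The stored labels should now have one entry per column of d.split.
attr(d.split, "split.with")
```

Comparing the length of the stored vector with ncol(d.split) is a quick way to confirm that the labels still line up with the columns.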
Value
For splitfactor(), a data.frame containing the original data set with the newly created dummies. For unsplitfactor(), a data.frame containing the original data set with the newly created factor variables.
See Also
Examples
data("lalonde", package = "cobalt")
lalonde.split <- splitfactor(lalonde, "race",
                             replace = TRUE,
                             drop.first = TRUE)
# A data set with "race_hispan" and "race_white" instead
# of "race".
lalonde.unsplit <- unsplitfactor(lalonde.split, "race",
                                 replace = TRUE,
                                 dropped.level = "black")
all.equal(lalonde, lalonde.unsplit) # TRUE
# Demonstrating the use of split.with:
to.split <- list(letters[1:ncol(lalonde)],
                 1:ncol(lalonde))
lalonde.split <- splitfactor(lalonde, split.with = to.split,
                             drop.first = FALSE)
attr(lalonde.split, "split.with")