split_var {sjmisc} | R Documentation |
Split numeric variables into smaller groups
Description
Recode numeric variables into equal-sized groups, i.e. a variable is cut into a smaller number of groups at specific cut points. split_var_if() is a scoped variant of split_var(), where the transformation is applied only to those variables that match the logical condition of predicate.
Usage
split_var(
  x,
  ...,
  n,
  as.num = FALSE,
  val.labels = NULL,
  var.label = NULL,
  inclusive = FALSE,
  append = TRUE,
  suffix = "_g"
)
split_var_if(
  x,
  predicate,
  n,
  as.num = FALSE,
  val.labels = NULL,
  var.label = NULL,
  inclusive = FALSE,
  append = TRUE,
  suffix = "_g"
)
Arguments
x | A vector or data frame. |
... | Optional, unquoted names of variables that should be selected for further processing. Required, if |
n | The new number of groups that |
as.num | Logical, if |
val.labels | Optional character vector, to set value label attributes of recoded variable (see vignette Labelled Data and the sjlabelled-Package). If |
var.label | Optional string, to set variable label attribute for the returned variable (see vignette Labelled Data and the sjlabelled-Package). If |
inclusive | Logical; if |
append | Logical, if |
suffix | Indicates which suffix will be added to each dummy variable. Use |
predicate | A predicate function to be applied to the columns. The variables for which |
Details
split_var() splits a variable into equal-sized groups, where the number of groups depends on the n argument. Thus, this function cuts a variable into groups at the specified quantiles.
By contrast, group_var recodes a variable into groups that all cover the same value range (e.g., 1-5, 6-10, 11-15, etc.).
split_var() also works on grouped data frames (see group_by). In this case, splitting is applied to the subsets of variables in x. See 'Examples'.
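The difference between quantile-based and range-based grouping can be sketched in base R. This is only an illustration of the idea, not the implementation used by split_var() or group_var(); the data and the object names (x, q_breaks, by_quantile, by_range) are made up for the example.

```r
# Hypothetical data with one extreme value (not taken from efc)
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 100)

# Cut points at quantiles (here: minimum, median, maximum),
# which yields groups of similar size
q_breaks <- stats::quantile(x, probs = seq(0, 1, length.out = 3), na.rm = TRUE)
by_quantile <- cut(x, breaks = q_breaks, include.lowest = TRUE)

# Two intervals of equal width over the range of x,
# which can yield groups of very different size
by_range <- cut(x, breaks = 2, include.lowest = TRUE)

table(by_quantile)
table(by_range)
```

With these values, the quantile-based cut puts five observations in each group, while the equal-width cut puts nine observations in the first interval and only the extreme value in the second.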
Value
A grouped variable with equal-sized groups. If x is a data frame and append = TRUE, x is returned with the grouped variables added as new columns; if append = FALSE, only the grouped variables are returned. If append = TRUE and suffix = "", recoded variables replace (overwrite) the existing variables.
Note
If a vector has only a few unique values, splitting into equal-sized groups may fail. In this case, use the inclusive argument to shift a value at the cut point into the lower, preceding group to get equal-sized groups. See 'Examples'.
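Why values at the cut point matter can be seen in a small base-R sketch. It only demonstrates how tied values on a cut point fall into the upper or the lower group depending on which side of the interval is closed; it is an assumption-laden illustration and not a description of how the inclusive argument is implemented. The data and object names are hypothetical.

```r
# Hypothetical data with few unique values; the median (2) is itself a data value
x <- c(1, 1, 2, 2, 2, 3, 4)
breaks <- c(min(x), stats::median(x), max(x))

# Values equal to the cut point go to the upper group: [1,2) and [2,4]
upper <- cut(x, breaks = breaks, right = FALSE, include.lowest = TRUE)

# Values equal to the cut point go to the lower group: [1,2] and (2,4]
lower <- cut(x, breaks = breaks, right = TRUE, include.lowest = TRUE)

table(upper)
table(lower)
```

Because three observations sit exactly on the cut point, moving them between groups changes the group sizes from 2 and 5 to 5 and 2; with so few unique values, neither choice gives an exactly even split.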
See Also
group_var to group variables into equal-ranged groups, or rec to recode variables.
Examples
data(efc)
# non-grouped
table(efc$neg_c_7)
# split into 3 groups
table(split_var(efc$neg_c_7, n = 3))
# split multiple variables into 3 groups
split_var(efc, neg_c_7, pos_v_4, e17age, n = 3, append = FALSE)
frq(split_var(efc, neg_c_7, pos_v_4, e17age, n = 3, append = FALSE))
# original
table(efc$e42dep)
# two groups, non-inclusive cut-point
# vector split leads to unequal group sizes
table(split_var(efc$e42dep, n = 2))
# two groups, inclusive cut-point
# group sizes are equal
table(split_var(efc$e42dep, n = 2, inclusive = TRUE))
# Unlike dplyr's ntile(), split_var() never splits a value
# into two different categories, i.e. you always get a clean
# separation of original categories
library(dplyr)
x <- dplyr::ntile(efc$neg_c_7, n = 3)
table(efc$neg_c_7, x)
x <- split_var(efc$neg_c_7, n = 3)
table(efc$neg_c_7, x)
# also works with grouped data frames
mtcars %>%
  split_var(disp, n = 3, append = FALSE) %>%
  table()
mtcars %>%
  group_by(cyl) %>%
  split_var(disp, n = 3, append = FALSE) %>%
  table()