concat.split {splitstackshape} | R Documentation
Split Concatenated Cells in a Dataset
Description
The concat.split function takes a column with multiple values, splits the values into a list or into separate columns, and returns a new data.frame or data.table.
Usage
concat.split(data, split.col, sep = ",", structure = "compact",
             mode = NULL, type = NULL, drop = FALSE, fixed = FALSE,
             fill = NA, ...)
Arguments
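data | The source data.frame or data.table.
split.col | The variable that needs to be split; can be specified either by the column number or the variable name.
sep | The character separating each value (defaults to ",").
structure | Can be either "compact", "expanded", or "list". Defaults to "compact". See Details.
mode | Can be either "binary" or "value".
type | Can be either "numeric" or "character".
drop | Logical (whether to remove the original variable from the output or not). Defaults to FALSE.
fixed | Is the input for the sep value fixed, or a regular expression? Defaults to FALSE. See Details.
fill | The "fill" value for missing values when structure = "expanded". Defaults to NA.
... | Additional arguments to cSplit().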
Details
structure
- "compact" creates as many columns as the maximum length of the resulting split. This is the most useful general-case application of this function.
- When the input is numeric, "expanded" creates as many columns as the maximum value of the input data. This is most useful when converting to mode = "binary".
- "list" creates a single new column that is structurally a list within a data.frame or data.table.
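The three shapes described above can be sketched in base R. This is only an illustration of the resulting column layouts, not the package's implementation; the input vector x and all object names are hypothetical.

```r
# Base-R sketch of the three output shapes; this is NOT the package's code.
x <- c("1,3", "2", "1,2,4")  # hypothetical concatenated column
parts <- lapply(strsplit(x, ",", fixed = TRUE), as.numeric)

# "compact": one column per position, up to the longest split (3 values here)
n_compact <- max(lengths(parts))
compact <- t(sapply(parts, function(p) p[seq_len(n_compact)]))

# "expanded" in binary form: one column per possible value,
# up to the maximum value in the data (4 here); 1 marks presence, NA absence
n_expanded <- max(unlist(parts))
expanded <- t(sapply(parts, function(p) ifelse(seq_len(n_expanded) %in% p, 1, NA)))

# "list": a single list column holding each row's split values
as_list <- data.frame(id = seq_along(x))
as_list$x_list <- parts

dim(compact)
dim(expanded)
str(as_list)
```

With this input, the compact form is driven by the longest row, while the expanded form is driven by the largest value, which is why the two can differ in width.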
fixed
When structure = "expanded" or structure = "list", it is possible to supply a regular expression containing the characters to split on. For example, to split on any of ",", ";", or "|", set sep = ",|;|\\|" or sep = "[,;|]" together with fixed = FALSE. (The backslash is doubled because it sits inside an R string.)
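The two patterns can be compared with base R's strsplit(). This sketch assumes that sep and fixed are interpreted the same way as the split and fixed arguments of strsplit(); the input vector is hypothetical.

```r
# Hypothetical values using three different separators
x <- c("a,b;c", "d|e", "f")

# Alternation: the pipe must be escaped, and the backslash doubled in an R string
by_alternation <- strsplit(x, ",|;|\\|", fixed = FALSE)

# Character class: inside [], the pipe is a literal character
by_class <- strsplit(x, "[,;|]", fixed = FALSE)

# Both patterns split on any of the three separators
identical(by_alternation, by_class)

# With fixed = TRUE the pattern is a literal string, so nothing is split here
literal <- strsplit(x, "[,;|]", fixed = TRUE)
lengths(literal)
```

The character-class form is usually easier to read because no escaping is needed.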
Note
This is more of a "legacy" or "convenience" wrapper function encompassing the features available in the separate functions cSplit(), cSplit_l(), and cSplit_e().
Author(s)
Ananda Mahto
See Also
cSplit(), cSplit_l(), cSplit_e()
Examples
## Load the package and some data
library(splitstackshape)
temp <- head(concat.test)
# Split up the second column, selecting by column number
concat.split(temp, 2)
# ... or by name, and drop the original column
concat.split(temp, "Likes", drop = TRUE)
# The "Hates" column uses a different separator
concat.split(temp, "Hates", sep = ";", drop = TRUE)
## Not run:
# You'll get a warning here, when trying to retain the original values
concat.split(temp, 2, mode = "value", drop = TRUE)
## End(Not run)
# Try again. Notice the differing number of resulting columns
concat.split(temp, 2, structure = "expanded",
             mode = "value", type = "numeric", drop = TRUE)
# Let's try splitting some strings... Same syntax
concat.split(temp, 3, drop = TRUE)
# Strings can also be split to binary representations
concat.split(temp, 3, structure = "expanded",
             type = "character", fill = 0, drop = TRUE)
# Split up the "Likes" column into a list variable; retain original column
head(concat.split(concat.test, 2, structure = "list", drop = FALSE))
# View the structure of the output to verify
# that the new column is a list; note the
# difference between "Likes" and "Likes_list".
str(concat.split(temp, 2, structure = "list", drop = FALSE))