| concat.split {splitstackshape} | R Documentation |
Split Concatenated Cells in a Dataset
Description
The concat.split function takes a column with multiple values, splits
the values into a list or into separate columns, and returns a new
data.frame or data.table.
Usage
concat.split(data, split.col, sep = ",", structure = "compact",
  mode = NULL, type = NULL, drop = FALSE, fixed = FALSE,
  fill = NA, ...)
Arguments
data |
The source data.frame or data.table. |
split.col |
The variable that needs to be split; can be specified either by the column number or the variable name. |
sep |
The character separating each value (defaults to ","). |
structure |
Can be either "compact", "expanded", or "list". Defaults to "compact". See Details. |
mode |
Can be either "binary" or "value". |
type |
Can be either "numeric" or "character". |
drop |
Logical (whether to remove the original variable from the output
or not). Defaults to FALSE. |
fixed |
Is the input for the sep value fixed, or a regular expression? See Details. |
fill |
The "fill" value for missing values when structure = "expanded". Defaults to NA. |
... |
Additional arguments to cSplit(). |
Details
structure
- "compact" creates as many columns as the maximum length of the resulting split. This is the most useful general-case application of this function.
- When the input is numeric, "expanded" creates as many columns as the maximum value of the input data. This is most useful when converting to mode = "binary".
- "list" creates a single new column that is structurally a list within a data.frame or data.table.
fixed
When structure = "expanded" or structure = "list", it is possible to supply a regular expression containing the characters to split on. For example, to split on ",", ";", or "|", you can set sep = ",|;|\\|" or sep = "[,;|]", and fixed = FALSE to split on any of those characters.
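The two Details entries can be illustrated with a small base R sketch. It does not use splitstackshape and does not reproduce its implementation; the input vector x and every object name are hypothetical, and NA is used for absent values only to mirror the fill = NA default shown under Usage.

```r
# Base R sketch only; it does not reproduce splitstackshape's internals.
x <- c("1,2;4", "2", "3|5")              # hypothetical column, mixed separators

# Regular-expression separators (the fixed = FALSE case): both patterns
# match ",", ";", or "|". In R source the backslash must be doubled.
by_alternation <- strsplit(x, ",|;|\\|")
by_char_class  <- strsplit(x, "[,;|]")
identical(by_alternation, by_char_class)  # TRUE

# structure = "list": one vector per row
as_list <- lapply(by_char_class, as.numeric)

# structure = "compact": as many columns as the longest split, NA-padded
width <- max(lengths(as_list))
compact <- t(vapply(as_list,
                    function(v) c(v, rep(NA_real_, width - length(v))),
                    numeric(width)))

# structure = "expanded", binary-style: as many columns as the maximum value;
# column j holds 1 when value j occurs in the row and NA otherwise
top <- max(unlist(as_list))
expanded <- t(vapply(as_list, function(v) {
  out <- rep(NA_real_, top)
  out[v] <- 1
  out
}, numeric(top)))

dim(compact)   # 3 3
dim(expanded)  # 3 5
```

The longest row has three values, so the compact layout needs three columns, while the largest value is 5, so the expanded layout needs five.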
Note
This is more of a "legacy" or "convenience" wrapper function encompassing
the features available in the separate functions cSplit(), cSplit_l(),
and cSplit_e().
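As a rough sketch of what calling the separate functions directly might look like, the snippet below passes the data and the column name positionally and leaves every other argument at its default. It assumes splitstackshape is installed and that those defaults suit the "Likes" column of concat.test; which function corresponds to which structure value is inferred from the function names and is not confirmed by this page.

```r
# Sketch only: runs when splitstackshape is available, otherwise does nothing.
if (requireNamespace("splitstackshape", quietly = TRUE)) {
  library(splitstackshape)
  temp <- head(concat.test)

  wide     <- cSplit(temp, "Likes")    # separate columns, one per split value
  as_list  <- cSplit_l(temp, "Likes")  # adds a list column
  expanded <- cSplit_e(temp, "Likes")  # one column per possible value

  str(as_list$Likes_list)
}
```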
Author(s)
Ananda Mahto
See Also
cSplit(), cSplit_l(), cSplit_e()
Examples
## Load some data
temp <- head(concat.test)
# Split up the second column, selecting by column number
concat.split(temp, 2)
# ... or by name, and drop the original concatenated column
concat.split(temp, "Likes", drop = TRUE)
# The "Hates" column uses a different separator
concat.split(temp, "Hates", sep = ";", drop = TRUE)
## Not run:
# You'll get a warning here, when trying to retain the original values
concat.split(temp, 2, mode = "value", drop = TRUE)
## End(Not run)
# Try again. Notice the differing number of resulting columns
concat.split(temp, 2, structure = "expanded",
             mode = "value", type = "numeric", drop = TRUE)
# Let's try splitting some strings... Same syntax
concat.split(temp, 3, drop = TRUE)
# Strings can also be split to binary representations
concat.split(temp, 3, structure = "expanded",
             type = "character", fill = 0, drop = TRUE)
# Split up the "Likes" column into a list variable; retain original column
head(concat.split(concat.test, 2, structure = "list", drop = FALSE))
# View the structure of the output to verify
# that the new column is a list; note the
# difference between "Likes" and "Likes_list".
str(concat.split(temp, 2, structure = "list", drop = FALSE))