split {base} | R Documentation |

## Divide into Groups and Reassemble

### Description

`split` divides the data in the vector `x` into the groups defined by `f`. The replacement forms replace values corresponding to such a division. `unsplit` reverses the effect of `split`.
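As a minimal sketch of the three operations described above, the following uses a small hypothetical vector `x` and factor `f` (both invented for illustration) to split, reassemble, and replace by group.

```r
# Hypothetical toy data: five values and a grouping factor of the same length.
x <- c(10, 20, 30, 40, 50)
f <- factor(c("a", "b", "a", "b", "a"))

parts <- split(x, f)        # list with components "a" and "b"
back  <- unsplit(parts, f)  # values returned to their original positions

# Replacement form: overwrite the values of each group in place.
# The list elements are matched to the groups by position (level order).
y <- x
split(y, f) <- list(a = c(1, 2, 3), b = c(-1, -2))
```

Here `back` should hold the same values as `x`, and `y` keeps the original ordering of positions while taking its values from the replacement list.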

### Usage

```
split(x, f, drop = FALSE, ...)
## Default S3 method:
split(x, f, drop = FALSE, sep = ".", lex.order = FALSE, ...)
split(x, f, drop = FALSE, ...) <- value
unsplit(value, f, drop = FALSE)
```

### Arguments

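| Argument | Description |
|---|---|
| `x` | vector or data frame containing values to be divided into groups. |
| `f` | a ‘factor’ in the sense that `as.factor(f)` defines the grouping, or a list of such factors, in which case their interaction is used for the grouping. If `x` is a data frame, `f` can also be a formula of the form `~ g` to split by the variable `g`. |
| `drop` | logical indicating if levels that do not occur should be dropped (if `f` is a `factor` or a list). |
| `value` | a list of vectors or data frames compatible with a splitting of `x`. |
| `sep` | character string, passed to `interaction` in the case where `f` is a `list`. |
| `lex.order` | logical, passed to `interaction` when `f` is a list. |
| `...` | further potential arguments passed to methods. |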

### Details

`split` and `split<-` are generic functions with default and `data.frame` methods. The data frame method can also be used to split a matrix into a list of matrices, and the replacement form likewise, provided they are invoked explicitly.
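To illustrate what "invoked explicitly" means for matrices, here is a small sketch that calls the `data.frame` methods by their full names. The matrix `m` and grouping vector `g` are hypothetical.

```r
# Hypothetical 4 x 3 integer matrix and a row-grouping vector.
m <- matrix(1:12, nrow = 4, dimnames = list(NULL, c("u", "v", "w")))
g <- c("p", "q", "p", "q")

# Calling the method directly splits by rows and keeps matrix structure.
by_row <- split.data.frame(m, g)

# Generic dispatch on a matrix uses the default method instead,
# which splits the underlying vector (recycling g) into plain vectors.
by_elem <- split(m, g)

# Replacement form, also invoked explicitly: write scaled blocks back.
scaled <- lapply(by_row, function(b) b * 10L)
m2 <- `split<-.data.frame`(m, g, value = scaled)
```

With these inputs, `by_row$p` should be a 2 x 3 matrix holding rows 1 and 3 of `m`, and `m2` should match `m * 10L`.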

`unsplit` works with lists of vectors or data frames (assumed to have compatible structure, as if created by `split`). It puts elements or rows back in the positions given by `f`. In the data frame case, row names are obtained by unsplitting the row name vectors from the elements of `value`.
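The row-name behaviour can be checked with a short sketch. The data frame `df` below is made up for illustration and is given explicit row names so the round trip is easy to follow.

```r
# Hypothetical data frame with explicit row names.
df <- data.frame(id = 1:4,
                 grp = c("x", "y", "x", "y"),
                 row.names = c("r1", "r2", "r3", "r4"))

pieces   <- split(df, df$grp)       # list of two data frames
restored <- unsplit(pieces, df$grp) # rows placed back by group membership

rownames(restored)  # row names rebuilt from the pieces
```

Because the rows go back to the positions given by `f`, `restored` should list the rows in the same order as `df`, not grouped by `grp`.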

`f` is recycled as necessary, and if the length of `x` is not a multiple of the length of `f`, a warning is printed.
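A brief sketch of the recycling rule, using made-up inputs: a two-element `f` is reused along `x`, and a length mismatch is caught here with `tryCatch` so the warning can be observed programmatically.

```r
# f has length 2 and is recycled along the six values of x.
res <- split(1:6, c("odd", "even"))

# Length 5 is not a multiple of 2, so split() signals a warning.
warned <- tryCatch({
  split(1:5, c("odd", "even"))
  FALSE
}, warning = function(w) TRUE)
```

Note that the components of `res` are ordered by the sorted factor levels, so `"even"` comes before `"odd"`.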

Any missing values in `f` are dropped together with the corresponding values of `x`.
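For instance, in the following sketch (with hypothetical data) the second and fifth elements of `x` have no group and so appear in no component of the result.

```r
x <- 1:5
f <- c("a", NA, "b", "a", NA)

s <- split(x, f)      # only the elements with a non-missing group are kept
sum(lengths(s))       # fewer elements than length(x)
```

If the values with a missing group need to be kept, one option is to recode the `NA`s as an explicit level before splitting, for example with `addNA(f)`.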

The default method calls `interaction` when `f` is a `list`. If the levels of the factors contain ‘.’, the factors may not be split as expected unless `sep` is set to a string not present in the factor `levels`.
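The sketch below shows one way this can go wrong, using two hypothetical grouping vectors whose levels contain ‘.’. With the default separator, the pairs (`"a.b"`, `"c"`) and (`"a"`, `"b.c"`) both paste to the label `"a.b.c"`, so distinct combinations end up in the same group.

```r
x  <- 1:4
f1 <- c("a.b", "a",   "a.b", "a")
f2 <- c("c",   "b.c", "c",   "b.c")

# Default sep = ".": the two distinct combinations share one label.
s_default <- split(x, list(f1, f2))

# A separator that does not occur in the levels keeps them apart.
s_sep <- split(x, list(f1, f2), sep = "_", drop = TRUE)
names(s_sep)
```

Choosing `sep = "_"` is only an illustration; any string that cannot appear in the levels serves the same purpose.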

### Value

The value returned from `split` is a list of vectors containing the values for the groups. The components of the list are named by the levels of `f` (after converting to a factor, or if already a factor and `drop = TRUE`, dropping unused levels).
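The effect of `drop` on the component names can be seen in a short sketch with a hypothetical factor that has one unused level.

```r
# "mid" is a declared level that never occurs in the data.
f <- factor(c("lo", "hi", "lo"), levels = c("lo", "mid", "hi"))
x <- c(1.5, 2.5, 3.5)

keep_all <- split(x, f)               # includes an empty "mid" component
dropped  <- split(x, f, drop = TRUE)  # unused level removed

names(keep_all)
names(dropped)
```

Keeping empty components can be useful when results from several splits must line up by level. Dropping them avoids zero-length inputs in later `lapply` or `sapply` calls.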

The replacement forms return their right hand side. `unsplit` returns a vector or data frame for which `split(x, f)` equals `value`.

### References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988)
*The New S Language*.
Wadsworth & Brooks/Cole.

### See Also

`cut` to categorize numeric values.

`strsplit` to split strings.

### Examples

```
require(stats); require(graphics)
n <- 10; nn <- 100
g <- factor(round(n * runif(n * nn)))
x <- rnorm(n * nn) + sqrt(as.numeric(g))
xg <- split(x, g)
boxplot(xg, col = "lavender", notch = TRUE, varwidth = TRUE)
sapply(xg, length)
sapply(xg, mean)
### Calculate 'z-scores' by group (standardize to mean zero, variance one)
z <- unsplit(lapply(split(x, g), scale), g)
# or
zz <- x
split(zz, g) <- lapply(split(x, g), scale)
# and check that the within-group std dev is indeed one
tapply(z, g, sd)
tapply(zz, g, sd)
### data frame variation
## Notice that assignment form is not used since a variable is being added
g <- airquality$Month
l <- split(airquality, g)
## Alternative using a formula
identical(l, split(airquality, ~ Month))
l <- lapply(l, transform, Oz.Z = scale(Ozone))
aq2 <- unsplit(l, g)
head(aq2)
with(aq2, tapply(Oz.Z, Month, sd, na.rm = TRUE))
### Split a matrix into a list by columns
ma <- cbind(x = 1:10, y = (-4:5)^2)
split(ma, col(ma))
split(1:10, 1:2)
```

[Package *base* version 4.4.0]