by {base}    R Documentation

Apply a Function to a Data Frame Split by Factors


Description

Function by is an object-oriented wrapper for tapply applied to data frames.


Usage

by(data, INDICES, FUN, ..., simplify = TRUE)



Arguments

data
an R object, normally a data frame, possibly a matrix.

INDICES
a factor or a list of factors, each of length nrow(data). For the data frame method, INDICES can also be a formula, as in the f argument of the split method for data frames.

FUN
a function to be applied to (usually data-frame) subsets of data.

...
further arguments to FUN.

simplify
logical: see tapply.
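
The forms of INDICES described above can be compared side by side. The following is a minimal sketch using the built-in warpbreaks data; the object names and the group-mean function are illustrative only, and the formula form assumes an R version that supports it (such as the 4.4.1 documented here).

```r
## Illustrative only: three ways to supply INDICES for the same data frame.
grp_mean <- function(d) mean(d$breaks)   # hypothetical helper: mean breaks per subset

## (1) a single factor
by_factor <- by(warpbreaks, warpbreaks$tension, grp_mean)

## (2) a named list of factors, each of length nrow(warpbreaks)
by_list <- by(warpbreaks,
              list(wool = warpbreaks$wool, tension = warpbreaks$tension),
              grp_mean)

## (3) a formula, evaluated in the data frame (data frame method only)
by_formula <- by(warpbreaks, ~ wool + tension, grp_mean)

dim(by_factor)    # one cell per tension level
dim(by_list)      # wool levels by tension levels
dim(by_formula)   # same layout as the list form
```

The list and formula forms should give the same cells here; the formula is mainly a convenience that avoids repeating the data frame name.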


Details

A data frame is split by row into data frames subsetted by the values of one or more factors, and function FUN is applied to each subset in turn.

For the default method, an object with dimensions (e.g., a matrix) is coerced to a data frame and the data frame method is applied. Other objects are also coerced to a data frame, but FUN is applied separately to (subsets of) each column of the data frame.
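
The split-then-apply behaviour for data frames can be imitated by hand, which may help when reasoning about what FUN receives. This is a rough sketch with illustrative object names, not a description of how by is implemented internally.

```r
## Manual imitation of by() on a data frame: split by rows, then apply FUN.
pieces <- split(warpbreaks, warpbreaks$tension)        # list of data frames, one per level
manual <- sapply(pieces, function(d) mean(d$breaks))   # FUN sees a whole data-frame subset

via_by <- by(warpbreaks, warpbreaks$tension, function(d) mean(d$breaks))

## The numbers agree; by() additionally returns them as an object of class "by".
all.equal(unname(manual), as.vector(unclass(via_by)))
```

The practical difference is in the return value: the manual version gives a plain named vector, while by keeps the grouping structure and has its own print method.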


Value

An object of class "by", giving the results for each subset. This is always a list if simplify is false, otherwise a list or array (see tapply).
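
The effect of simplify on the returned object can be checked directly. A small sketch, assuming a FUN that returns a single number per subset (the case in which tapply can simplify to an array); the object names are illustrative.

```r
## FUN returns one number per group, so simplification is possible.
res_array <- by(warpbreaks, warpbreaks$tension,
                function(d) mean(d$breaks))
res_list  <- by(warpbreaks, warpbreaks$tension,
                function(d) mean(d$breaks), simplify = FALSE)

class(res_array)                 # "by" in both cases
is.numeric(unclass(res_array))   # simplified to a numeric array
is.list(res_list)                # simplify = FALSE keeps a list
```

When FUN returns something that is not a scalar (a fitted model, say), the result stays a list regardless of simplify.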

See Also

tapply, simplify2array; array2DF to convert the result to a data frame; ave, which also applies a function block-wise.


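Examples

## summarize breaks and wool within each tension level
by(warpbreaks[, 1:2], warpbreaks[,"tension"], summary)
## summarize breaks for each combination of wool and tension
by(warpbreaks[, 1],   warpbreaks[, -1],       summary)
## fit a separate linear model within each tension level
by(warpbreaks, warpbreaks[,"tension"],
   function(x) lm(breaks ~ wool, data = x))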

## now suppose we want to extract the coefficients by group
tmp1 <- with(warpbreaks,
            by(warpbreaks, tension,
               function(x) lm(breaks ~ wool, data = x)))
sapply(tmp1, coef)

## another way
tmp2 <- by(warpbreaks, ~ tension,
           with, coef(lm(breaks ~ wool)))
array2DF(tmp2, simplify = TRUE)

[Package base version 4.4.1]