daply {plyr} | R Documentation |
Split data frame, apply function, and return results in an array.
Description
For each subset of a data frame, apply a function, then combine the results into
an array. daply with a function that operates column-wise is similar to aggregate.
To apply a function for each row, use aaply with .margins set to 1.
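The behaviour described above can be illustrated with a minimal sketch. The data frame `df`, its columns `g`, `h`, `v`, and the matrix `m` are hypothetical toy inputs invented for this illustration; they are not part of the plyr documentation.

```r
library(plyr)

# Hypothetical toy data, not part of the original documentation
df <- data.frame(
  g = c("a", "a", "b", "b"),
  h = c("x", "y", "x", "y"),
  v = c(1, 2, 3, 4)
)

# One splitting variable: one result per distinct value of g
by_g <- daply(df, .(g), function(piece) sum(piece$v))

# Two splitting variables: one array dimension per variable
by_gh <- daply(df, .(g, h), function(piece) sum(piece$v))

# Row-wise application on a matrix uses aaply with .margins = 1
m <- matrix(1:6, nrow = 2)
row_sums <- aaply(m, 1, sum)
```

Here `by_g` is indexed by the values of `g`, while `by_gh` is a matrix indexed by `g` along the rows and `h` along the columns.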
Usage
daply(
.data,
.variables,
.fun = NULL,
...,
.progress = "none",
.inform = FALSE,
.drop_i = TRUE,
.drop_o = TRUE,
.parallel = FALSE,
.paropts = NULL
)
Arguments
.data |
data frame to be processed |
.variables |
variables to split data frame by, as quoted variables, a formula or character vector |
.fun |
function to apply to each piece |
... |
other arguments passed on to .fun |
.progress |
name of the progress bar to use, see create_progress_bar |
.inform |
produce informative error messages? This is turned off by default because it substantially slows processing speed, but is very useful for debugging |
.drop_i |
should combinations of variables that do not appear in the input data be preserved (FALSE) or dropped (TRUE, default) |
.drop_o |
should extra dimensions of length 1 in the output be dropped, simplifying the output? Defaults to TRUE |
.parallel |
if TRUE, apply function in parallel, using the parallel backend provided by foreach |
.paropts |
a list of additional options passed into the foreach function when parallel computation is enabled |
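As a hedged illustration of the two drop arguments, the sketch below uses a hypothetical data frame `df` in which one combination of the splitting variables never occurs. The data and variable names are assumptions made for this example only.

```r
library(plyr)

# Hypothetical toy data: the combination g = "b", h = "y" never occurs
df <- data.frame(
  g = c("a", "a", "b"),
  h = c("x", "y", "x"),
  v = c(1, 2, 3)
)

# Default (.drop_i = TRUE): the missing combination is never passed to .fun,
# so its cell in the output array is NA
counts_default <- daply(df, .(g, h), nrow)

# .drop_i = FALSE: .fun also receives an empty piece for the missing combination
counts_all <- daply(df, .(g, h), nrow, .drop_i = FALSE)

# .drop_o controls whether output dimensions of length 1 are dropped
dropped <- daply(df, .(g), nrow)                   # no dim attribute
kept    <- daply(df, .(g), nrow, .drop_o = FALSE)  # 2 x 1 array
```

Setting `.drop_i = FALSE` is mainly useful when an explicit result, such as a zero count, is wanted for combinations that are absent from the data.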
Value
If the results are atomic with the same type and dimensionality, a vector, matrix or array; otherwise, a list-array (a list with dimensions).
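The two kinds of return value can be sketched as follows. The data frame `df` and the anonymous functions are hypothetical and chosen only to produce one atomic and one non-atomic result per piece.

```r
library(plyr)

# Hypothetical toy data with groups of different sizes
df <- data.frame(
  g = c("a", "a", "a", "b", "b"),
  v = c(1, 2, 3, 4, 5)
)

# Atomic results of equal length: combined into a matrix (groups x statistics)
rng <- daply(df, .(g), function(piece) c(min = min(piece$v), max = max(piece$v)))

# Non-atomic results: each piece returns a list, so the combined output is a list
vals <- daply(df, .(g), function(piece) list(piece$v))
```

In this sketch `rng` can be indexed like an ordinary matrix, for example `rng["a", "max"]`, whereas the elements of `vals` have to be extracted with list indexing.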
Input
This function splits data frames by variables.
Output
If there are no results, then this function will return a vector of
length 0 (vector()).
References
Hadley Wickham (2011). The Split-Apply-Combine Strategy for Data Analysis. Journal of Statistical Software, 40(1), 1-29. https://www.jstatsoft.org/v40/i01/.
See Also
Other array output: aaply(), laply(), maply()
Other data frame input: d_ply(), ddply(), dlply()
Examples
daply(baseball, .(year), nrow)
# Several different ways of summarising by variables that should not be
# included in the summary
daply(baseball[, c(2, 6:9)], .(year), colwise(mean))
daply(baseball[, 6:9], .(baseball$year), colwise(mean))
daply(baseball, .(year), function(df) colwise(mean)(df[, 6:9]))