frameApply {gdata} | R Documentation |
Subset analysis on data frames
Description
Apply a function to row subsets of a data frame.
Usage
frameApply(x, by=NULL, on=by[1], fun=function(xi) c(Count=nrow(xi)),
subset=TRUE, simplify=TRUE, byvar.sep="\\$\\@\\$", ...)
Arguments
x |
a data frame |
by |
names of columns in |
on |
names of columns in |
fun |
a function that can operate on data frames that are row
subsets of |
subset |
logical vector (can be specified in terms of variables
in data). This row subset of |
simplify |
logical. If TRUE (the default), return value will
be a data frame including the |
byvar.sep |
character. This can be any character string not
found anywhere in the values of the |
... |
additional arguments to |
Details
This function accomplishes something similar to by. The main difference
is that frameApply is designed to return data frames and lists instead
of objects of class 'by'. Also, frameApply works only on the unique
combinations of the by variables that are actually present in the data,
not on the entire Cartesian product of the by variables. In some cases
this results in great gains in efficiency, although frameApply is
hardly an efficient function.
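The difference between the full Cartesian product and the combinations actually present can be seen with base R alone. The sketch below uses a small made-up data frame (the names d, g1, g2 and v are illustrative assumptions, not part of gdata) and does not call frameApply itself; it only contrasts the number of cells that by() builds with the number of grouping combinations that occur in the data.

```r
# Hypothetical toy data: two grouping columns, where only 3 of the
# 4 possible (g1, g2) combinations actually occur.
d <- data.frame(g1 = c("a", "a", "b"),
                g2 = c("x", "y", "x"),
                v  = c(1, 2, 3))

# by() allocates one cell per combination of the grouping levels,
# including the ("b", "y") combination that has no rows.
res_by <- by(d, d[c("g1", "g2")], function(xi) nrow(xi))
n_cells <- length(res_by)

# The unique combinations present in the data, which is the set the
# Details text says frameApply works on.
present <- unique(d[c("g1", "g2")])
n_present <- nrow(present)

cat("cells built by by():", n_cells, "\n")
cat("combinations present:", n_present, "\n")
```

With many grouping variables and sparse data, the gap between the two counts can become large, which is presumably where the efficiency gain mentioned above comes from.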
Value
A data frame if simplify = TRUE (the default), assuming there is
sufficiently structured output from fun. If simplify = FALSE and by is
not NULL, the return value will be a list with two elements. The first
element, named "by", will be a data frame with the unique rows of
x[by], and the second element, named "result", will be a list where the
ith component gives the result for the ith row of the "by" element.
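To make the shape of the simplify = FALSE return value concrete, here is a minimal base R sketch that builds a list with the same two elements by hand. It is an illustration of the structure described above, not frameApply's actual implementation; the data frame d and its columns g and v are made up for the example.

```r
# Hypothetical toy data, not taken from the gdata documentation.
d <- data.frame(g = c("a", "a", "b"), v = c(1, 2, 3))

# "by" element: a data frame holding the unique rows of x[by].
by_df <- unique(d["g"])

# "result" element: a list whose ith component is the result of the
# function for the rows matching the ith row of by_df.
result <- lapply(seq_len(nrow(by_df)), function(i) {
  xi <- d[d$g == by_df$g[i], , drop = FALSE]  # row subset for group i
  c(Count = nrow(xi))                         # mirrors the default fun
})

out <- list(by = by_df, result = result)

print(out$by)           # one row per group present in the data
print(out$result[[2]])  # result for the second row of out$by
```

Keeping the grouping values and the results in parallel structures lets the function return arbitrary objects per group, which a single data frame could not hold.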
Author(s)
Jim Rogers james.a.rogers@pfizer.com
Examples
data(ELISA, package="gtools")
# Default is slightly unintuitive, but commonly useful:
frameApply(ELISA, by = c("PlateDay", "Read"))
# Wouldn't actually recommend this model! Just a demo:
frameApply(ELISA, on = c("Signal", "Concentration"), by = c("PlateDay", "Read"),
           fun = function(dat) coef(lm(Signal ~ Concentration, data = dat)))
# Summary statistics of Signal within each Concentration level:
frameApply(ELISA, on = "Signal", by = "Concentration",
           fun = function(dat) {
             x <- dat[[1]]
             out <- c(Mean = mean(x, na.rm=TRUE),
                      SD = sd(x, na.rm=TRUE),
                      N = sum(!is.na(x)))},  # count of non-missing values
           subset = !is.na(Concentration))