| agg_dfm {quest} | R Documentation |
Data Information by Group
Description
agg_dfm evaluates a function on a set of variables in a data.frame
separately for each group and combines the results. The rep and
rtn.grp arguments determine how the results are combined. If
rep = TRUE, then the result of fun is repeated for every row of the
group in data[grp.nm]; if rep = FALSE, then the result of fun for
each unique combination of data[grp.nm] is returned once. If
rtn.grp = TRUE, then the results are returned in a data.frame where
the first columns are the groups from data[grp.nm]; if rtn.grp =
FALSE, then the results are returned in an atomic vector. Note that
agg_dfm evaluates fun on all the variables in data[vrb.nm] as a
whole. If you instead want to evaluate fun separately for each
variable vrb.nm in data, then use Agg.
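To make the distinction concrete, the following base R sketch contrasts a function that needs the whole data.frame of variables for each group (here, a correlation between two columns) with a function applied to each variable on its own. It does not call agg_dfm or Agg; split, sapply, and aggregate stand in for them purely for illustration, and the object names whole and each are made up for this sketch.

```r
# Illustration only: base R stand-ins, not the internals of agg_dfm or Agg.
dat <- airquality[c("Ozone", "Solar.R")]

# "As a whole": the function receives both columns of each group at once,
# so it can compute something that involves several variables together.
groups <- split(dat, airquality$Month)
whole <- sapply(groups, function(d) cor(d, use = "complete.obs")[1, 2])

# "Separately for each variable": the function receives one column at a time,
# so there is one result per group for every variable.
each <- aggregate(x = dat, by = airquality["Month"], FUN = mean, na.rm = TRUE)

whole  # one correlation per month
each   # one mean per month for Ozone and for Solar.R
```

A function such as a correlation cannot be expressed in the second form, which is the situation the whole-data.frame form is suited to.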
Usage
agg_dfm(
data,
vrb.nm,
grp.nm,
rep = FALSE,
rtn.grp = !rep,
sep = ".",
rtn.result.nm = "result",
fun,
...
)
Arguments
data |
data.frame of data. |
vrb.nm |
character vector of colnames from data specifying the variables. |
grp.nm |
character vector of colnames from data specifying the groups. |
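rep |
logical vector of length 1 specifying whether the result of
fun should be repeated for every row of the group in data[grp.nm]. |
rtn.grp |
logical vector of length 1 specifying whether the group
columns (i.e., data[grp.nm]) should be included in the return object. |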
sep |
character vector of length 1 specifying the string to paste the
group values together with when there are multiple grouping variables
(i.e., length(grp.nm) > 1). |
rtn.result.nm |
character vector of length 1 specifying the name for the
column of results in the return object. Only used if rtn.grp = TRUE. |
fun |
function to evaluate on each grouping of data[vrb.nm]. |
... |
additional named arguments to fun. |
Details
If rep = TRUE, then agg_dfm calls ave_dfm; if rep
= FALSE, then agg_dfm calls by. When rep = FALSE and
rtn.grp = TRUE, agg_dfm is very similar to plyr::ddply;
when rep = FALSE and rtn.grp = FALSE, then agg_dfm is
very similar to plyr::daply.
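The difference between the two settings of rep can be approximated in base R. The sketch below only illustrates the shape of the output, assuming groups are labelled by pasting the group values together with "."; it is not the internal code of agg_dfm, ave_dfm, or by, and the names per_group and per_row are hypothetical.

```r
# Illustration only: base R approximation of the two rep settings.
vrb <- mtcars[c("mpg", "cyl", "disp")]
grp <- interaction(mtcars[c("vs", "am")], sep = ".")

# rep = FALSE flavour: one result per group
per_group <- sapply(split(vrb, grp), nrow)

# rep = TRUE flavour: the group's result repeated for every row of that group,
# in the original row order
per_row <- per_group[as.character(grp)]
names(per_row) <- row.names(mtcars)

per_group
head(per_row)
```

Indexing the per-group results by each row's group label is what turns a group-level summary into a row-level one; the values themselves do not change.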
Value
result of fun applied to each grouping of
data[vrb.nm]. The structure of the return object depends on the
arguments rep and rtn.grp.
- If rep = TRUE and rtn.grp = TRUE:
then the return object is a data.frame with nrow = nrow(data) where the first columns are data[grp.nm] and the last column is the result of fun with colname = rtn.result.nm.
- If rep = TRUE and rtn.grp = FALSE:
then the return object is an atomic vector with length = nrow(data) where the values are the result of fun and the names = row.names(data).
- If rep = FALSE and rtn.grp = TRUE:
then the return object is a data.frame with nrow = length(levels(interaction(data[grp.nm]))) where the first columns are the unique group combinations in data[grp.nm] and the last column is the result of fun with colname = rtn.result.nm.
- If rep = FALSE and rtn.grp = FALSE:
then the return object is an atomic vector with length = length(levels(interaction(data[grp.nm]))) where the values are the result of fun and the names are each group value pasted together by sep if there are multiple grouping variables (i.e., length(grp.nm) > 1).
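As a rough check of the two rep = FALSE shapes described above, the following base R sketch builds a data.frame of group combinations with a trailing result column, and a named vector whose names paste the group values together. It uses aggregate and paste rather than agg_dfm, so it is an approximation under those assumptions; the column name "result" mirrors the default of rtn.result.nm, and "_" is an arbitrary choice of sep.

```r
# Illustration only: base R approximation of the rep = FALSE return shapes.
grp.nm <- c("vs", "am")

# rtn.grp = TRUE flavour: group columns first, result column last
counts <- aggregate(x = data.frame(result = rep(1L, nrow(mtcars))),
                    by = mtcars[grp.nm], FUN = sum)

# rtn.grp = FALSE flavour: atomic vector named by the pasted group values
flat <- setNames(counts$result, paste(counts$vs, counts$am, sep = "_"))

# number of group combinations, as in the length formula above
n_groups <- length(levels(interaction(mtcars[grp.nm])))

counts
flat
n_groups
```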
See Also
Examples
### one grouping variable
## by in base R
by(data = airquality[c("Ozone","Solar.R")], INDICES = airquality["Month"],
simplify = FALSE, FUN = function(dat) cor(dat, use = "complete")[1,2])
## rep = TRUE
# rtn.grp = TRUE
agg_dfm(data = airquality, vrb.nm = c("Ozone","Solar.R"), grp.nm = "Month",
rep = TRUE, rtn.grp = TRUE, fun = function(dat) cor(dat, use = "complete")[1,2])
# rtn.grp = FALSE
agg_dfm(data = airquality, vrb.nm = c("Ozone","Solar.R"), grp.nm = "Month",
rep = TRUE, rtn.grp = FALSE, fun = function(dat) cor(dat, use = "complete")[1,2])
## rep = FALSE
# rtn.grp = TRUE
agg_dfm(data = airquality, vrb.nm = c("Ozone","Solar.R"), grp.nm = "Month",
rep = FALSE, rtn.grp = TRUE, fun = function(dat) cor(dat, use = "complete")[1,2])
suppressWarnings(plyr::ddply(.data = airquality[c("Ozone","Solar.R","Month")],
.variables = "Month", .fun = function(dat) cor(dat, use = "complete")[1,2]))
# rtn.grp = FALSE
agg_dfm(data = airquality, vrb.nm = c("Ozone","Solar.R"), grp.nm = "Month",
rep = FALSE, rtn.grp = FALSE, fun = function(dat) cor(dat, use = "complete")[1,2])
suppressWarnings(plyr::daply(.data = airquality[c("Ozone","Solar.R","Month")],
.variables = "Month", .fun = function(dat) cor(dat, use = "complete")[1,2]))
### two grouping variables
## by in base R
by(data = mtcars[c("mpg","cyl","disp")], INDICES = mtcars[c("vs","am")],
FUN = nrow, simplify = FALSE) # with multiple group columns
## rep = TRUE
# rtn.grp = TRUE
agg_dfm(data = mtcars, vrb.nm = c("mpg","cyl","disp"), grp.nm = c("vs","am"),
rep = TRUE, rtn.grp = TRUE, fun = nrow)
# rtn.grp = FALSE
agg_dfm(data = mtcars, vrb.nm = c("mpg","cyl","disp"), grp.nm = c("vs","am"),
rep = TRUE, rtn.grp = FALSE, fun = nrow)
## rep = FALSE
# rtn.grp = TRUE
agg_dfm(data = mtcars, vrb.nm = c("mpg","cyl","disp"), grp.nm = c("vs","am"),
rep = FALSE, rtn.grp = TRUE, fun = nrow)
agg_dfm(data = mtcars, vrb.nm = c("mpg","cyl","disp"), grp.nm = c("vs","am"),
rep = FALSE, rtn.grp = TRUE, rtn.result.nm = "value", fun = nrow)
# rtn.grp = FALSE
agg_dfm(data = mtcars, vrb.nm = c("mpg","cyl","disp"), grp.nm = c("vs","am"),
rep = FALSE, rtn.grp = FALSE, fun = nrow)
agg_dfm(data = mtcars, vrb.nm = c("mpg","cyl","disp"), grp.nm = c("vs","am"),
rep = FALSE, rtn.grp = FALSE, sep = "_", fun = nrow)