agg_dfm {quest} | R Documentation |
Data Information by Group
Description
agg_dfm evaluates a function on a set of variables in a data.frame
separately for each group and combines the results back together. The rep
and rtn.grp arguments determine exactly how the results are combined. If
rep = TRUE, then the result of fun is repeated for every row of the group
in data[grp.nm]; if rep = FALSE, then the result of fun for each unique
combination of data[grp.nm] is returned once. If rtn.grp = TRUE, then the
results are returned in a data.frame where the first columns are the groups
from data[grp.nm]; if rtn.grp = FALSE, then the results are returned in an
atomic vector. Note that agg_dfm evaluates fun on all the variables in
data[vrb.nm] as a whole. If you instead want to evaluate fun separately for
each variable vrb.nm in data, then use Agg.
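The difference between rep = TRUE and rep = FALSE can be sketched in base R without quest. This is only an illustration of the semantics described above, not the package's implementation; the object names once and repeated are hypothetical.

```r
# Illustrative sketch in base R; not how agg_dfm is implemented internally.
dat <- airquality[c("Ozone", "Solar.R")]
grp <- airquality$Month

# fun receives all variables of one group at once (as a data.frame)
fun <- function(d) cor(d, use = "complete.obs")[1, 2]

# rep = FALSE analogue: one value per group, named by group value
once <- sapply(split(dat, grp), fun)

# rep = TRUE analogue: each group's value repeated for every row of that group
repeated <- unname(once[as.character(grp)])

length(once)      # number of groups
length(repeated)  # same as nrow(airquality)
```

Because fun is handed the whole data.frame for a group, it can compute quantities that involve several variables together, such as the correlation used here.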
Usage
agg_dfm(
data,
vrb.nm,
grp.nm,
rep = FALSE,
rtn.grp = !rep,
sep = ".",
rtn.result.nm = "result",
fun,
...
)
Arguments
data |
data.frame of data. |
vrb.nm |
character vector of colnames from data specifying the variables that fun is evaluated on. |
grp.nm |
character vector of colnames from data specifying the grouping variables. |
rep |
logical vector of length 1 specifying whether the result of fun should be repeated for every row of the group in data[grp.nm]. |
rtn.grp |
logical vector of length 1 specifying whether the group columns (i.e., data[grp.nm]) should be included in the return object. |
sep |
character vector of length 1 specifying the string to paste the group values together with when there are multiple grouping variables (i.e., length(grp.nm) > 1). |
rtn.result.nm |
character vector of length 1 specifying the name for the column of results in the return object. Only used if rtn.grp = TRUE. |
fun |
function to evaluate each grouping of data[vrb.nm] by data[grp.nm]. |
... |
additional named arguments to fun. |
Details
If rep = TRUE, then agg_dfm calls ave_dfm; if rep = FALSE, then agg_dfm
calls by. When rep = FALSE and rtn.grp = TRUE, agg_dfm is very similar to
plyr::ddply; when rep = FALSE and rtn.grp = FALSE, agg_dfm is very similar
to plyr::daply.
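As a rough picture of the rep = FALSE path, the following base R sketch takes the output of by and reshapes it into the two return structures. The conversion steps and the object names (grp_values, values, as_vector, as_frame) are assumptions made for illustration; they are not taken from the quest source.

```r
# Illustrative sketch; the conversion steps are assumptions, not quest's code.
res <- by(data = airquality[c("Ozone", "Solar.R")],
          INDICES = airquality["Month"],
          FUN = function(dat) cor(dat, use = "complete.obs")[1, 2],
          simplify = FALSE)

# group values and the per-group results pulled out of the "by" object
grp_values <- dimnames(res)[[1]]
values <- vapply(seq_along(res), function(i) res[[i]], numeric(1))

# rtn.grp = FALSE analogue: named atomic vector
as_vector <- setNames(values, grp_values)

# rtn.grp = TRUE analogue: group column first, result column last
as_frame <- data.frame(Month = grp_values, result = values)
```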
Value
Result of fun applied to each grouping of data[vrb.nm]. The structure of
the return object depends on the arguments rep and rtn.grp.
- If rep = TRUE and rtn.grp = TRUE:
the return object is a data.frame with nrow = nrow(data), where the first columns are data[grp.nm] and the last column is the result of fun with colname = rtn.result.nm.
- If rep = TRUE and rtn.grp = FALSE:
the return object is an atomic vector with length = nrow(data), where the values are the result of fun and the names = row.names(data).
- If rep = FALSE and rtn.grp = TRUE:
the return object is a data.frame with nrow = length(levels(interaction(data[grp.nm]))), where the first columns are the unique group combinations in data[grp.nm] and the last column is the result of fun with colname = rtn.result.nm.
- If rep = FALSE and rtn.grp = FALSE:
the return object is an atomic vector with length = length(levels(interaction(data[grp.nm]))), where the values are the result of fun and the names are each group value pasted together by sep if there are multiple grouping variables (i.e., length(grp.nm) > 1).
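The group count and the pasted group names described above can be inspected directly in base R. The sketch below is an illustration only: it uses interaction with a sep argument to mimic the naming, and the ordering of names returned by agg_dfm itself may differ. The object names combo, n_grp, and counts are hypothetical.

```r
# Illustrative sketch in base R; assumes nothing about quest's internals.
grp <- mtcars[c("vs", "am")]

# one factor whose levels are the group values pasted together with sep
combo <- interaction(grp, sep = "_")
levels(combo)

# number of group combinations: length(levels(interaction(data[grp.nm])))
n_grp <- length(levels(combo))

# rep = FALSE, rtn.grp = FALSE analogue: named atomic vector, one value per group
counts <- sapply(split(mtcars[c("mpg", "cyl", "disp")], combo), nrow)
counts
```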
See Also
Examples
### one grouping variable
## by in base R
by(data = airquality[c("Ozone","Solar.R")], INDICES = airquality["Month"],
   simplify = FALSE, FUN = function(dat) cor(dat, use = "complete")[1,2])
## rep = TRUE
# rtn.grp = TRUE
agg_dfm(data = airquality, vrb.nm = c("Ozone","Solar.R"), grp.nm = "Month",
   rep = TRUE, rtn.grp = TRUE, fun = function(dat) cor(dat, use = "complete")[1,2])
# rtn.grp = FALSE
agg_dfm(data = airquality, vrb.nm = c("Ozone","Solar.R"), grp.nm = "Month",
   rep = TRUE, rtn.grp = FALSE, fun = function(dat) cor(dat, use = "complete")[1,2])
## rep = FALSE
# rtn.grp = TRUE
agg_dfm(data = airquality, vrb.nm = c("Ozone","Solar.R"), grp.nm = "Month",
   rep = FALSE, rtn.grp = TRUE, fun = function(dat) cor(dat, use = "complete")[1,2])
suppressWarnings(plyr::ddply(.data = airquality[c("Ozone","Solar.R","Month")],
   .variables = "Month", .fun = function(dat) cor(dat, use = "complete")[1,2]))
# rtn.grp = FALSE
agg_dfm(data = airquality, vrb.nm = c("Ozone","Solar.R"), grp.nm = "Month",
   rep = FALSE, rtn.grp = FALSE, fun = function(dat) cor(dat, use = "complete")[1,2])
suppressWarnings(plyr::daply(.data = airquality[c("Ozone","Solar.R","Month")],
   .variables = "Month", .fun = function(dat) cor(dat, use = "complete")[1,2]))
### two grouping variables
## by in base R
by(data = mtcars[c("mpg","cyl","disp")], INDICES = mtcars[c("vs","am")],
   FUN = nrow, simplify = FALSE) # with multiple group columns
## rep = TRUE
# rtn.grp = TRUE
agg_dfm(data = mtcars, vrb.nm = c("mpg","cyl","disp"), grp.nm = c("vs","am"),
   rep = TRUE, rtn.grp = TRUE, fun = nrow)
# rtn.grp = FALSE
agg_dfm(data = mtcars, vrb.nm = c("mpg","cyl","disp"), grp.nm = c("vs","am"),
   rep = TRUE, rtn.grp = FALSE, fun = nrow)
## rep = FALSE
# rtn.grp = TRUE
agg_dfm(data = mtcars, vrb.nm = c("mpg","cyl","disp"), grp.nm = c("vs","am"),
   rep = FALSE, rtn.grp = TRUE, fun = nrow)
agg_dfm(data = mtcars, vrb.nm = c("mpg","cyl","disp"), grp.nm = c("vs","am"),
   rep = FALSE, rtn.grp = TRUE, rtn.result.nm = "value", fun = nrow)
# rtn.grp = FALSE
agg_dfm(data = mtcars, vrb.nm = c("mpg","cyl","disp"), grp.nm = c("vs","am"),
   rep = FALSE, rtn.grp = FALSE, fun = nrow)
agg_dfm(data = mtcars, vrb.nm = c("mpg","cyl","disp"), grp.nm = c("vs","am"),
   rep = FALSE, rtn.grp = FALSE, sep = "_", fun = nrow)