groupBy {caroline}    R Documentation

Group a data frame by a factor and perform aggregate functions.

Description

The R equivalent of a SQL 'group by' call.
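
As a rough point of comparison, a query such as SELECT w, SUM(a) FROM df GROUP BY w can be sketched with base R's aggregate. This sketch uses only base R rather than groupBy itself, and the data frame and its values are illustrative assumptions.

```r
# Illustrative data (hypothetical values, not taken from the package)
df <- data.frame(a = c(1, 2, 3, 4, 5, 6),
                 w = c("t", "u", "t", "u", "t", "u"))

# Base R analogue of: SELECT w, SUM(a) AS a FROM df GROUP BY w
res <- aggregate(a ~ w, data = df, FUN = sum)
print(res)
```

The result has one row per distinct value of w, which is the shape a 'group by' call is expected to return.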

Usage

groupBy(df, by, aggregation,  clmns=names(df), collapse=',',
                distinct=FALSE, sql=FALSE, full.names=FALSE, ...)

Arguments

df

a data frame.

by

the factor (or name of a factor in df) used to determine the grouping.

aggregation

the aggregate functions to apply to the columns listed in clmns (default is to sum). Suggested functions are: 'sum', 'mean', 'var', 'sd', 'max', 'min', 'length', 'paste', NULL.

clmns

the columns to include in the output.

collapse

string delimiter for columns aggregated via 'paste' concatenation.

distinct

used in conjunction with paste and collapse to return only the unique elements in a delimited, concatenated string.

sql

whether or not to use SQLite to perform the grouping (not yet implemented).

full.names

whether the names of the aggregation functions should be appended to the output column names.

...

additional parameters (such as na.rm) passed to the underlying aggregate functions.
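
The collapse, distinct and ... arguments correspond to familiar base R behaviour. The sketch below shows that behaviour with tapply; it is an assumption about how such aggregations are commonly done, not the package's actual implementation, and the data are made up for illustration.

```r
# Hypothetical data: a grouping column (w), a label column (z),
# and a numeric column containing one NA (b)
df <- data.frame(w = c("t", "t", "t", "u", "u"),
                 z = c("m", "n", "m", "o", "o"),
                 b = c(1, NA, 3, 4, 6),
                 stringsAsFactors = FALSE)

# 'paste' aggregation with collapse = ',': concatenate every value per group
pasted <- tapply(df$z, df$w, paste, collapse = ",")

# distinct = TRUE analogue: drop duplicates before concatenating
pasted_distinct <- tapply(df$z, df$w,
                          function(x) paste(unique(x), collapse = ","))

# na.rm = TRUE forwarded to the aggregate function, as '...' would do
means <- tapply(df$b, df$w, mean, na.rm = TRUE)

print(pasted)
print(pasted_distinct)
print(means)
```

Without na.rm = TRUE, the mean for any group containing an NA would itself be NA, which is why the Examples pass it through.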

Value

a summary/aggregate data frame.

See Also

aggregate, bestBy

Examples

df <- data.frame(a=runif(12),b=c(runif(11),NA), 
                 z=rep(letters[13:18],2),w=rep(letters[20:23],3))

groupBy(df=df, by='w', clmns=c(rep(c('a','b'),3),'z','w'), 
 aggregation=c('sum','mean','var','sd','min','max','paste','length'), 
 full.names=TRUE, na.rm=TRUE)
# or using SQLite (note: the sql argument is documented as not yet implemented)
groupBy(df=df, by='w', clmns=c(rep(c('a','b'),2),'z','w'), 
        aggregation=c('sum','mean','min','max','paste','length'), 
        full.names=TRUE, sql=TRUE)


## passing a custom function: the mean of the n largest values in x
meantop <- function(x, n=2, ...)
  mean(x[order(x, decreasing=TRUE)][1:n], ...)
  
groupBy(df, by='w', aggregation=rep(c('mean','max','meantop'),2), 
                    clmns=rep(c('a','b'),3), na.rm=TRUE)
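
To see what the custom meantop aggregate computes on its own, it can be applied to a plain vector. This is a small sketch with made-up values. Note that order() places NA values last by default, so they are only selected when fewer than n non-missing values are available.

```r
# Same definition as in the example above
meantop <- function(x, n = 2, ...)
  mean(x[order(x, decreasing = TRUE)][1:n], ...)

# Hypothetical input vector with one missing value
x <- c(0.2, 0.9, NA, 0.5)

top2 <- meantop(x, na.rm = TRUE)         # mean of 0.9 and 0.5
top3 <- meantop(x, n = 3, na.rm = TRUE)  # mean of 0.9, 0.5 and 0.2

print(top2)
print(top3)
```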


[Package caroline version 0.9.2 Index]