bestBy {caroline} | R Documentation |
Find the "best" record within subgroups of a dataframe.
Description
Finding an extreme record for each group within a dataset is a routine but surprisingly challenging task in both R and SQL. This function provides an easy interface to that functionality, using either R (fast for small data frames) or SQL (fastest for large data).
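To make "extreme record per group" concrete, here is a minimal base-R sketch of one common way to do it: sort on the column of interest, then keep the first row seen for each group. The helper name `best_by_sketch`, its `decreasing` argument, and the toy data are all hypothetical and used only for illustration; this is not claimed to be how bestBy is implemented internally.

```r
# Hypothetical helper illustrating the sort-then-deduplicate idea.
# This is an assumption for illustration, not the package's actual source.
best_by_sketch <- function(df, by, best, decreasing = FALSE) {
  # Sort the whole data frame on the 'best' column
  sorted <- df[order(df[[best]], decreasing = decreasing), , drop = FALSE]
  # Keep only the first row encountered for each group,
  # which after sorting is that group's extreme record
  sorted[!duplicated(sorted[[by]]), , drop = FALSE]
}

# Toy data: two groups, two records each
toy <- data.frame(grp = c("a", "b", "a", "b"), val = c(3, 1, 7, 5))

# Largest 'val' within each 'grp'
top <- best_by_sketch(toy, by = "grp", best = "val", decreasing = TRUE)
top
```

Because the data frame is sorted once and then deduplicated, the surviving rows keep all of their original columns, which is what distinguishes this task from a plain per-group maximum.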
Usage
bestBy(df, by, best, clmns=names(df), inverse=FALSE, sql=FALSE)
Arguments
df |
a data frame. |
by |
the factor (or name of a factor in df) used to determine the grouping. |
clmns |
the columns to include in the output. |
best |
the column to sort on (both globally and within each subgroup). |
inverse |
logical; whether to invert the sorting order of the column specified by 'best'. |
sql |
whether or not to use SQLite to perform the operation. |
Value
A data frame of 'best' records from each factor level
Author(s)
David Schruth
See Also
Examples
## toy BLAST-style results: several scored hits per query
blast.results <- data.frame(score=c(1,2,34,4,5,3,23),
                            query=c('z','x','y','z','x','y','z'),
                            target=c('a','b','c','d','e','f','g')
                            )
best.hits.R <- bestBy(blast.results, by='query', best='score', inverse=TRUE)
best.hits.R
## or using SQLite
best.hits.sql <- bestBy(blast.results, by='query', best='score', inverse=TRUE, sql=TRUE)
best.hits.sql
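As a sanity check that does not depend on the package, the per-query maxima of the example data can be computed with base R's `tapply`. If `inverse=TRUE` selects the highest-scoring hit per query (an assumption; this page does not say which direction `inverse` sorts), the score column of the results above should agree with these values.

```r
# Same example data as above, repeated so this block runs on its own
blast.results <- data.frame(score=c(1,2,34,4,5,3,23),
                            query=c('z','x','y','z','x','y','z'),
                            target=c('a','b','c','d','e','f','g'))

# Highest score observed within each query group (base R only)
max.by.query <- tapply(blast.results$score, blast.results$query, max)
max.by.query
```

Note that `tapply` returns only the extreme values, not the full records, so it is useful for checking a result but is not a replacement for selecting whole rows.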
[Package caroline version 0.9.2 Index]