dbApply {RMySQL} | R Documentation |
Apply R/S-Plus functions to remote groups of DBMS rows (experimental)
Description
Applies R/S-Plus functions to groups of remote DBMS rows without bringing an entire result set all at once. The result set is expected to be sorted by the grouping field.
Usage
dbApply(res, ...)
## S4 method for signature 'MySQLResult'
dbApply(
res,
INDEX,
FUN = stop("must specify FUN"),
begin = NULL,
group.begin = NULL,
new.record = NULL,
end = NULL,
batchSize = 100,
maxBatch = 1e+06,
...,
simplify = TRUE
)
Arguments
res |
a result set (see |
... |
any additional arguments to be passed to |
INDEX |
a character or integer specifying the field name or field number that defines the various groups. |
FUN |
a function to be invoked upon identifying the last row from every
group. This function will be passed a data frame holding the records of the
current group, a character string with the group label, plus any other
arguments passed to |
begin |
a function of no arguments to be invoked just prior to retrieve the first row from the result set. |
group.begin |
a function of one argument (the group label) to be invoked upon identifying a row from a new group |
new.record |
a function to be invoked as each individual record is fetched. The first argument to this function is a one-row data.frame holding the new record. |
end |
a function of no arguments to be invoked just after retrieving the last row from the result set. |
batchSize |
the default number of rows to bring from the remote result
set. If needed, this is automatically extended to hold groups bigger than
|
maxBatch |
the absolute maximum of rows per group that may be extracted from the result set. |
simplify |
Not yet implemented |
Details
This function is meant to handle somewhat gracefully(?) large
amounts of data from the DBMS by bringing into R manageable chunks (about
batchSize
records at a time, but not more than maxBatch
); the
idea is that the data from individual groups can be handled by R, but not
all the groups at the same time.
Value
A list with as many elements as there were groups in the result set.
Examples
if (mysqlHasDefault()) {
con <- dbConnect(RMySQL::MySQL(), dbname = "test")
dbWriteTable(con, "mtcars", mtcars, overwrite = TRUE)
res <- dbSendQuery(con, "SELECT * FROM mtcars ORDER BY cyl")
dbApply(res, "cyl", function(x, grp) quantile(x$mpg, names=FALSE))
dbClearResult(res)
dbRemoveTable(con, "mtcars")
dbDisconnect(con)
}