read.jdbc.ffdf {ETLUtils}    R Documentation
Read data from a JDBC connection into an ffdf.
Description
Read data from a JDBC connection into an ffdf. This can, for example, be used to import
large datasets from Oracle, SQLite, MySQL, PostgreSQL, Hive or other SQL databases into R.
Usage
read.jdbc.ffdf(
query = NULL,
dbConnect.args = list(drv = NULL, dbname = NULL, username = "", password = ""),
dbSendQuery.args = list(),
dbFetch.args = list(),
x = NULL,
nrows = -1,
first.rows = NULL,
next.rows = NULL,
levels = NULL,
appendLevels = TRUE,
asffdf_args = list(),
BATCHBYTES = getOption("ffbatchbytes"),
VERBOSE = FALSE,
colClasses = NULL,
transFUN = NULL,
...
)
Arguments
query
the SQL query to execute on the JDBC connection
dbConnect.args
a list of arguments to pass to RJDBC's dbConnect
dbSendQuery.args
a list containing database-specific parameters which will be passed to RJDBC's dbSendQuery
dbFetch.args
a list containing optional database-specific parameters which will be passed to RJDBC's dbFetch
x
NULL or an optional ffdf object to which the read records are appended. See the documentation of read.table.ffdf for more details and the example below.
nrows
number of rows to read from the query result set. The default value of -1 reads in all rows.
first.rows
chunk size (in rows) to read for the first chunk from the query result set
next.rows
chunk size (in rows) to read sequentially for subsequent chunks from the query result set. Currently, this must be specified.
levels
optional specification of factor levels: a list whose names are the column names of the data.frame fetched in the first.rows chunk, each element containing the levels of that factor (see the sketch after this argument list).
appendLevels
logical. A vector of permissions to expand levels for factor columns. See the documentation of read.table.ffdf for more details.
asffdf_args
further arguments passed to as.ffdf when the first chunk is converted to an ffdf
BATCHBYTES
integer: the maximum number of bytes allowed for the data.frame storing the result of reading one chunk. See the documentation of read.table.ffdf for more details.
VERBOSE
logical: TRUE to print verbose timings for each processed chunk (default FALSE).
colClasses
see the documentation of read.table.ffdf
transFUN
function applied to the data.frame after each chunk is retrieved by dbFetch
...
optional parameters passed on to transFUN
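As a hedged illustration of how the argument lists and the levels specification fit together (the jar path, connection URL, table and column names below are placeholders, not values shipped with this package):
require(RJDBC)
drv <- JDBC(driverClass = "org.sqlite.JDBC",
            classPath = "/path/to/sqlite-jdbc.jar")              # placeholder jar path
x <- read.jdbc.ffdf(
  query          = "select id, category, amount from mytable",   # placeholder query
  dbConnect.args = list(drv = drv, url = "jdbc:sqlite:/path/to/db.sqlite3"),
  dbFetch.args   = list(),                                       # extra driver-specific fetch options, if any
  levels         = list(category = c("A", "B", "C")),            # pre-declare factor levels per column
  first.rows     = 1000,
  next.rows      = 50000)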
Details
Opens up the JDBC connection using RJDBC::dbConnect, sends the query using RJDBC::dbSendQuery and fetches the results with RJDBC::dbFetch in batches of next.rows rows. Heavily borrowed from read.table.ffdf.
Value
An ffdf object, unless the query returns zero records, in which case the function returns the data.frame produced by RJDBC::dbFetch (possibly transformed by transFUN).
See Also
read.table.ffdf
Examples
## Not run:
require(ff)
##
## Example query using data in sqlite
##
require(RJDBC)
dbfile <- system.file("smalldb.sqlite3", package="ETLUtils")
drv <- JDBC(driverClass = "org.sqlite.JDBC", classPath = "/usr/local/lib/sqlite-jdbc-3.7.2.jar")
query <- "select * from testdata limit 10000"
x <- read.jdbc.ffdf(query = query,
dbConnect.args = list(drv = drv, url = sprintf("jdbc:sqlite:%s", dbfile)),
first.rows = 100, next.rows = 1000, VERBOSE=TRUE)
class(x)
x[1:10, ]
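## The following lines are an added, hedged illustration (not part of the
## original example): they re-run the query while appending to the existing
## ffdf 'x' via the 'x' argument and apply a simple transFUN to each chunk.
x <- read.jdbc.ffdf(query = query,
  dbConnect.args = list(drv = drv, url = sprintf("jdbc:sqlite:%s", dbfile)),
  x = x,                                   # append to the ffdf created above
  transFUN = function(d) d,                # replace with chunk-wise recoding/subsetting as needed
  first.rows = 100, next.rows = 1000, VERBOSE = TRUE)
dim(x)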
## End(Not run)