src_impala {implyr} | R Documentation |
Connect to Impala and create a remote dplyr data source
Description
src_impala creates a SQL backend to dplyr for Apache Impala, the massively parallel processing query engine for Apache Hadoop.
src_impala can work with any DBI-compatible interface that provides connectivity to Impala. Currently, two packages that can provide this connectivity are odbc and RJDBC.
Usage
src_impala(drv, ..., auto_disconnect = TRUE)
Arguments
drv |
an object that inherits from |
... |
arguments passed to the underlying Impala database connection method |
auto_disconnect |
Should the connection to Impala be automatically closed when the object returned by this function is deleted? Pass |
Value
An object with class src_impala, src_sql, src
See Also
Impala ODBC driver, Impala JDBC driver
Examples
# Using ODBC connectivity:
## Not run:
library(implyr)
library(odbc)
drv <- odbc::odbc()
impala <- src_impala(
  drv = drv,
  driver = "Cloudera ODBC Driver for Impala",
  host = "host",
  port = 21050,
  database = "default",
  uid = "username",
  pwd = "password"
)
## End(Not run)
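The auto_disconnect argument described above can be turned off when the connection should outlive the returned object. The following is a minimal sketch, not run here, that reuses the placeholder ODBC settings from the example above and assumes the connection is then closed explicitly with dbDisconnect:

```r
library(implyr)
library(odbc)

# Placeholder connection settings, as in the ODBC example above.
# auto_disconnect = FALSE keeps the connection open until it is closed explicitly.
impala <- src_impala(
  drv = odbc::odbc(),
  driver = "Cloudera ODBC Driver for Impala",
  host = "host",
  port = 21050,
  database = "default",
  uid = "username",
  pwd = "password",
  auto_disconnect = FALSE
)

# ... run queries against `impala` here ...

# Close the connection once the work is done.
dbDisconnect(impala)
```

Because auto_disconnect follows ... in the signature, it has to be passed by name.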
# Using JDBC connectivity:
## Not run:
library(implyr)
library(RJDBC)
Sys.setenv(JAVA_HOME = "/path/to/java/home/")
# Collect every JAR file shipped with the Impala JDBC driver
impala_classpath <- list.files(
  path = "/path/to/jdbc/driver",
  pattern = "\\.jar$",
  full.names = TRUE
)
# Start the JVM with the driver JARs on the classpath
.jinit(classpath = impala_classpath)
drv <- JDBC(
  driverClass = "com.cloudera.impala.jdbc41.Driver",
  classPath = impala_classpath,
  identifier.quote = "`"
)
# Positional arguments: JDBC URL, username, password
impala <- src_impala(
  drv,
  "jdbc:impala://host:21050",
  "username",
  "password"
)
## End(Not run)
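Once a connection exists, the returned object can be used like other remote dplyr sources. The sketch below is not run here and rests on assumptions: `impala` is a src_impala object created by one of the examples above, and the table name "flights" with columns carrier and arr_delay is hypothetical and should be replaced with a table from your own database.

```r
library(implyr)
library(dplyr)

# Assumes `impala` was created with src_impala() as in the examples above.

# List the tables visible in the current database.
src_tbls(impala)

# "flights" is a hypothetical table name used only for illustration.
flights_tbl <- tbl(impala, "flights")

# The pipeline is translated to SQL and executed by Impala;
# collect() brings the (small) result into a local data frame.
delay_summary <- flights_tbl %>%
  filter(!is.na(arr_delay)) %>%
  group_by(carrier) %>%
  summarise(mean_delay = mean(arr_delay, na.rm = TRUE)) %>%
  arrange(desc(mean_delay)) %>%
  collect()
```

Deferring collect() until after the aggregation keeps the heavy work on the Impala side and transfers only the summarised rows to R.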