hive {hive} | R Documentation |
Hadoop Interactive Framework Control
Description
High-level functions to control the Hadoop framework.
Usage
hive( new )
.hinit( hadoop_home )
hive_start( henv = hive() )
hive_stop( henv = hive() )
hive_is_available( henv = hive() )
Arguments
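hadoop_home |
A character string pointing to the local Hadoop installation. If not given, the location is looked up as described in Details (the HADOOP_HOME environment variable or, for CDH3, /etc/hadoop). |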
henv |
An object containing the local Hadoop configuration. |
new |
An object specifying the Hadoop environment. |
Details
The function hive() is used to get/set the Hadoop cluster object. This object consists of an environment holding information about the Hadoop cluster.
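The get/set behaviour described above can be illustrated with a small closure. This is only a sketch in base R, not the package's implementation: make_cluster_accessor(), cluster() and the stored hadoop_home entry are hypothetical names chosen for the example.

```r
# Hypothetical stand-in for the get/set behaviour of hive();
# this is NOT the code used by the package.
make_cluster_accessor <- function() {
  current <- NULL                       # the currently stored cluster object
  function(new) {
    if (missing(new)) {
      return(current)                   # get: called without an argument
    }
    stopifnot(is.environment(new))      # the cluster object is an environment
    current <<- new                     # set: remember the new object
    invisible(current)
  }
}

cluster <- make_cluster_accessor()

# A toy cluster object: an environment holding configuration values.
h <- new.env()
assign("hadoop_home", "/etc/hadoop", envir = h)
class(h) <- "hive"

cluster(h)                                   # set
get("hadoop_home", envir = cluster())        # get, then read a stored value
```

Calling the accessor without an argument returns whatever was stored last, which mirrors how hive() is used both as the default for henv and as a setter in the Examples.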
The function .hinit() is used to initialize a Hadoop cluster. It retrieves most configuration options by searching the directory given in the HADOOP_HOME environment variable or, alternatively, by searching the /etc/hadoop directory if the Cloudera distribution (https://www.cloudera.com), i.e., CDH3, is used.
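The lookup order described above might be sketched as follows. The helper resolve_hadoop_home() and its cdh_dir argument are hypothetical, and the exact precedence used by .hinit() is an assumption; only the two locations (HADOOP_HOME and /etc/hadoop) come from this page.

```r
# Hypothetical helper; the name and the exact fallback order are assumptions,
# not the logic of .hinit() itself.
resolve_hadoop_home <- function(hadoop_home = NULL, cdh_dir = "/etc/hadoop") {
  # 1. An explicitly supplied path wins.
  if (!is.null(hadoop_home)) {
    return(hadoop_home)
  }
  # 2. Otherwise use the HADOOP_HOME environment variable, if set and non-empty.
  from_env <- Sys.getenv("HADOOP_HOME", unset = NA_character_)
  if (!is.na(from_env) && nzchar(from_env)) {
    return(from_env)
  }
  # 3. Otherwise fall back to the system-wide directory, if it exists.
  if (dir.exists(cdh_dir)) {
    return(cdh_dir)
  }
  NA_character_                         # nothing found
}

resolve_hadoop_home("/opt/hadoop")      # an explicit path is returned unchanged
```

Returning NA_character_ when nothing is found leaves the decision about how to report the missing installation to the caller.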
The functions hive_start() and hive_stop() are used to start/stop the Hadoop framework. The latter is not applicable for system-wide installations like CDH3.
The function hive_is_available() is used to check the status of a Hadoop cluster.
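A typical session checks the status, starts the framework if needed, does some work, and stops the framework again. The sketch below shows that pattern with the three operations passed in as functions, so it runs without a Hadoop installation. with_cluster() and its argument names are hypothetical, and the stubs only simulate a cluster; the sketch also assumes the framework is available immediately after it has been started.

```r
# Hypothetical wrapper; the start/stop/status operations are passed in as
# functions so the sketch does not need a running Hadoop installation.
with_cluster <- function(action, start_fn, stop_fn, available_fn) {
  if (!isTRUE(available_fn())) {
    start_fn()
    # Stop only what this call started, even if 'action' fails.
    on.exit(stop_fn(), add = TRUE)
    if (!isTRUE(available_fn())) {
      stop("Hadoop framework did not become available")
    }
  }
  action()
}

# Stubs that simulate a cluster by flipping a flag in an environment.
state <- new.env()
state$running <- FALSE

result <- with_cluster(
  action       = function() "job done",
  start_fn     = function() state$running <- TRUE,
  stop_fn      = function() state$running <- FALSE,
  available_fn = function() state$running
)
```

With the package loaded, one would presumably pass hive_start, hive_stop and hive_is_available in place of the stubs. If the framework is already running when with_cluster() is called, it is left running afterwards.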
Value
hive() returns an object of class "hive" representing the currently used cluster configuration.
hive_is_available() returns TRUE if the given Hadoop framework is running.
Author(s)
Stefan Theussl
References
Apache Hadoop: https://hadoop.apache.org/.
Cloudera's distribution including Apache Hadoop (CDH): https://www.cloudera.com/downloads/cdh.html.
Examples
## read configuration and initialize a Hadoop cluster:
## Not run: h <- .hinit( "/etc/hadoop" )
## Not run: hive( h )
## start the Hadoop cluster:
## Not run: hive_start()
## check the status of a Hadoop cluster:
## Not run: hive_is_available()
## return cluster configuration 'h':
hive()
## stop the Hadoop cluster:
## Not run: hive_stop()