spark-api {sparklyr} | R Documentation |
Access the Spark API
Description
Access the commonly-used Spark objects associated with a Spark instance. These objects provide access to different facets of the Spark API.
Usage
spark_context(sc)
java_context(sc)
hive_context(sc)
spark_session(sc)
Arguments
sc
A spark_connection.
Details
The Scala API documentation is useful for discovering which methods are available on each of these objects. Use invoke to call methods on these objects.
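As a minimal sketch of this pattern (assuming a local Spark installation is available to connect to), the Spark Context can be retrieved and its methods called via invoke:

```r
library(sparklyr)

# Connect to a local Spark instance (assumes Spark is installed locally)
sc <- spark_connect(master = "local")

# Retrieve the underlying SparkContext and call methods on it
ctx <- spark_context(sc)
invoke(ctx, "version")             # the Spark version string
invoke(ctx, "defaultParallelism")  # default number of partitions

spark_disconnect(sc)
```

Any public method documented for org.apache.spark.SparkContext in the Scala API can be called this way; the method name is passed as a string, followed by its arguments.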
Spark Context
The main entry point for Spark functionality. The Spark Context represents the connection to a Spark cluster, and can be used to create RDDs, accumulators, and broadcast variables on that cluster.
Java Spark Context
A Java-friendly version of the aforementioned Spark Context.
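The Java-friendly context can be invoked in the same way as the Scala one. A brief sketch (again assuming a local Spark installation):

```r
library(sparklyr)

sc <- spark_connect(master = "local")

# JavaSparkContext wraps the SparkContext with Java-friendly signatures;
# its methods are called with invoke just like any other Spark object
jctx <- java_context(sc)
invoke(jctx, "appName")

spark_disconnect(sc)
```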
Hive Context
An instance of the Spark SQL execution engine that integrates with data stored in Hive. Configuration for Hive is read from hive-site.xml on the classpath.
Starting with Spark 2.0.0, the Hive Context class has been deprecated: it is superseded by the Spark Session class, and hive_context will return a Spark Session object instead.
Note that both classes share a SQL interface, and therefore one can invoke
SQL through these objects.
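A minimal sketch of issuing SQL through the session object (assuming a local Spark installation; with hive_context on Spark 2.0.0+ the same calls apply, since it returns a Spark Session):

```r
library(sparklyr)

sc <- spark_connect(master = "local")

# The Spark Session exposes a sql() method, so SQL statements can be
# run directly against it; the result is a Spark DataFrame object
session <- spark_session(sc)
df <- invoke(session, "sql", "SELECT 1 AS x")
invoke(df, "count")  # number of rows in the result

spark_disconnect(sc)
```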
Spark Session
Available since Spark 2.0.0, the Spark Session unifies the Spark Context and Hive Context classes into a single interface. Its use is recommended over the older APIs for code targeting Spark 2.0.0 and above.