install_pyspark {pysparklyr} | R Documentation
Installs PySpark and Python dependencies
Description
install_pyspark() installs PySpark and its Python dependencies.
install_databricks() installs Databricks Connect and its Python dependencies.
Usage
install_pyspark(
  version = NULL,
  envname = NULL,
  python_version = NULL,
  new_env = TRUE,
  method = c("auto", "virtualenv", "conda"),
  as_job = TRUE,
  install_ml = FALSE,
  ...
)

install_databricks(
  version = NULL,
  cluster_id = NULL,
  envname = NULL,
  python_version = NULL,
  new_env = TRUE,
  method = c("auto", "virtualenv", "conda"),
  as_job = TRUE,
  install_ml = FALSE,
  ...
)
Arguments
version
Version of 'databricks.connect' to install. Defaults to NULL.

envname
The name of the Python environment to use to install the Python libraries. Defaults to NULL.

python_version
The minimum required version of Python to use to create the Python environment. Defaults to NULL.

new_env
If TRUE, creates a new Python environment before installing, rather than installing into an existing one. Defaults to TRUE.

method
The installation method to use. One of "auto", "virtualenv", or "conda"; defaults to "auto". If creating a new environment, "auto" selects a method appropriate for the local setup.

as_job
If TRUE, and the function is used within the RStudio IDE, the installation runs as a background job. Defaults to TRUE.

install_ml
Installs the ML-related Python libraries. Defaults to FALSE. This is mainly for machines with limited storage: it avoids installing the rather large 'torch' library unless the ML features are actually going to be used. This applies to any environment backed by 'Spark' version 3.5 or above.

...
Passed on to reticulate::py_install().

cluster_id
The ID of the Databricks cluster to target. If provided, this value will be used to extract the cluster's version and select the matching version of 'databricks.connect' (see the example below).
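
To illustrate how these arguments combine, a minimal sketch; the version number, environment name, and cluster ID below are placeholders, not values prescribed by the package:

library(pysparklyr)

# Pin a PySpark version into a named environment and include
# the ML-related libraries (all values are placeholders)
install_pyspark(
  version = "3.5.0",
  envname = "r-pyspark-3.5",
  install_ml = TRUE
)

# Let the target cluster determine which version of
# 'databricks.connect' to install (placeholder cluster ID)
install_databricks(cluster_id = "0608-170338-jwkec0wi")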
Value
Returns no value to the R session. The purpose of these functions is to create the Python environment and install the appropriate set of Python libraries inside the new environment. While running, they send messages to the console describing the steps being taken; for example, they report when they query 'PyPi.org' for the latest version of a Python library, along with the result of that query.
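
Examples

A minimal sketch of typical calls; the 'databricks.connect' version shown is a placeholder, not a value the package prescribes.

## Not run: 
library(pysparklyr)

# Install the latest PySpark available from PyPi into a new environment
install_pyspark()

# Install a specific version of 'databricks.connect'
# (placeholder version; match it to your cluster's runtime)
install_databricks(version = "14.1")
## End(Not run)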