R: Installs PySpark and Python dependencies

install_pyspark {pysparklyr}

R Documentation

Installs PySpark and Python dependencies

Description

`install_pyspark()` installs PySpark and its Python dependencies.

`install_databricks()` installs Databricks Connect and its Python dependencies.

Usage

install_pyspark(
  version = NULL,
  envname = NULL,
  python_version = NULL,
  new_env = TRUE,
  method = c("auto", "virtualenv", "conda"),
  as_job = TRUE,
  install_ml = FALSE,
  ...
)

install_databricks(
  version = NULL,
  cluster_id = NULL,
  envname = NULL,
  python_version = NULL,
  new_env = TRUE,
  method = c("auto", "virtualenv", "conda"),
  as_job = TRUE,
  install_ml = FALSE,
  ...
)

Arguments

`version`	Version of 'databricks.connect' to install. Defaults to `NULL`. If `NULL`, the current library version is looked up on PyPI.
`envname`	The name of the Python environment in which to install the Python libraries. Defaults to `NULL`. If `NULL`, a name is assigned automatically based on the version that will be installed.
`python_version`	The minimum required version of Python to use to create the Python environment. Defaults to `NULL`. If `NULL`, the minimum required Python version is looked up on PyPI.
`new_env`	If `TRUE`, any existing Python virtual environment and/or Conda environment specified by `envname` is deleted first.
`method`	The installation method to use. If creating a new environment, `"auto"` (the default) is equivalent to `"virtualenv"`. Otherwise `"auto"` infers the installation method based on the type of Python environment specified by `envname`.
`as_job`	If `TRUE` (the default), runs the installation as an RStudio job when this function is called from within the RStudio IDE.
`install_ml`	If `TRUE`, installs the ML-related Python libraries. Defaults to `FALSE`. Leaving it off avoids installing the rather large 'torch' library, which is mainly useful on machines with limited storage when the ML features are not going to be used. This applies to any environment backed by 'Spark' version 3.5 or above.
`...`	Passed on to `reticulate::py_install()`
`cluster_id`	ID of the target cluster that the environment will be used with. If provided, this value is used to look up the cluster's version.

Value

It returns no value to the R session. The purpose of this function is to create the 'Python' environment and install the appropriate set of 'Python' libraries inside the new environment. While it runs, the function sends messages to the console describing the steps it is taking. For example, it lets the user know when it is querying 'PyPI.org' for the latest version of the Python library, and reports the result of that query.
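
Examples

The following is a minimal sketch of how the arguments described above might be combined; it is not taken from the package itself. The version string and cluster ID are placeholder values chosen for illustration, and the `run_install` flag is a hypothetical guard added so that nothing is downloaded unless it is switched on. The `install_databricks()` call also assumes that Databricks credentials are already configured for the session.

```r
# Sketch only: the values below are placeholders, not recommendations.
run_install <- FALSE  # hypothetical guard; set to TRUE to really install

# Arguments for a fresh PySpark environment without the ML libraries
pyspark_args <- list(
  version    = "3.5",         # placeholder; NULL would query PyPI instead
  new_env    = TRUE,          # delete an existing environment of the same name
  method     = "virtualenv",  # same as "auto" when creating a new environment
  as_job     = FALSE,         # do not hand the work off to an RStudio job
  install_ml = FALSE          # skip the large 'torch' dependency
)

# Arguments for Databricks Connect, letting the cluster determine the version
databricks_args <- list(
  cluster_id = "0000-000000-abcdefgh",  # hypothetical cluster ID
  install_ml = FALSE
)

# Only call the installers when explicitly enabled and pysparklyr is available
if (run_install && requireNamespace("pysparklyr", quietly = TRUE)) {
  do.call(pysparklyr::install_pyspark, pyspark_args)
  do.call(pysparklyr::install_databricks, databricks_args)
}
```

Because neither function returns a value, there is nothing to assign; progress is reported through console messages only.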

[Package pysparklyr version 0.1.5 Index]