R Interface to Apache Spark



Documentation for package ‘sparklyr’ version 1.8.6

Help Pages


-- A --

augment.ml_model_aft_survival_regression Tidying methods for Spark ML Survival Regression
augment.ml_model_als Tidying methods for Spark ML ALS
augment.ml_model_bisecting_kmeans Tidying methods for Spark ML unsupervised models
augment.ml_model_decision_tree_classification Tidying methods for Spark ML tree models
augment.ml_model_decision_tree_regression Tidying methods for Spark ML tree models
augment.ml_model_gaussian_mixture Tidying methods for Spark ML unsupervised models
augment.ml_model_gbt_classification Tidying methods for Spark ML tree models
augment.ml_model_gbt_regression Tidying methods for Spark ML tree models
augment.ml_model_generalized_linear_regression Tidying methods for Spark ML linear models
augment.ml_model_isotonic_regression Tidying methods for Spark ML Isotonic Regression
augment.ml_model_kmeans Tidying methods for Spark ML unsupervised models
augment.ml_model_lda Tidying methods for Spark ML LDA models
augment.ml_model_linear_regression Tidying methods for Spark ML linear models
augment.ml_model_linear_svc Tidying methods for Spark ML LinearSVC
augment.ml_model_logistic_regression Tidying methods for Spark ML Logistic Regression
augment.ml_model_multilayer_perceptron_classification Tidying methods for Spark ML MLP
augment.ml_model_naive_bayes Tidying methods for Spark ML Naive Bayes
augment.ml_model_pca Tidying methods for Spark ML Principal Component Analysis
augment.ml_model_random_forest_classification Tidying methods for Spark ML tree models
augment.ml_model_random_forest_regression Tidying methods for Spark ML tree models
augment._ml_model_decision_tree_classification Tidying methods for Spark ML tree models
augment._ml_model_decision_tree_regression Tidying methods for Spark ML tree models
augment._ml_model_gbt_classification Tidying methods for Spark ML tree models
augment._ml_model_gbt_regression Tidying methods for Spark ML tree models
augment._ml_model_linear_regression Tidying methods for Spark ML linear models
augment._ml_model_logistic_regression Tidying methods for Spark ML Logistic Regression
augment._ml_model_random_forest_classification Tidying methods for Spark ML tree models
augment._ml_model_random_forest_regression Tidying methods for Spark ML tree models

-- C --

checkpoint_directory Set/Get Spark checkpoint directory
collect_from_rds Collect Spark data serialized in RDS format into R
compile_package_jars Compile Scala sources into a Java Archive (jar)
connection_config Read configuration values for a connection
copy_to.spark_connection Copy an R Data Frame to Spark

-- D --

distinct Distinct
download_scalac Downloads default Scala Compilers
dplyr_hof dplyr wrappers for Apache Spark higher order functions

-- E --

ensure Enforce Specific Structure for R Objects

-- F --

fill Fill
filter Filter
find_scalac Discover the Scala Compiler
ft_binarizer Feature Transformation - Binarizer (Transformer)
ft_bucketed_random_projection_lsh Feature Transformation - LSH (Estimator)
ft_bucketizer Feature Transformation - Bucketizer (Transformer)
ft_chisq_selector Feature Transformation - ChiSqSelector (Estimator)
ft_count_vectorizer Feature Transformation - CountVectorizer (Estimator)
ft_dct Feature Transformation - Discrete Cosine Transform (DCT) (Transformer)
ft_discrete_cosine_transform Feature Transformation - Discrete Cosine Transform (DCT) (Transformer)
ft_dplyr_transformer Feature Transformation - SQLTransformer
ft_elementwise_product Feature Transformation - ElementwiseProduct (Transformer)
ft_feature_hasher Feature Transformation - FeatureHasher (Transformer)
ft_hashing_tf Feature Transformation - HashingTF (Transformer)
ft_idf Feature Transformation - IDF (Estimator)
ft_imputer Feature Transformation - Imputer (Estimator)
ft_index_to_string Feature Transformation - IndexToString (Transformer)
ft_interaction Feature Transformation - Interaction (Transformer)
ft_lsh Feature Transformation - LSH (Estimator)
ft_lsh_utils Utility functions for LSH models
ft_max_abs_scaler Feature Transformation - MaxAbsScaler (Estimator)
ft_minhash_lsh Feature Transformation - LSH (Estimator)
ft_min_max_scaler Feature Transformation - MinMaxScaler (Estimator)
ft_ngram Feature Transformation - NGram (Transformer)
ft_normalizer Feature Transformation - Normalizer (Transformer)
ft_one_hot_encoder Feature Transformation - OneHotEncoder (Transformer)
ft_one_hot_encoder_estimator Feature Transformation - OneHotEncoderEstimator (Estimator)
ft_pca Feature Transformation - PCA (Estimator)
ft_polynomial_expansion Feature Transformation - PolynomialExpansion (Transformer)
ft_quantile_discretizer Feature Transformation - QuantileDiscretizer (Estimator)
ft_regex_tokenizer Feature Transformation - RegexTokenizer (Transformer)
ft_robust_scaler Feature Transformation - RobustScaler (Estimator)
ft_r_formula Feature Transformation - RFormula (Estimator)
ft_sql_transformer Feature Transformation - SQLTransformer
ft_standard_scaler Feature Transformation - StandardScaler (Estimator)
ft_stop_words_remover Feature Transformation - StopWordsRemover (Transformer)
ft_string_indexer Feature Transformation - StringIndexer (Estimator)
ft_string_indexer_model Feature Transformation - StringIndexer (Estimator)
ft_tokenizer Feature Transformation - Tokenizer (Transformer)
ft_vector_assembler Feature Transformation - VectorAssembler (Transformer)
ft_vector_indexer Feature Transformation - VectorIndexer (Estimator)
ft_vector_slicer Feature Transformation - VectorSlicer (Transformer)
ft_word2vec Feature Transformation - Word2Vec (Estimator)
full_join Full join
full_join.tbl_spark Join Spark tbls.

-- G --

generic_call_interface Generic Call Interface
get_spark_sql_catalog_implementation Retrieve the Spark connection's SQL catalog implementation property
glance.ml_model_aft_survival_regression Tidying methods for Spark ML Survival Regression
glance.ml_model_als Tidying methods for Spark ML ALS
glance.ml_model_bisecting_kmeans Tidying methods for Spark ML unsupervised models
glance.ml_model_decision_tree_classification Tidying methods for Spark ML tree models
glance.ml_model_decision_tree_regression Tidying methods for Spark ML tree models
glance.ml_model_gaussian_mixture Tidying methods for Spark ML unsupervised models
glance.ml_model_gbt_classification Tidying methods for Spark ML tree models
glance.ml_model_gbt_regression Tidying methods for Spark ML tree models
glance.ml_model_generalized_linear_regression Tidying methods for Spark ML linear models
glance.ml_model_isotonic_regression Tidying methods for Spark ML Isotonic Regression
glance.ml_model_kmeans Tidying methods for Spark ML unsupervised models
glance.ml_model_lda Tidying methods for Spark ML LDA models
glance.ml_model_linear_regression Tidying methods for Spark ML linear models
glance.ml_model_linear_svc Tidying methods for Spark ML LinearSVC
glance.ml_model_logistic_regression Tidying methods for Spark ML Logistic Regression
glance.ml_model_multilayer_perceptron_classification Tidying methods for Spark ML MLP
glance.ml_model_naive_bayes Tidying methods for Spark ML Naive Bayes
glance.ml_model_pca Tidying methods for Spark ML Principal Component Analysis
glance.ml_model_random_forest_classification Tidying methods for Spark ML tree models
glance.ml_model_random_forest_regression Tidying methods for Spark ML tree models

-- H --

hive_context Access the Spark API
hive_context_config Runtime configuration interface for Hive
hof_aggregate Apply Aggregate Function to Array Column
hof_array_sort Sorts array using a custom comparator
hof_exists Determine Whether Some Element Exists in an Array Column
hof_filter Filter Array Column
hof_forall Checks whether all elements in an array satisfy a predicate
hof_map_filter Filters a map
hof_map_zip_with Merges two maps into one
hof_transform Transform Array Column
hof_transform_keys Transforms keys of a map
hof_transform_values Transforms values of a map
hof_zip_with Combines 2 Array Columns

-- I --

inner_join Inner join
inner_join.tbl_spark Join Spark tbls.
invoke Invoke a Method on a JVM Object
invoke_new Invoke a Method on a JVM Object
invoke_static Invoke a Method on a JVM Object
is_ml_estimator Spark ML - Transform, fit, and predict methods (ml_ interface)
is_ml_transformer Spark ML - Transform, fit, and predict methods (ml_ interface)

-- J --

jarray Instantiate a Java array with a specific element type.
java_context Access the Spark API
jfloat Instantiate a Java float type.
jfloat_array Instantiate an Array[Float].
join.tbl_spark Join Spark tbls.
j_invoke Invoke a Java function.
j_invoke_new Invoke a Java function.
j_invoke_static Invoke a Java function.

-- L --

left_join Left join
left_join.tbl_spark Join Spark tbls.
list_sparklyr_jars List all sparklyr-*.jar files that have been built
livy_config Create a Spark Configuration for Livy
livy_service_start Start Livy
livy_service_stop Stop Livy

-- M --

ml-params Spark ML - ML Params
ml-persistence Spark ML - Model Persistence
ml-transform-methods Spark ML - Transform, fit, and predict methods (ml_ interface)
ml-tuning Spark ML - Tuning
ml_aft_survival_regression Spark ML - Survival Regression
ml_als Spark ML - ALS
ml_als_tidiers Tidying methods for Spark ML ALS
ml_approx_nearest_neighbors Utility functions for LSH models
ml_approx_similarity_join Utility functions for LSH models
ml_association_rules Frequent Pattern Mining - FPGrowth
ml_binary_classification_eval Spark ML - Evaluators
ml_binary_classification_evaluator Spark ML - Evaluators
ml_bisecting_kmeans Spark ML - Bisecting K-Means Clustering
ml_chisquare_test Chi-square hypothesis testing for categorical data.
ml_classification_eval Spark ML - Evaluators
ml_clustering_evaluator Spark ML - Clustering Evaluator
ml_compute_cost Spark ML - K-Means Clustering
ml_compute_silhouette_measure Spark ML - K-Means Clustering
ml_corr Compute correlation matrix
ml_cross_validator Spark ML - Tuning
ml_decision_tree Spark ML - Decision Trees
ml_decision_tree_classifier Spark ML - Decision Trees
ml_decision_tree_regressor Spark ML - Decision Trees
ml_default_stop_words Default stop words
ml_describe_topics Spark ML - Latent Dirichlet Allocation
ml_evaluate Evaluate the Model on a Validation Set
ml_evaluate.ml_evaluator Evaluate the Model on a Validation Set
ml_evaluate.ml_generalized_linear_regression_model Evaluate the Model on a Validation Set
ml_evaluate.ml_linear_regression_model Evaluate the Model on a Validation Set
ml_evaluate.ml_logistic_regression_model Evaluate the Model on a Validation Set
ml_evaluate.ml_model_classification Evaluate the Model on a Validation Set
ml_evaluate.ml_model_clustering Evaluate the Model on a Validation Set
ml_evaluate.ml_model_generalized_linear_regression Evaluate the Model on a Validation Set
ml_evaluate.ml_model_linear_regression Evaluate the Model on a Validation Set
ml_evaluate.ml_model_logistic_regression Evaluate the Model on a Validation Set
ml_evaluator Spark ML - Evaluators
ml_feature_importances Spark ML - Feature Importance for Tree Models
ml_find_synonyms Feature Transformation - Word2Vec (Estimator)
ml_fit Spark ML - Transform, fit, and predict methods (ml_ interface)
ml_fit.default Spark ML - Transform, fit, and predict methods (ml_ interface)
ml_fit_and_transform Spark ML - Transform, fit, and predict methods (ml_ interface)
ml_fpgrowth Frequent Pattern Mining - FPGrowth
ml_freq_itemsets Frequent Pattern Mining - FPGrowth
ml_freq_seq_patterns Frequent Pattern Mining - PrefixSpan
ml_gaussian_mixture Spark ML - Gaussian Mixture clustering.
ml_gbt_classifier Spark ML - Gradient Boosted Trees
ml_gbt_regressor Spark ML - Gradient Boosted Trees
ml_generalized_linear_regression Spark ML - Generalized Linear Regression
ml_glm_tidiers Tidying methods for Spark ML linear models
ml_gradient_boosted_trees Spark ML - Gradient Boosted Trees
ml_isotonic_regression Spark ML - Isotonic Regression
ml_isotonic_regression_tidiers Tidying methods for Spark ML Isotonic Regression
ml_is_set Spark ML - ML Params
ml_kmeans Spark ML - K-Means Clustering
ml_kmeans_cluster_eval Evaluate a K-means clustering
ml_labels Feature Transformation - StringIndexer (Estimator)
ml_lda Spark ML - Latent Dirichlet Allocation
ml_lda_tidiers Tidying methods for Spark ML LDA models
ml_linear_regression Spark ML - Linear Regression
ml_linear_svc Spark ML - LinearSVC
ml_linear_svc_tidiers Tidying methods for Spark ML LinearSVC
ml_load Spark ML - Model Persistence
ml_logistic_regression Spark ML - Logistic Regression
ml_logistic_regression_tidiers Tidying methods for Spark ML Logistic Regression
ml_log_likelihood Spark ML - Latent Dirichlet Allocation
ml_log_perplexity Spark ML - Latent Dirichlet Allocation
ml_metrics_binary Extracts metrics from a fitted table
ml_metrics_multiclass Extracts metrics from a fitted table
ml_metrics_regression Extracts metrics from a fitted table
ml_model_data Extracts data associated with a Spark ML model
ml_multiclass_classification_evaluator Spark ML - Evaluators
ml_multilayer_perceptron Spark ML - Multilayer Perceptron
ml_multilayer_perceptron_classifier Spark ML - Multilayer Perceptron
ml_multilayer_perceptron_tidiers Tidying methods for Spark ML MLP
ml_naive_bayes Spark ML - Naive Bayes
ml_naive_bayes_tidiers Tidying methods for Spark ML Naive Bayes
ml_one_vs_rest Spark ML - OneVsRest
ml_param Spark ML - ML Params
ml_params Spark ML - ML Params
ml_param_map Spark ML - ML Params
ml_pca Feature Transformation - PCA (Estimator)
ml_pca_tidiers Tidying methods for Spark ML Principal Component Analysis
ml_pipeline Spark ML - Pipelines
ml_power_iteration Spark ML - Power Iteration Clustering
ml_predict Spark ML - Transform, fit, and predict methods (ml_ interface)
ml_predict.ml_model_classification Spark ML - Transform, fit, and predict methods (ml_ interface)
ml_prefixspan Frequent Pattern Mining - PrefixSpan
ml_random_forest Spark ML - Random Forest
ml_random_forest_classifier Spark ML - Random Forest
ml_random_forest_regressor Spark ML - Random Forest
ml_recommend Spark ML - ALS
ml_regression_evaluator Spark ML - Evaluators
ml_save Spark ML - Model Persistence
ml_save.ml_model Spark ML - Model Persistence
ml_stage Spark ML - Pipeline stage extraction
ml_stages Spark ML - Pipeline stage extraction
ml_sub_models Spark ML - Tuning
ml_summary Spark ML - Extraction of summary metrics
ml_survival_regression Spark ML - Survival Regression
ml_survival_regression_tidiers Tidying methods for Spark ML Survival Regression
ml_topics_matrix Spark ML - Latent Dirichlet Allocation
ml_train_validation_split Spark ML - Tuning
ml_transform Spark ML - Transform, fit, and predict methods (ml_ interface)
ml_tree_feature_importance Spark ML - Feature Importance for Tree Models
ml_tree_tidiers Tidying methods for Spark ML tree models
ml_uid Spark ML - UID
ml_unsupervised_tidiers Tidying methods for Spark ML unsupervised models
ml_validation_metrics Spark ML - Tuning
ml_vocabulary Feature Transformation - CountVectorizer (Estimator)
mutate Mutate

-- N --

na.replace Replace Missing Values in Objects
nest Nest

-- P --

pivot_longer Pivot longer
pivot_wider Pivot wider

-- R --

random_string Random string generation
reactiveSpark Reactive spark reader
registerDoSpark Register a Parallel Backend
registered_extensions Register a Package that Implements a Spark Extension
register_extension Register a Package that Implements a Spark Extension
replace_na Replace NA
right_join Right join
right_join.tbl_spark Join Spark tbls.

-- S --

sdf-saveload Save / Load a Spark DataFrame
sdf-transform-methods Spark ML - Transform, fit, and predict methods (sdf_ interface)
sdf_along Create DataFrame for along Object
sdf_bind Bind multiple Spark DataFrames by row and column
sdf_bind_cols Bind multiple Spark DataFrames by row and column
sdf_bind_rows Bind multiple Spark DataFrames by row and column
sdf_broadcast Broadcast hint
sdf_checkpoint Checkpoint a Spark DataFrame
sdf_coalesce Coalesces a Spark DataFrame
sdf_collect Collect a Spark DataFrame into R.
sdf_copy_to Copy an Object into Spark
sdf_crosstab Cross Tabulation
sdf_debug_string Debug Info for Spark DataFrame
sdf_describe Compute summary statistics for columns of a data frame
sdf_dim Support for Dimension Operations
sdf_distinct Invoke distinct on a Spark DataFrame
sdf_drop_duplicates Remove duplicates from a Spark DataFrame
sdf_expand_grid Create a Spark DataFrame containing all combinations of inputs
sdf_fit Spark ML - Transform, fit, and predict methods (sdf_ interface)
sdf_fit_and_transform Spark ML - Transform, fit, and predict methods (sdf_ interface)
sdf_from_avro Convert column(s) from avro format
sdf_import Copy an Object into Spark
sdf_is_streaming Spark DataFrame is Streaming
sdf_last_index Returns the last index of a Spark DataFrame
sdf_len Create DataFrame for Length
sdf_load_parquet Save / Load a Spark DataFrame
sdf_load_table Save / Load a Spark DataFrame
sdf_ncol Support for Dimension Operations
sdf_nrow Support for Dimension Operations
sdf_num_partitions Gets number of partitions of a Spark DataFrame
sdf_partition Partition a Spark DataFrame
sdf_partition_sizes Compute the number of records within each partition of a Spark DataFrame
sdf_persist Persist a Spark DataFrame
sdf_pivot Pivot a Spark DataFrame
sdf_predict Spark ML - Transform, fit, and predict methods (sdf_ interface)
sdf_project Project features onto principal components
sdf_quantile Compute (Approximate) Quantiles with a Spark DataFrame
sdf_random_split Partition a Spark DataFrame
sdf_rbeta Generate random samples from a Beta distribution
sdf_rbinom Generate random samples from a binomial distribution
sdf_rcauchy Generate random samples from a Cauchy distribution
sdf_rchisq Generate random samples from a chi-squared distribution
sdf_read_column Read a Column from a Spark DataFrame
sdf_register Register a Spark DataFrame
sdf_repartition Repartition a Spark DataFrame
sdf_residuals Model Residuals
sdf_residuals.ml_model_generalized_linear_regression Model Residuals
sdf_residuals.ml_model_linear_regression Model Residuals
sdf_rexp Generate random samples from an exponential distribution
sdf_rgamma Generate random samples from a Gamma distribution
sdf_rgeom Generate random samples from a geometric distribution
sdf_rhyper Generate random samples from a hypergeometric distribution
sdf_rlnorm Generate random samples from a log normal distribution
sdf_rnorm Generate random samples from the standard normal distribution
sdf_rpois Generate random samples from a Poisson distribution
sdf_rt Generate random samples from a t-distribution
sdf_runif Generate random samples from the uniform distribution U(0, 1).
sdf_rweibull Generate random samples from a Weibull distribution.
sdf_sample Randomly Sample Rows from a Spark DataFrame
sdf_save_parquet Save / Load a Spark DataFrame
sdf_save_table Save / Load a Spark DataFrame
sdf_schema Read the Schema of a Spark DataFrame
sdf_separate_column Separate a Vector Column into Scalar Columns
sdf_seq Create DataFrame for Range
sdf_sort Sort a Spark DataFrame
sdf_sql Spark DataFrame from SQL
sdf_to_avro Convert column(s) to avro format
sdf_transform Spark ML - Transform, fit, and predict methods (sdf_ interface)
sdf_unnest_longer Unnest longer
sdf_unnest_wider Unnest wider
sdf_weighted_sample Perform Weighted Random Sampling on a Spark DataFrame
sdf_with_sequential_id Add a Sequential ID Column to a Spark DataFrame
sdf_with_unique_id Add a Unique ID Column to a Spark DataFrame
select Select
separate Separate
spark-api Access the Spark API
spark-connections Manage Spark Connections
sparklyr_get_backend_port Return the port number of a 'sparklyr' backend.
spark_adaptive_query_execution Retrieves or sets status of Spark AQE
spark_advisory_shuffle_partition_size Retrieves or sets advisory size of the shuffle partition
spark_apply Apply an R Function in Spark
spark_apply_bundle Create Bundle for Spark Apply
spark_apply_log Log Writer for Spark Apply
spark_auto_broadcast_join_threshold Retrieves or sets the auto broadcast join threshold
spark_available_versions Download and install various versions of Spark
spark_coalesce_initial_num_partitions Retrieves or sets initial number of shuffle partitions before coalescing
spark_coalesce_min_num_partitions Retrieves or sets the minimum number of shuffle partitions after coalescing
spark_coalesce_shuffle_partitions Retrieves or sets whether coalescing contiguous shuffle partitions is enabled
spark_compilation_spec Define a Spark Compilation Specification
spark_config Read Spark Configuration
spark_config_kubernetes Kubernetes Configuration
spark_config_settings Retrieve Available Settings
spark_connect Manage Spark Connections
spark_connection Retrieve the Spark Connection Associated with an R Object
spark_connection-class spark_connection class
spark_connection_find Find Spark Connection
spark_connection_is_open Manage Spark Connections
spark_connect_method Function that negotiates the connection with the Spark back-end
spark_context Access the Spark API
spark_context_config Runtime configuration interface for the Spark Context.
spark_dataframe Retrieve a Spark DataFrame
spark_default_compilation_spec Default Compilation Specification for Spark Extensions
spark_dependency Define a Spark dependency
spark_dependency_fallback Fallback to Spark Dependency
spark_disconnect Manage Spark Connections
spark_disconnect_all Manage Spark Connections
spark_extension Create Spark Extension
spark_get_checkpoint_dir Set/Get Spark checkpoint directory
spark_home_set Set the SPARK_HOME environment variable
spark_ide_columns Set of functions to provide integration with the RStudio IDE
spark_ide_connection_actions Set of functions to provide integration with the RStudio IDE
spark_ide_connection_closed Set of functions to provide integration with the RStudio IDE
spark_ide_connection_open Set of functions to provide integration with the RStudio IDE
spark_ide_connection_updated Set of functions to provide integration with the RStudio IDE
spark_ide_objects Set of functions to provide integration with the RStudio IDE
spark_ide_preview Set of functions to provide integration with the RStudio IDE
spark_insert_table Inserts a Spark DataFrame into a Spark table
spark_install Download and install various versions of Spark
spark_installed_versions Download and install various versions of Spark
spark_install_dir Download and install various versions of Spark
spark_install_tar Download and install various versions of Spark
spark_integ_test_skip Tells the package whether it should test a particular functionality
spark_jobj Retrieve a Spark JVM Object Reference
spark_jobj-class spark_jobj class
spark_last_error Surfaces the last error from Spark captured by internal 'spark_error' function
spark_load_table Reads from a Spark Table into a Spark DataFrame.
spark_log View Entries in the Spark Log
spark_read Read file(s) into a Spark DataFrame using a custom reader
spark_read_avro Read Apache Avro data into a Spark DataFrame.
spark_read_binary Read binary data into a Spark DataFrame.
spark_read_csv Read a CSV file into a Spark DataFrame
spark_read_delta Read from Delta Lake into a Spark DataFrame.
spark_read_image Read image data into a Spark DataFrame.
spark_read_jdbc Read from JDBC connection into a Spark DataFrame.
spark_read_json Read a JSON file into a Spark DataFrame
spark_read_libsvm Read libsvm file into a Spark DataFrame.
spark_read_orc Read an ORC file into a Spark DataFrame
spark_read_parquet Read a Parquet file into a Spark DataFrame
spark_read_source Read from a generic source into a Spark DataFrame.
spark_read_table Reads from a Spark Table into a Spark DataFrame.
spark_read_text Read a Text file into a Spark DataFrame
spark_save_table Saves a Spark DataFrame as a Spark table
spark_session Access the Spark API
spark_session_config Runtime configuration interface for the Spark Session
spark_set_checkpoint_dir Set/Get Spark checkpoint directory
spark_statistical_routines Generate random samples from some distribution
spark_submit Manage Spark Connections
spark_table_name Generate a Table Name from Expression
spark_uninstall Download and install various versions of Spark
spark_version Get the Spark Version Associated with a Spark Connection
spark_version_from_home Get the Spark Version Associated with a Spark Installation
spark_web Open the Spark web interface
spark_write Write Spark DataFrame to file using a custom writer
spark_write_avro Serialize a Spark DataFrame into Apache Avro format
spark_write_csv Write a Spark DataFrame to a CSV
spark_write_delta Writes a Spark DataFrame into Delta Lake
spark_write_jdbc Writes a Spark DataFrame into a JDBC table
spark_write_json Write a Spark DataFrame to a JSON file
spark_write_orc Write a Spark DataFrame to an ORC file
spark_write_parquet Write a Spark DataFrame to a Parquet file
spark_write_rds Write Spark DataFrame to RDS files
spark_write_source Writes a Spark DataFrame into a generic source
spark_write_table Writes a Spark DataFrame into a Spark table
spark_write_text Write a Spark DataFrame to a Text file
src_databases Show database list
stream_find Find Stream
stream_generate_test Generate Test Stream
stream_id Spark Stream's Identifier
stream_lag Apply lag function to columns of a Spark Streaming DataFrame
stream_name Spark Stream's Name
stream_read_cloudfiles Read files created by the stream
stream_read_csv Read files created by the stream
stream_read_delta Read files created by the stream
stream_read_json Read files created by the stream
stream_read_kafka Read files created by the stream
stream_read_orc Read files created by the stream
stream_read_parquet Read files created by the stream
stream_read_socket Read files created by the stream
stream_read_table Read files created by the stream
stream_read_text Read files created by the stream
stream_render Render Stream
stream_stats Stream Statistics
stream_stop Stops a Spark Stream
stream_trigger_continuous Spark Stream Continuous Trigger
stream_trigger_interval Spark Stream Interval Trigger
stream_view View Stream
stream_watermark Watermark Stream
stream_write_console Write files to the stream
stream_write_csv Write files to the stream
stream_write_delta Write files to the stream
stream_write_json Write files to the stream
stream_write_kafka Write files to the stream
stream_write_memory Write Memory Stream
stream_write_orc Write files to the stream
stream_write_parquet Write files to the stream
stream_write_table Write Stream to Table
stream_write_text Write files to the stream

-- T --

tbl_cache Cache a Spark Table
tbl_change_db Use specific database
tbl_uncache Uncache a Spark Table
tidy.ml_model_aft_survival_regression Tidying methods for Spark ML Survival Regression
tidy.ml_model_als Tidying methods for Spark ML ALS
tidy.ml_model_bisecting_kmeans Tidying methods for Spark ML unsupervised models
tidy.ml_model_decision_tree_classification Tidying methods for Spark ML tree models
tidy.ml_model_decision_tree_regression Tidying methods for Spark ML tree models
tidy.ml_model_gaussian_mixture Tidying methods for Spark ML unsupervised models
tidy.ml_model_gbt_classification Tidying methods for Spark ML tree models
tidy.ml_model_gbt_regression Tidying methods for Spark ML tree models
tidy.ml_model_generalized_linear_regression Tidying methods for Spark ML linear models
tidy.ml_model_isotonic_regression Tidying methods for Spark ML Isotonic Regression
tidy.ml_model_kmeans Tidying methods for Spark ML unsupervised models
tidy.ml_model_lda Tidying methods for Spark ML LDA models
tidy.ml_model_linear_regression Tidying methods for Spark ML linear models
tidy.ml_model_linear_svc Tidying methods for Spark ML LinearSVC
tidy.ml_model_logistic_regression Tidying methods for Spark ML Logistic Regression
tidy.ml_model_multilayer_perceptron_classification Tidying methods for Spark ML MLP
tidy.ml_model_naive_bayes Tidying methods for Spark ML Naive Bayes
tidy.ml_model_pca Tidying methods for Spark ML Principal Component Analysis
tidy.ml_model_random_forest_classification Tidying methods for Spark ML tree models
tidy.ml_model_random_forest_regression Tidying methods for Spark ML tree models
transform_sdf Transform a subset of column(s) in a Spark DataFrame

-- U --

unite Unite
unnest Unnest

-- misc --

%->% Infix operator for composing a lambda expression
[.tbl_spark Subsetting operator for Spark DataFrame