ml-params | Spark ML - ML Params |
ml-persistence | Spark ML - Model Persistence |
ml-transform-methods | Spark ML - Transform, fit, and predict methods (ml_ interface) |
ml-tuning | Spark ML - Tuning |
ml_aft_survival_regression | Spark ML - Survival Regression |
ml_als | Spark ML - ALS |
ml_als_tidiers | Tidying methods for Spark ML ALS |
ml_approx_nearest_neighbors | Utility functions for LSH models |
ml_approx_similarity_join | Utility functions for LSH models |
ml_association_rules | Frequent Pattern Mining - FPGrowth |
ml_binary_classification_eval | Spark ML - Evaluators |
ml_binary_classification_evaluator | Spark ML - Evaluators |
ml_bisecting_kmeans | Spark ML - Bisecting K-Means Clustering |
ml_chisquare_test | Chi-square hypothesis testing for categorical data. |
ml_classification_eval | Spark ML - Evaluators |
ml_clustering_evaluator | Spark ML - Clustering Evaluator |
ml_compute_cost | Spark ML - K-Means Clustering |
ml_compute_silhouette_measure | Spark ML - K-Means Clustering |
ml_corr | Compute correlation matrix |
ml_cross_validator | Spark ML - Tuning |
ml_decision_tree | Spark ML - Decision Trees |
ml_decision_tree_classifier | Spark ML - Decision Trees |
ml_decision_tree_regressor | Spark ML - Decision Trees |
ml_default_stop_words | Default stop words |
ml_describe_topics | Spark ML - Latent Dirichlet Allocation |
ml_evaluate | Evaluate the Model on a Validation Set |
ml_evaluate.ml_evaluator | Evaluate the Model on a Validation Set |
ml_evaluate.ml_generalized_linear_regression_model | Evaluate the Model on a Validation Set |
ml_evaluate.ml_linear_regression_model | Evaluate the Model on a Validation Set |
ml_evaluate.ml_logistic_regression_model | Evaluate the Model on a Validation Set |
ml_evaluate.ml_model_classification | Evaluate the Model on a Validation Set |
ml_evaluate.ml_model_clustering | Evaluate the Model on a Validation Set |
ml_evaluate.ml_model_generalized_linear_regression | Evaluate the Model on a Validation Set |
ml_evaluate.ml_model_linear_regression | Evaluate the Model on a Validation Set |
ml_evaluate.ml_model_logistic_regression | Evaluate the Model on a Validation Set |
ml_evaluator | Spark ML - Evaluators |
ml_feature_importances | Spark ML - Feature Importance for Tree Models |
ml_find_synonyms | Feature Transformation - Word2Vec (Estimator) |
ml_fit | Spark ML - Transform, fit, and predict methods (ml_ interface) |
ml_fit.default | Spark ML - Transform, fit, and predict methods (ml_ interface) |
ml_fit_and_transform | Spark ML - Transform, fit, and predict methods (ml_ interface) |
ml_fpgrowth | Frequent Pattern Mining - FPGrowth |
ml_freq_itemsets | Frequent Pattern Mining - FPGrowth |
ml_freq_seq_patterns | Frequent Pattern Mining - PrefixSpan |
ml_gaussian_mixture | Spark ML - Gaussian Mixture clustering. |
ml_gbt_classifier | Spark ML - Gradient Boosted Trees |
ml_gbt_regressor | Spark ML - Gradient Boosted Trees |
ml_generalized_linear_regression | Spark ML - Generalized Linear Regression |
ml_glm_tidiers | Tidying methods for Spark ML linear models |
ml_gradient_boosted_trees | Spark ML - Gradient Boosted Trees |
ml_isotonic_regression | Spark ML - Isotonic Regression |
ml_isotonic_regression_tidiers | Tidying methods for Spark ML Isotonic Regression |
ml_is_set | Spark ML - ML Params |
ml_kmeans | Spark ML - K-Means Clustering |
ml_kmeans_cluster_eval | Evaluate a K-mean clustering |
ml_labels | Feature Transformation - StringIndexer (Estimator) |
ml_lda | Spark ML - Latent Dirichlet Allocation |
ml_lda_tidiers | Tidying methods for Spark ML LDA models |
ml_linear_regression | Spark ML - Linear Regression |
ml_linear_svc | Spark ML - LinearSVC |
ml_linear_svc_tidiers | Tidying methods for Spark ML linear svc |
ml_load | Spark ML - Model Persistence |
ml_logistic_regression | Spark ML - Logistic Regression |
ml_logistic_regression_tidiers | Tidying methods for Spark ML Logistic Regression |
ml_log_likelihood | Spark ML - Latent Dirichlet Allocation |
ml_log_perplexity | Spark ML - Latent Dirichlet Allocation |
ml_metrics_binary | Extracts metrics from a fitted table |
ml_metrics_multiclass | Extracts metrics from a fitted table |
ml_metrics_regression | Extracts metrics from a fitted table |
ml_model_data | Extracts data associated with a Spark ML model |
ml_multiclass_classification_evaluator | Spark ML - Evaluators |
ml_multilayer_perceptron | Spark ML - Multilayer Perceptron |
ml_multilayer_perceptron_classifier | Spark ML - Multilayer Perceptron |
ml_multilayer_perceptron_tidiers | Tidying methods for Spark ML MLP |
ml_naive_bayes | Spark ML - Naive-Bayes |
ml_naive_bayes_tidiers | Tidying methods for Spark ML Naive Bayes |
ml_one_vs_rest | Spark ML - OneVsRest |
ml_param | Spark ML - ML Params |
ml_params | Spark ML - ML Params |
ml_param_map | Spark ML - ML Params |
ml_pca | Feature Transformation - PCA (Estimator) |
ml_pca_tidiers | Tidying methods for Spark ML Principal Component Analysis |
ml_pipeline | Spark ML - Pipelines |
ml_power_iteration | Spark ML - Power Iteration Clustering |
ml_predict | Spark ML - Transform, fit, and predict methods (ml_ interface) |
ml_predict.ml_model_classification | Spark ML - Transform, fit, and predict methods (ml_ interface) |
ml_prefixspan | Frequent Pattern Mining - PrefixSpan |
ml_random_forest | Spark ML - Random Forest |
ml_random_forest_classifier | Spark ML - Random Forest |
ml_random_forest_regressor | Spark ML - Random Forest |
ml_recommend | Spark ML - ALS |
ml_regression_evaluator | Spark ML - Evaluators |
ml_save | Spark ML - Model Persistence |
ml_save.ml_model | Spark ML - Model Persistence |
ml_stage | Spark ML - Pipeline stage extraction |
ml_stages | Spark ML - Pipeline stage extraction |
ml_sub_models | Spark ML - Tuning |
ml_summary | Spark ML - Extraction of summary metrics |
ml_survival_regression | Spark ML - Survival Regression |
ml_survival_regression_tidiers | Tidying methods for Spark ML Survival Regression |
ml_topics_matrix | Spark ML - Latent Dirichlet Allocation |
ml_train_validation_split | Spark ML - Tuning |
ml_transform | Spark ML - Transform, fit, and predict methods (ml_ interface) |
ml_tree_feature_importance | Spark ML - Feature Importance for Tree Models |
ml_tree_tidiers | Tidying methods for Spark ML tree models |
ml_uid | Spark ML - UID |
ml_unsupervised_tidiers | Tidying methods for Spark ML unsupervised models |
ml_validation_metrics | Spark ML - Tuning |
ml_vocabulary | Feature Transformation - CountVectorizer (Estimator) |
mutate | Mutate |
sdf-saveload | Save / Load a Spark DataFrame |
sdf-transform-methods | Spark ML - Transform, fit, and predict methods (sdf_ interface) |
sdf_along | Create DataFrame for along Object |
sdf_bind | Bind multiple Spark DataFrames by row and column |
sdf_bind_cols | Bind multiple Spark DataFrames by row and column |
sdf_bind_rows | Bind multiple Spark DataFrames by row and column |
sdf_broadcast | Broadcast hint |
sdf_checkpoint | Checkpoint a Spark DataFrame |
sdf_coalesce | Coalesces a Spark DataFrame |
sdf_collect | Collect a Spark DataFrame into R. |
sdf_copy_to | Copy an Object into Spark |
sdf_crosstab | Cross Tabulation |
sdf_debug_string | Debug Info for Spark DataFrame |
sdf_describe | Compute summary statistics for columns of a data frame |
sdf_dim | Support for Dimension Operations |
sdf_distinct | Invoke distinct on a Spark DataFrame |
sdf_drop_duplicates | Remove duplicates from a Spark DataFrame |
sdf_expand_grid | Create a Spark dataframe containing all combinations of inputs |
sdf_fit | Spark ML - Transform, fit, and predict methods (sdf_ interface) |
sdf_fit_and_transform | Spark ML - Transform, fit, and predict methods (sdf_ interface) |
sdf_from_avro | Convert column(s) from avro format |
sdf_import | Copy an Object into Spark |
sdf_is_streaming | Spark DataFrame is Streaming |
sdf_last_index | Returns the last index of a Spark DataFrame |
sdf_len | Create DataFrame for Length |
sdf_load_parquet | Save / Load a Spark DataFrame |
sdf_load_table | Save / Load a Spark DataFrame |
sdf_ncol | Support for Dimension Operations |
sdf_nrow | Support for Dimension Operations |
sdf_num_partitions | Gets number of partitions of a Spark DataFrame |
sdf_partition | Partition a Spark Dataframe |
sdf_partition_sizes | Compute the number of records within each partition of a Spark DataFrame |
sdf_persist | Persist a Spark DataFrame |
sdf_pivot | Pivot a Spark DataFrame |
sdf_predict | Spark ML - Transform, fit, and predict methods (sdf_ interface) |
sdf_project | Project features onto principal components |
sdf_quantile | Compute (Approximate) Quantiles with a Spark DataFrame |
sdf_random_split | Partition a Spark Dataframe |
sdf_rbeta | Generate random samples from a Beta distribution |
sdf_rbinom | Generate random samples from a binomial distribution |
sdf_rcauchy | Generate random samples from a Cauchy distribution |
sdf_rchisq | Generate random samples from a chi-squared distribution |
sdf_read_column | Read a Column from a Spark DataFrame |
sdf_register | Register a Spark DataFrame |
sdf_repartition | Repartition a Spark DataFrame |
sdf_residuals | Model Residuals |
sdf_residuals.ml_model_generalized_linear_regression | Model Residuals |
sdf_residuals.ml_model_linear_regression | Model Residuals |
sdf_rexp | Generate random samples from an exponential distribution |
sdf_rgamma | Generate random samples from a Gamma distribution |
sdf_rgeom | Generate random samples from a geometric distribution |
sdf_rhyper | Generate random samples from a hypergeometric distribution |
sdf_rlnorm | Generate random samples from a log normal distribution |
sdf_rnorm | Generate random samples from the standard normal distribution |
sdf_rpois | Generate random samples from a Poisson distribution |
sdf_rt | Generate random samples from a t-distribution |
sdf_runif | Generate random samples from the uniform distribution U(0, 1). |
sdf_rweibull | Generate random samples from a Weibull distribution. |
sdf_sample | Randomly Sample Rows from a Spark DataFrame |
sdf_save_parquet | Save / Load a Spark DataFrame |
sdf_save_table | Save / Load a Spark DataFrame |
sdf_schema | Read the Schema of a Spark DataFrame |
sdf_separate_column | Separate a Vector Column into Scalar Columns |
sdf_seq | Create DataFrame for Range |
sdf_sort | Sort a Spark DataFrame |
sdf_sql | Spark DataFrame from SQL |
sdf_to_avro | Convert column(s) to avro format |
sdf_transform | Spark ML - Transform, fit, and predict methods (sdf_ interface) |
sdf_unnest_longer | Unnest longer |
sdf_unnest_wider | Unnest wider |
sdf_weighted_sample | Perform Weighted Random Sampling on a Spark DataFrame |
sdf_with_sequential_id | Add a Sequential ID Column to a Spark DataFrame |
sdf_with_unique_id | Add a Unique ID Column to a Spark DataFrame |
select | Select |
separate | Separate |
spark-api | Access the Spark API |
spark-connections | Manage Spark Connections |
sparklyr_get_backend_port | Return the port number of a 'sparklyr' backend. |
spark_adaptive_query_execution | Retrieves or sets status of Spark AQE |
spark_advisory_shuffle_partition_size | Retrieves or sets advisory size of the shuffle partition |
spark_apply | Apply an R Function in Spark |
spark_apply_bundle | Create Bundle for Spark Apply |
spark_apply_log | Log Writer for Spark Apply |
spark_auto_broadcast_join_threshold | Retrieves or sets the auto broadcast join threshold |
spark_available_versions | Download and install various versions of Spark |
spark_coalesce_initial_num_partitions | Retrieves or sets initial number of shuffle partitions before coalescing |
spark_coalesce_min_num_partitions | Retrieves or sets the minimum number of shuffle partitions after coalescing |
spark_coalesce_shuffle_partitions | Retrieves or sets whether coalescing contiguous shuffle partitions is enabled |
spark_compilation_spec | Define a Spark Compilation Specification |
spark_config | Read Spark Configuration |
spark_config_kubernetes | Kubernetes Configuration |
spark_config_settings | Retrieve Available Settings |
spark_connect | Manage Spark Connections |
spark_connection | Retrieve the Spark Connection Associated with an R Object |
spark_connection-class | spark_connection class |
spark_connection_find | Find Spark Connection |
spark_connection_is_open | Manage Spark Connections |
spark_connect_method | Function that negotiates the connection with the Spark back-end |
spark_context | Access the Spark API |
spark_context_config | Runtime configuration interface for the Spark Context. |
spark_dataframe | Retrieve a Spark DataFrame |
spark_default_compilation_spec | Default Compilation Specification for Spark Extensions |
spark_dependency | Define a Spark dependency |
spark_dependency_fallback | Fallback to Spark Dependency |
spark_disconnect | Manage Spark Connections |
spark_disconnect_all | Manage Spark Connections |
spark_extension | Create Spark Extension |
spark_get_checkpoint_dir | Set/Get Spark checkpoint directory |
spark_home_set | Set the SPARK_HOME environment variable |
spark_ide_columns | Set of functions to provide integration with the RStudio IDE |
spark_ide_connection_actions | Set of functions to provide integration with the RStudio IDE |
spark_ide_connection_closed | Set of functions to provide integration with the RStudio IDE |
spark_ide_connection_open | Set of functions to provide integration with the RStudio IDE |
spark_ide_connection_updated | Set of functions to provide integration with the RStudio IDE |
spark_ide_objects | Set of functions to provide integration with the RStudio IDE |
spark_ide_preview | Set of functions to provide integration with the RStudio IDE |
spark_insert_table | Inserts a Spark DataFrame into a Spark table |
spark_install | Download and install various versions of Spark |
spark_installed_versions | Download and install various versions of Spark |
spark_install_dir | Download and install various versions of Spark |
spark_install_tar | Download and install various versions of Spark |
spark_integ_test_skip | It lets the package know if it should test a particular functionality or not |
spark_jobj | Retrieve a Spark JVM Object Reference |
spark_jobj-class | spark_jobj class |
spark_last_error | Surfaces the last error from Spark captured by internal 'spark_error' function |
spark_load_table | Reads from a Spark Table into a Spark DataFrame. |
spark_log | View Entries in the Spark Log |
spark_read | Read file(s) into a Spark DataFrame using a custom reader |
spark_read_avro | Read Apache Avro data into a Spark DataFrame. |
spark_read_binary | Read binary data into a Spark DataFrame. |
spark_read_csv | Read a CSV file into a Spark DataFrame |
spark_read_delta | Read from Delta Lake into a Spark DataFrame. |
spark_read_image | Read image data into a Spark DataFrame. |
spark_read_jdbc | Read from JDBC connection into a Spark DataFrame. |
spark_read_json | Read a JSON file into a Spark DataFrame |
spark_read_libsvm | Read libsvm file into a Spark DataFrame. |
spark_read_orc | Read a ORC file into a Spark DataFrame |
spark_read_parquet | Read a Parquet file into a Spark DataFrame |
spark_read_source | Read from a generic source into a Spark DataFrame. |
spark_read_table | Reads from a Spark Table into a Spark DataFrame. |
spark_read_text | Read a Text file into a Spark DataFrame |
spark_save_table | Saves a Spark DataFrame as a Spark table |
spark_session | Access the Spark API |
spark_session_config | Runtime configuration interface for the Spark Session |
spark_set_checkpoint_dir | Set/Get Spark checkpoint directory |
spark_statistical_routines | Generate random samples from some distribution |
spark_submit | Manage Spark Connections |
spark_table_name | Generate a Table Name from Expression |
spark_uninstall | Download and install various versions of Spark |
spark_version | Get the Spark Version Associated with a Spark Connection |
spark_version_from_home | Get the Spark Version Associated with a Spark Installation |
spark_web | Open the Spark web interface |
spark_write | Write Spark DataFrame to file using a custom writer |
spark_write_avro | Serialize a Spark DataFrame into Apache Avro format |
spark_write_csv | Write a Spark DataFrame to a CSV |
spark_write_delta | Writes a Spark DataFrame into Delta Lake |
spark_write_jdbc | Writes a Spark DataFrame into a JDBC table |
spark_write_json | Write a Spark DataFrame to a JSON file |
spark_write_orc | Write a Spark DataFrame to a ORC file |
spark_write_parquet | Write a Spark DataFrame to a Parquet file |
spark_write_rds | Write Spark DataFrame to RDS files |
spark_write_source | Writes a Spark DataFrame into a generic source |
spark_write_table | Writes a Spark DataFrame into a Spark table |
spark_write_text | Write a Spark DataFrame to a Text file |
src_databases | Show database list |
stream_find | Find Stream |
stream_generate_test | Generate Test Stream |
stream_id | Spark Stream's Identifier |
stream_lag | Apply lag function to columns of a Spark Streaming DataFrame |
stream_name | Spark Stream's Name |
stream_read_cloudfiles | Read files created by the stream |
stream_read_csv | Read files created by the stream |
stream_read_delta | Read files created by the stream |
stream_read_json | Read files created by the stream |
stream_read_kafka | Read files created by the stream |
stream_read_orc | Read files created by the stream |
stream_read_parquet | Read files created by the stream |
stream_read_socket | Read files created by the stream |
stream_read_table | Read files created by the stream |
stream_read_text | Read files created by the stream |
stream_render | Render Stream |
stream_stats | Stream Statistics |
stream_stop | Stops a Spark Stream |
stream_trigger_continuous | Spark Stream Continuous Trigger |
stream_trigger_interval | Spark Stream Interval Trigger |
stream_view | View Stream |
stream_watermark | Watermark Stream |
stream_write_console | Write files to the stream |
stream_write_csv | Write files to the stream |
stream_write_delta | Write files to the stream |
stream_write_json | Write files to the stream |
stream_write_kafka | Write files to the stream |
stream_write_memory | Write Memory Stream |
stream_write_orc | Write files to the stream |
stream_write_parquet | Write files to the stream |
stream_write_table | Write Stream to Table |
stream_write_text | Write files to the stream |