User-Friendly R Package for Supervised Machine Learning Pipelines

Documentation for package ‘mikropml’ version 1.6.1

Help Pages

bootstrap_performance	Calculate a bootstrap confidence interval for the performance on a single train/test split
calc_balanced_precision	Calculate balanced precision given actual and baseline precision
calc_baseline_precision	Calculate the fraction of positives, i.e. baseline precision for a PRC curve
calc_mean_perf	Generic function to calculate mean performance curves for multiple models
calc_mean_prc	Calculate and summarize performance for ROC and PRC plots
calc_mean_roc	Calculate and summarize performance for ROC and PRC plots
calc_model_sensspec	Calculate and summarize performance for ROC and PRC plots
calc_perf_metrics	Get performance metrics for test data
combine_hp_performance	Combine hyperparameter performance metrics for multiple train/test splits
compare_models	Perform permutation tests to compare the performance metric across all pairs of a group variable.
define_cv	Define cross-validation scheme and training parameters
get_caret_processed_df	Get preprocessed dataframe for continuous variables
get_feature_importance	Get feature importance using the permutation method
get_hp_performance	Get hyperparameter performance metrics
get_hyperparams_list	Set hyperparameters based on ML method and dataset characteristics
get_outcome_type	Get outcome type.
get_partition_indices	Select indices to partition the data into training & testing sets.
get_performance_tbl	Get model performance metrics as a one-row tibble
get_perf_metric_fn	Get default performance metric function
get_perf_metric_name	Get default performance metric name
get_tuning_grid	Generate the tuning grid for tuning hyperparameters
group_correlated_features	Group correlated features
otu_data_preproc	Mini OTU abundance dataset - preprocessed
otu_mini_bin	Mini OTU abundance dataset
otu_mini_bin_results_glmnet	Results from running the pipeline with L2 logistic regression on 'otu_mini_bin' with feature importance and grouping
otu_mini_bin_results_rf	Results from running the pipeline with random forest on 'otu_mini_bin'
otu_mini_bin_results_rpart2	Results from running the pipeline with rpart2 on 'otu_mini_bin'
otu_mini_bin_results_svmRadial	Results from running the pipeline with svmRadial on 'otu_mini_bin'
otu_mini_bin_results_xgbTree	Results from running the pipeline with xgbTree on 'otu_mini_bin'
otu_mini_cont_results_glmnet	Results from running the pipeline with glmnet on 'otu_mini_bin' with 'Otu00001' as the outcome
otu_mini_cont_results_nocv	Results from running the pipeline with glmnet on 'otu_mini_bin' with 'Otu00001' as the outcome column, using a custom train control scheme that does not perform cross-validation
otu_mini_cv	Cross validation on 'train_data_mini' with grouped features.
otu_mini_multi	Mini OTU abundance dataset with 3 categorical variables
otu_mini_multi_group	Groups for otu_mini_multi
otu_mini_multi_results_glmnet	Results from running the pipeline with glmnet on 'otu_mini_multi' for multiclass outcomes
otu_small	Small OTU abundance dataset
permute_p_value	Calculate a permuted p-value comparing two models
plot_curves	Plot ROC and PRC curves
plot_hp_performance	Plot hyperparameter performance metrics
plot_mean_prc	Plot ROC and PRC curves
plot_mean_roc	Plot ROC and PRC curves
plot_model_performance	Plot performance metrics for multiple ML runs with different parameters
preprocess_data	Preprocess data prior to running machine learning
randomize_feature_order	Randomize feature order to eliminate any position-dependent effects
remove_singleton_columns	Remove columns appearing in only 'threshold' row(s) or fewer.
replace_spaces	Replace spaces in all elements of a character vector with underscores
run_ml	Run the machine learning pipeline
sensspec	Calculate and summarize performance for ROC and PRC plots
tidy_perf_data	Tidy the performance dataframe
train_model	Train model using 'caret::train()'.
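
As a rough orientation for how the entries above fit together, the sketch below runs the pipeline once on the bundled 'otu_mini_bin' dataset. It assumes that the package is installed, that the outcome column of 'otu_mini_bin' is named 'dx', and that the 'run_ml()' arguments shown ('outcome_colname', 'kfold', 'cv_times', 'seed') match version 1.6.1. The small 'kfold' and 'cv_times' values are chosen only to keep the run short. Check '?run_ml' before relying on any of this.

```r
# Minimal sketch, not a recommended analysis configuration.
library(mikropml)

# Run the pipeline with glmnet on the bundled mini dataset.
# Assumption: the outcome column in otu_mini_bin is called "dx".
results <- run_ml(
  otu_mini_bin,
  "glmnet",
  outcome_colname = "dx",
  kfold = 2,      # few folds, for speed only
  cv_times = 5,   # few CV repeats, for speed only
  seed = 2019     # fixed seed so the train/test split is reproducible
)

# The returned list is assumed to hold the trained model, the held-out
# test data, and a table of performance metrics.
names(results)
results$performance
```

For real analyses, the run would typically be repeated over many seeds, with the per-seed outputs then combined and plotted using helpers listed above such as 'combine_hp_performance' and 'plot_model_performance'.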