bootstrap_performance | Calculate a bootstrap confidence interval for the performance on a single train/test split |
calc_balanced_precision | Calculate balanced precision given actual and baseline precision |
calc_baseline_precision | Calculate the fraction of positives, i.e. baseline precision for a PRC curve |
calc_mean_perf | Generic function to calculate mean performance curves for multiple models |
calc_mean_prc | Calculate and summarize performance for ROC and PRC plots |
calc_mean_roc | Calculate and summarize performance for ROC and PRC plots |
calc_model_sensspec | Calculate and summarize performance for ROC and PRC plots |
calc_perf_metrics | Get performance metrics for test data |
combine_hp_performance | Combine hyperparameter performance metrics for multiple train/test splits |
compare_models | Perform permutation tests to compare the performance metric across all pairs of a group variable |
define_cv | Define cross-validation scheme and training parameters |
get_caret_processed_df | Get preprocessed dataframe for continuous variables |
get_feature_importance | Get feature importance using the permutation method |
get_hp_performance | Get hyperparameter performance metrics |
get_hyperparams_list | Set hyperparameters based on ML method and dataset characteristics |
get_outcome_type | Get outcome type |
get_partition_indices | Select indices to partition the data into training & testing sets |
get_performance_tbl | Get model performance metrics as a one-row tibble |
get_perf_metric_fn | Get default performance metric function |
get_perf_metric_name | Get default performance metric name |
get_tuning_grid | Generate the tuning grid for tuning hyperparameters |
group_correlated_features | Group correlated features |
otu_data_preproc | Mini OTU abundance dataset - preprocessed |
otu_mini_bin | Mini OTU abundance dataset |
otu_mini_bin_results_glmnet | Results from running the pipeline with L2 logistic regression on 'otu_mini_bin' with feature importance and grouping |
otu_mini_bin_results_rf | Results from running the pipeline with random forest on 'otu_mini_bin' |
otu_mini_bin_results_rpart2 | Results from running the pipeline with rpart2 on 'otu_mini_bin' |
otu_mini_bin_results_svmRadial | Results from running the pipeline with svmRadial on 'otu_mini_bin' |
otu_mini_bin_results_xgbTree | Results from running the pipeline with xgbTree on 'otu_mini_bin' |
otu_mini_cont_results_glmnet | Results from running the pipeline with glmnet on 'otu_mini_bin' with 'Otu00001' as the outcome |
otu_mini_cont_results_nocv | Results from running the pipeline with glmnet on 'otu_mini_bin' with 'Otu00001' as the outcome column, using a custom train control scheme that does not perform cross-validation |
otu_mini_cv | Cross-validation on 'train_data_mini' with grouped features |
otu_mini_multi | Mini OTU abundance dataset with 3 categorical variables |
otu_mini_multi_group | Groups for otu_mini_multi |
otu_mini_multi_results_glmnet | Results from running the pipeline with glmnet on 'otu_mini_multi' for multiclass outcomes |
otu_small | Small OTU abundance dataset |
permute_p_value | Calculate a permuted p-value comparing two models |
plot_curves | Plot ROC and PRC curves |
plot_hp_performance | Plot hyperparameter performance metrics |
plot_mean_prc | Plot ROC and PRC curves |
plot_mean_roc | Plot ROC and PRC curves |
plot_model_performance | Plot performance metrics for multiple ML runs with different parameters |
preprocess_data | Preprocess data prior to running machine learning |
randomize_feature_order | Randomize feature order to eliminate any position-dependent effects |
remove_singleton_columns | Remove columns appearing in only 'threshold' row(s) or fewer |
replace_spaces | Replace spaces in all elements of a character vector with underscores |
run_ml | Run the machine learning pipeline |
sensspec | Calculate and summarize performance for ROC and PRC plots |
tidy_perf_data | Tidy the performance dataframe |
train_model | Train model using 'caret::train()' |