bootstrap_performance |
Calculate a bootstrap confidence interval for the performance on a single train/test split |
calc_balanced_precision |
Calculate balanced precision given actual and baseline precision |
calc_baseline_precision |
Calculate the fraction of positives, i.e. baseline precision for a PRC curve |
calc_mean_perf |
Generic function to calculate mean performance curves for multiple models |
calc_mean_prc |
Calculate and summarize performance for ROC and PRC plots |
calc_mean_roc |
Calculate and summarize performance for ROC and PRC plots |
calc_model_sensspec |
Calculate and summarize performance for ROC and PRC plots |
calc_perf_metrics |
Get performance metrics for test data |
combine_hp_performance |
Combine hyperparameter performance metrics for multiple train/test splits |
compare_models |
Perform permutation tests to compare the performance metric across all pairs of a group variable |
define_cv |
Define cross-validation scheme and training parameters |
get_caret_processed_df |
Get preprocessed dataframe for continuous variables |
get_feature_importance |
Get feature importance using the permutation method |
get_hp_performance |
Get hyperparameter performance metrics |
get_hyperparams_list |
Set hyperparameters based on ML method and dataset characteristics |
get_outcome_type |
Get outcome type. |
get_partition_indices |
Select indices to partition the data into training & testing sets. |
get_performance_tbl |
Get model performance metrics as a one-row tibble |
get_perf_metric_fn |
Get default performance metric function |
get_perf_metric_name |
Get default performance metric name |
get_tuning_grid |
Generate the tuning grid for tuning hyperparameters |
group_correlated_features |
Group correlated features |
otu_data_preproc |
Mini OTU abundance dataset - preprocessed |
otu_mini_bin |
Mini OTU abundance dataset |
otu_mini_bin_results_glmnet |
Results from running the pipeline with L2 logistic regression on 'otu_mini_bin' with feature importance and grouping |
otu_mini_bin_results_rf |
Results from running the pipeline with random forest on 'otu_mini_bin' |
otu_mini_bin_results_rpart2 |
Results from running the pipeline with rpart2 on 'otu_mini_bin' |
otu_mini_bin_results_svmRadial |
Results from running the pipeline with svmRadial on 'otu_mini_bin' |
otu_mini_bin_results_xgbTree |
Results from running the pipeline with xgbTree on 'otu_mini_bin' |
otu_mini_cont_results_glmnet |
Results from running the pipeline with glmnet on 'otu_mini_bin' with 'Otu00001' as the outcome |
otu_mini_cont_results_nocv |
Results from running the pipeline with glmnet on 'otu_mini_bin' with 'Otu00001' as the outcome column, using a custom train control scheme that does not perform cross-validation |
otu_mini_cv |
Cross-validation on 'train_data_mini' with grouped features |
otu_mini_multi |
Mini OTU abundance dataset with 3 categorical variables |
otu_mini_multi_group |
Groups for otu_mini_multi |
otu_mini_multi_results_glmnet |
Results from running the pipeline with glmnet on 'otu_mini_multi' for multiclass outcomes |
otu_small |
Small OTU abundance dataset |
permute_p_value |
Calculate a permuted p-value comparing two models |
plot_curves |
Plot ROC and PRC curves |
plot_hp_performance |
Plot hyperparameter performance metrics |
plot_mean_prc |
Plot ROC and PRC curves |
plot_mean_roc |
Plot ROC and PRC curves |
plot_model_performance |
Plot performance metrics for multiple ML runs with different parameters |
preprocess_data |
Preprocess data prior to running machine learning |
randomize_feature_order |
Randomize feature order to eliminate any position-dependent effects |
remove_singleton_columns |
Remove columns that appear in 'threshold' or fewer rows |
replace_spaces |
Replace spaces in all elements of a character vector with underscores |
run_ml |
Run the machine learning pipeline |
sensspec |
Calculate and summarize performance for ROC and PRC plots |
tidy_perf_data |
Tidy the performance dataframe |
train_model |
Train model using 'caret::train()'. |