A Machine-Learning Based Tool to Automate the Identification of Biological Database IDs

Documentation for package ‘MantaID’ version 1.0.2

mi	A wrapper function that executes MantaID workflow.
mi_balance_data	Balance the data. Most classes are randomly undersampled, while a few classes are oversampled with the SMOTE method, to obtain a relatively balanced dataset.
mi_clean_data	Reshape data and delete meaningless rows.
mi_data_attributes	ID-related datasets in biomart.
mi_data_procID	Processed ID data.
mi_data_rawID	ID dataset for testing.
mi_get_confusion	Compute the confusion matrix for the prediction results.
mi_get_ID	Get ID data from the 'Biomart' database using 'attributes'.
mi_get_ID_attr	Get ID attributes from the 'Biomart' database.
mi_get_miss	Inspect the distribution of misclassified responses in the test set.
mi_get_padlen	Get the maximum length of the ID data.
mi_plot_cor	Plot correlation heatmap.
mi_plot_heatmap	Plot heatmap for result confusion matrix.
mi_predict_new	Predict new data with a trained learner.
mi_run_bmr	Compare classification models with small samples.
mi_split_col	Split each string in the ID column character by character into multiple columns.
mi_split_str	Split a string into individual characters and pad the character vector to the maximum length.
mi_to_numer	Convert data to numeric; the ID column is converted using fixed levels.
mi_train_BP	Train a three-layer neural network model.
mi_train_rg	Random Forest Model Training.
mi_train_rp	Classification tree model training.
mi_train_xgb	Xgboost model training.
mi_tune_rg	Tune Random Forest model by hyperband.
mi_tune_rp	Tune decision tree model by hyperband.
mi_tune_xgb	Tune Xgboost model by hyperband.
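
The preprocessing idea behind mi_get_padlen, mi_split_str and mi_to_numer (find the maximum ID length, split each ID into characters padded to that length, then map characters to numbers with a fixed set of levels) can be sketched roughly as follows. This is an illustrative Python sketch, not the package's R code: the function names, the pad symbol "*" and the LEVELS alphabet are all assumptions made for the example, and the real functions may differ in signature and behaviour.

```python
import string

# Assumed pad symbol and fixed character levels (hypothetical, not taken from MantaID).
PAD = "*"
LEVELS = [PAD] + list(string.digits + string.ascii_letters + "_.-:")


def pad_length(ids):
    """Return the length of the longest ID (the idea behind mi_get_padlen)."""
    return max(len(i) for i in ids)


def split_str(s, pad_len, pad=PAD):
    """Split a string into characters and pad the list to pad_len."""
    chars = list(s)[:pad_len]
    return chars + [pad] * (pad_len - len(chars))


def to_codes(chars, levels=LEVELS):
    """Map characters to integer codes using a fixed level order.

    The same character always gets the same code, regardless of which
    characters happen to occur in a particular dataset.
    Raises KeyError for a character that is not in levels.
    """
    index = {c: i for i, c in enumerate(levels)}
    return [index[c] for c in chars]


ids = ["ENSG01", "P12345", "Q9"]
n = pad_length(ids)
rows = [split_str(i, n) for i in ids]
codes = [to_codes(r) for r in rows]
```

Using fixed levels rather than levels inferred from the data keeps the encoding of training data and new data consistent, which is presumably why the ID column is converted this way before prediction.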