R: Compare Machine Learning Models

mlm_init {stressor}

R Documentation

Compare Machine Learning Models

Description

Through the 'PyCaret' module from 'python', this function fits many machine learning models simultaneously without requiring any 'python' programming on the part of the user. It is the core function for fitting the initial models and the backbone for fitting all the models.

Usage

mlm_init(
  formula,
  train_data,
  fit_models,
  sort_v = NULL,
  n_models = 9999,
  classification = FALSE,
  seed = NULL,
  ...
)

Arguments

formula

The regression formula or classification formula. This formula should be linear.

train_data

A data.frame object containing the data to train on.

fit_models

A character vector naming the machine learning models to fit. The user may specify any subset of the available models. The regressors currently available are:

ada	AdaBoost Regressor
br	Bayesian Ridge
dt	Decision Tree Regressor
dummy	Dummy Regressor
en	Elastic Net
et	Extra Trees Regressor
gbr	Gradient Boosting Regressor
huber	Huber Regressor
knn	K Neighbors Regressor
lar	Least Angle Regression
lasso	Lasso Regression
lightgbm	Light Gradient Boosting Machine
llar	Lasso Least Angle Regression
lr	Linear Regression
omp	Orthogonal Matching Pursuit
par	Passive Aggressive Regressor
rf	Random Forest Regressor
ridge	Ridge Regression

If classification is set to 'TRUE', the classifiers below are used instead, and the user may again specify a subset of them. These are the default values for classification:

ada	AdaBoost Classifier
dt	Decision Tree Classifier
dummy	Dummy Classifier
et	Extra Trees Classifier
gbc	Gradient Boosting Classifier
knn	K Neighbors Classifier
lda	Linear Discriminant Analysis
lightgbm	Light Gradient Boosting Machine
lr	Logistic Regression
nb	Naive Bayes
qda	Quadratic Discriminant Analysis
rf	Random Forest Classifier
ridge	Ridge Classifier
svm	SVM - Linear Kernel
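
Since the valid codes differ between the regression and classification lists, it can help to check a chosen subset before fitting. The following is a minimal sketch in base R, not part of 'stressor': the helper check_models() is hypothetical, and the code vectors are copied from the two lists above.

```r
# Codes copied from the regressor and classifier lists above.
regressors <- c("ada", "br", "dt", "dummy", "en", "et", "gbr", "huber", "knn",
                "lar", "lasso", "lightgbm", "llar", "lr", "omp", "par", "rf",
                "ridge")
classifiers <- c("ada", "dt", "dummy", "et", "gbc", "knn", "lda", "lightgbm",
                 "lr", "nb", "qda", "rf", "ridge", "svm")

# check_models() is a hypothetical helper, not a 'stressor' function.
# It stops with an error if any requested code is not in the relevant list.
check_models <- function(fit_models, classification = FALSE) {
  valid <- if (classification) classifiers else regressors
  unknown <- setdiff(fit_models, valid)
  if (length(unknown) > 0) {
    stop("Unknown model code(s): ", paste(unknown, collapse = ", "))
  }
  fit_models
}

chosen <- check_models(c("lr", "ridge", "rf"))

# The checked subset could then be passed on. This needs 'stressor' and a
# working 'python' environment, so the call is left commented out:
# mlm_init(Y ~ ., train_data = lm_test, fit_models = chosen)
```

Note that some codes, such as 'gbr' and 'gbc', exist in only one of the two lists, so a subset valid for regression is not necessarily valid for classification.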

sort_v

A character vector indicating the criterion by which to sort the tuned models. Default value is 'NULL'.

n_models

An integer giving the maximum number of models to return. Default value is 9999.

classification

A logical value indicating whether classification methods should be used. Default value is 'FALSE'.

seed

An integer value to set the seed of the python environment. Default value is set to 'NULL'.

...

Additional arguments passed to the setup function in 'PyCaret'.

Details

The formula should be linear. However, that does not imply a linear fit. The formula is a convenient way to separate the response variable from the explanatory variables.
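
To illustrate how a formula separates the response from the explanatory variables, here is a minimal sketch using only base R. The data frame toy and its column names are made up for illustration, and this is not necessarily how 'stressor' processes the formula internally.

```r
# Toy data; the column names are hypothetical.
toy <- data.frame(
  Y  = c(1.2, 2.3, 3.1, 4.8),
  x1 = c(0, 1, 2, 3),
  x2 = c(5, 3, 4, 1)
)

# Expand the dot in `Y ~ .` against the columns of `toy`.
tt <- terms(Y ~ ., data = toy)

# Left-hand side of the formula: the response variable.
response <- deparse(tt[[2]])

# Right-hand side terms: the explanatory variables.
predictors <- attr(tt, "term.labels")

print(response)
print(predictors)
```

The formula only names the variables and their roles. Whether the fitted relationship is linear depends on the model chosen, not on the formula.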

'PyCaret' is a 'python' module in which machine learning models can be fitted with little coding by the user. The 'PyCaret' pipeline first runs a setup function that prepares the data in a form that all the models can be fit on. The compare models function is then executed, which fits all the models that are currently available. This process takes less than five minutes for data.frame objects with fewer than 10,000 rows.

Value

A list object that contains all the fitted models and the CV predictive accuracy, with a class attribute of '"mlm_stressor"'.

Examples


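 # Simulate 20 observations from a linear model
 lm_test <- data_gen_lm(20)
 # Create the 'python' virtual environment used by 'PyCaret'
 create_virtualenv()
 # Fit the machine learning regressors with Y as the response
 mlm_lm <- mlm_regressor(Y ~ ., lm_test)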

[Package stressor version 0.2.0 Index]