mlm_init {stressor}    R Documentation

Compare Machine Learning Models

Description

Through the 'PyCaret' module from 'python', this function fits many machine learning models simultaneously without requiring any 'python' programming on the part of the user. This is the core function for fitting the initial models and serves as the backbone for all model fitting.

Usage

mlm_init(
  formula,
  train_data,
  fit_models,
  sort_v = NULL,
  n_models = 9999,
  classification = FALSE,
  seed = NULL,
  ...
)

Arguments

formula

The regression formula or classification formula. This formula should be linear.

train_data

A data.frame object that includes data to be trained on.

fit_models

A character vector naming the machine learning models to fit. The codes below are the regressors that are currently available; the user may specify any subset of them.

ada       AdaBoost Regressor
br        Bayesian Ridge
dt        Decision Tree Regressor
dummy     Dummy Regressor
en        Elastic Net
et        Extra Trees Regressor
gbr       Gradient Boosting Regressor
huber     Huber Regressor
knn       K Neighbors Regressor
lar       Least Angle Regression
lasso     Lasso Regression
lightgbm  Light Gradient Boosting Machine
llar      Lasso Least Angle Regression
lr        Linear Regression
omp       Orthogonal Matching Pursuit
par       Passive Aggressive Regressor
rf        Random Forest Regressor
ridge     Ridge Regression
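
A subset of these codes can be checked before it is passed as fit_models. The following is a minimal sketch in base R; the helper check_models() is hypothetical and not part of 'stressor', and the chosen subset is only an illustration.

```r
# Regressor codes copied from the table above
regressors <- c("ada", "br", "dt", "dummy", "en", "et", "gbr", "huber",
                "knn", "lar", "lasso", "lightgbm", "llar", "lr", "omp",
                "par", "rf", "ridge")

# Hypothetical helper (not part of 'stressor'): stop on unknown codes,
# otherwise return the requested codes unchanged
check_models <- function(requested, available = regressors) {
  unknown <- setdiff(requested, available)
  if (length(unknown) > 0) {
    stop("Unknown model code(s): ", paste(unknown, collapse = ", "))
  }
  requested
}

# An illustrative subset that could be supplied as fit_models
fit_models <- check_models(c("lr", "ridge", "rf"))
```

Checking the codes up front is a design choice of this sketch: a typo is reported in R before any work is handed to 'python'.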

If classification is set to 'TRUE', the models below can be used instead, and the user may again specify a subset of them. These are the default values for classification:

ada       AdaBoost Classifier
dt        Decision Tree Classifier
dummy     Dummy Classifier
et        Extra Trees Classifier
gbc       Gradient Boosting Classifier
knn       K Neighbors Classifier
lda       Linear Discriminant Analysis
lightgbm  Light Gradient Boosting Machine
lr        Logistic Regression
nb        Naive Bayes
qda       Quadratic Discriminant Analysis
rf        Random Forest Classifier
ridge     Ridge Classifier
svm       SVM - Linear Kernel

sort_v

A character vector indicating the metric on which to sort the fitted models. Default value is 'NULL'.

n_models

An integer giving the maximum number of models to return. Default value is 9999.

classification

A logical value indicating whether classification methods should be used. Default value is 'FALSE'.

seed

An integer value to set the seed of the 'python' environment. Default value is 'NULL'.

...

Additional arguments passed to the setup function in 'PyCaret'.

Details

The formula should be linear. However, that does not imply a linear fit. The formula is a convenient way to separate the response variable from the explanatory variables.
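
The role of the formula can be illustrated with base R alone. The sketch below uses a small made-up data frame (the column names and values are assumptions for illustration) and shows how 'Y ~ .' marks one column as the response and the rest as explanatory variables, without saying anything about the shape of the fitted model.

```r
# Small made-up data set; column names and values are illustrative
dat <- data.frame(Y  = c(1.2, 2.3, 3.1, 4.8),
                  x1 = c(0.5, 1.0, 1.5, 2.0),
                  x2 = c(3, 1, 4, 1))

# 'Y ~ .' means: Y is the response, every other column is explanatory
form <- Y ~ .
mf <- model.frame(form, data = dat)

# The response is the first column of the model frame
response <- names(mf)[1]
# The remaining columns are the explanatory variables
explanatory <- names(mf)[-1]

print(response)
print(explanatory)
```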

'PyCaret' is a 'python' module where machine learning models can be fitted with little coding by the user. The pipeline that 'PyCaret' uses has a setup function that prepares the data in a form that all the models can be fitted on. The compare models function is then executed, which fits all the models that are currently available. This process takes less than five minutes for data.frame objects with fewer than 10,000 rows.

Value

A list object that contains all the fitted models and the cross-validation (CV) predictive accuracy, with a class attribute of '"mlm_stressor"'.
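
As a hedged sketch of a direct call, the block below passes a subset of regressor codes to mlm_init() and checks the class of the result. The model subset, the value of n_models, and the seed are illustrative choices. The call is skipped unless 'stressor' is installed, and it still needs a working 'python' environment with 'PyCaret'.

```r
# Illustrative subset of regressor codes from the Arguments section
models <- c("lr", "ridge", "rf")

# Run only when 'stressor' is installed; a 'python' environment with
# 'PyCaret' is also required, which create_virtualenv() is used for
if (requireNamespace("stressor", quietly = TRUE)) {
  library(stressor)
  create_virtualenv()
  lm_test <- data_gen_lm(20)
  # Argument names follow the Usage section; the values are assumptions
  fit <- mlm_init(Y ~ ., lm_test, fit_models = models,
                  n_models = 3, seed = 43421)
  # The Value section states that the result carries this class
  print(inherits(fit, "mlm_stressor"))
}
```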

Examples


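 # Generate a small example data set with 20 observations
 lm_test <- data_gen_lm(20)
 # Create the 'python' virtual environment before fitting
 create_virtualenv()
 # Fit the regression models with Y as the response
 mlm_lm <- mlm_regressor(Y ~ ., lm_test)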


[Package stressor version 0.2.0 Index]