| LMTrainer {superml} | R Documentation |
Linear Models Trainer
Description
Trains regression, lasso, and ridge models in R
Details
Trains linear models such as logistic, lasso, or ridge regression. It is built on the glmnet R package. This class provides fit, predict, and cross-validation functions.
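Because the class is built on glmnet, it can help to see what a direct glmnet workflow looks like. The sketch below is an assumption about the kind of calls a wrapper like this might delegate to, not a copy of LMTrainer's internals. The built-in mtcars data and all variable names are chosen only for illustration, and the code is skipped when glmnet is not installed.

```r
# Illustrative only: a direct glmnet workflow on the built-in mtcars data.
# This is NOT LMTrainer's source code; names here are hypothetical.
if (requireNamespace("glmnet", quietly = TRUE)) {
  x <- as.matrix(mtcars[, setdiff(names(mtcars), "mpg")])  # feature matrix
  y <- mtcars$mpg                                          # numeric target

  # Lasso path (alpha = 1) over up to 100 lambda values
  lasso_fit <- glmnet::glmnet(x, y, family = "gaussian", alpha = 1, nlambda = 100)

  # 5-fold cross-validation to choose lambda
  set.seed(42)
  cv_fit <- glmnet::cv.glmnet(x, y, family = "gaussian", alpha = 1,
                              nfolds = 5, type.measure = "deviance")

  # Predictions and coefficients at the lambda with the lowest CV error
  preds <- as.numeric(predict(cv_fit, newx = x, s = "lambda.min"))
  coefs <- as.matrix(coef(cv_fit, s = "lambda.min"))
}
```

The fit, cross-validation, prediction, and coefficient steps here correspond loosely to the fit(), cv_model(), cv_predict(), and get_importance() methods documented below.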
Public fields
family: type of regression to perform; values can be "gaussian", "binomial", "multinomial", "mgaussian"
weights: observation weights. Can be total counts if responses are proportion matrices. Default is 1 for each observation
alpha: the elasticnet mixing parameter; alpha=1 is the lasso penalty, alpha=0 the ridge penalty, alpha=NULL is simple regression
lambda: the number of lambda values; default is 100
standardize: normalise the features in the given data
standardize.response: normalise the dependent variable between 0 and 1, default = FALSE
model: internal use
cvmodel: internal use
Flag: internal use
is_lasso: internal use
iid_names: internal use
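To make the role of alpha concrete, the following sketch evaluates the elastic-net penalty as it is defined in the glmnet documentation. The helper `elastic_net_penalty` and the example coefficient vector are hypothetical and exist only for illustration; they are not part of superml or glmnet.

```r
# Elastic-net penalty as defined in the glmnet documentation:
#   lambda * ((1 - alpha) / 2 * sum(beta^2) + alpha * sum(abs(beta)))
# `elastic_net_penalty` is a hypothetical helper, not part of superml or glmnet.
elastic_net_penalty <- function(beta, alpha, lambda = 1) {
  stopifnot(alpha >= 0, alpha <= 1)
  lambda * ((1 - alpha) / 2 * sum(beta^2) + alpha * sum(abs(beta)))
}

beta <- c(2, -1, 0)                                  # made-up coefficients
lasso_pen <- elastic_net_penalty(beta, alpha = 1)    # L1 norm only
ridge_pen <- elastic_net_penalty(beta, alpha = 0)    # half the squared L2 norm
mixed_pen <- elastic_net_penalty(beta, alpha = 0.5)  # equal blend of both
```

With alpha=1 only the absolute values of the coefficients are penalised, which is what lets the lasso shrink some coefficients exactly to zero; with alpha=0 only their squares are penalised.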
Methods
Public methods
Method new()
Usage
LMTrainer$new(family, weights, alpha, lambda, standardize.response)
Arguments
family: character, type of regression to perform; values can be "gaussian", "binomial", "multinomial", "mgaussian"
weights: numeric, observation weights. Can be total counts if responses are proportion matrices. Default is 1 for each observation
alpha: integer, the elasticnet mixing parameter; alpha=1 is the lasso penalty, alpha=0 the ridge penalty, alpha=NULL is simple regression
lambda: integer, the number of lambda values; default is 100
standardize.response: logical, normalise the dependent variable between 0 and 1, default = FALSE
Details
Create a new 'LMTrainer' object.
Returns
A 'LMTrainer' object.
Examples
\dontrun{
LINK <- "http://archive.ics.uci.edu/ml/machine-learning-databases/housing/housing.data"
housing <- read.table(LINK)
names <- c("CRIM","ZN","INDUS","CHAS","NOX","RM","AGE","DIS",
"RAD","TAX","PTRATIO","B","LSTAT","MEDV")
names(housing) <- names
lf <- LMTrainer$new(family = 'gaussian', alpha=1)
}
Method fit()
Usage
LMTrainer$fit(X, y)
Arguments
X: data.frame containing training features
y: character, name of target variable
Details
Fits the LMTrainer model on given data
Returns
NULL; trains the model and saves it internally
Examples
\dontrun{
LINK <- "http://archive.ics.uci.edu/ml/machine-learning-databases/housing/housing.data"
housing <- read.table(LINK)
names <- c("CRIM","ZN","INDUS","CHAS","NOX","RM","AGE","DIS",
"RAD","TAX","PTRATIO","B","LSTAT","MEDV")
names(housing) <- names
lf <- LMTrainer$new(family = 'gaussian', alpha=1)
lf$fit(X = housing, y = 'MEDV')
}
Method predict()
Usage
LMTrainer$predict(df, lambda = NULL)
Arguments
df: data.frame containing test features
lambda: integer, the number of lambda values - default is 100. By default it picks the best value from the model.
Details
Returns predictions for test data
Returns
vector, a vector containing predictions
Examples
\dontrun{
LINK <- "http://archive.ics.uci.edu/ml/machine-learning-databases/housing/housing.data"
housing <- read.table(LINK)
names <- c("CRIM","ZN","INDUS","CHAS","NOX","RM","AGE","DIS",
"RAD","TAX","PTRATIO","B","LSTAT","MEDV")
names(housing) <- names
lf <- LMTrainer$new(family = 'gaussian', alpha=1)
lf$fit(X = housing, y = 'MEDV')
predictions <- lf$predict(df = housing)
}
Method cv_model()
Usage
LMTrainer$cv_model(X, y, nfolds, parallel, type.measure = "deviance")
Arguments
X: data.frame containing training features
y: character, name of target variable
nfolds: integer, number of folds
parallel: logical, whether to use parallel computation. Default = FALSE
type.measure: character, evaluation metric type. Default = "deviance"
Details
Train regression model using cross validation
Returns
NULL, trains the model and saves it in memory
Examples
\dontrun{
LINK <- "http://archive.ics.uci.edu/ml/machine-learning-databases/housing/housing.data"
housing <- read.table(LINK)
names <- c("CRIM","ZN","INDUS","CHAS","NOX","RM","AGE","DIS",
"RAD","TAX","PTRATIO","B","LSTAT","MEDV")
names(housing) <- names
lf <- LMTrainer$new(family = 'gaussian', alpha=1)
lf$cv_model(X = housing, y = 'MEDV', nfolds = 5, parallel = FALSE)
}
Method cv_predict()
Usage
LMTrainer$cv_predict(df, lambda = NULL)
Arguments
df: data.frame containing test features
lambda: integer, the number of lambda values - default is 100. By default it picks the best value from the model.
Details
Get predictions from the cross validated regression model
Returns
vector, a vector containing predicted values
Examples
\dontrun{
LINK <- "http://archive.ics.uci.edu/ml/machine-learning-databases/housing/housing.data"
housing <- read.table(LINK)
names <- c("CRIM","ZN","INDUS","CHAS","NOX","RM","AGE","DIS",
"RAD","TAX","PTRATIO","B","LSTAT","MEDV")
names(housing) <- names
lf <- LMTrainer$new(family = 'gaussian', alpha=1)
lf$cv_model(X = housing, y = 'MEDV', nfolds = 5, parallel = FALSE)
predictions <- lf$cv_predict(df = housing)
}
Method get_importance()
Usage
LMTrainer$get_importance()
Details
Get feature importance using model coefficients
Returns
a matrix containing feature coefficients
Examples
\dontrun{
LINK <- "http://archive.ics.uci.edu/ml/machine-learning-databases/housing/housing.data"
housing <- read.table(LINK)
names <- c("CRIM","ZN","INDUS","CHAS","NOX","RM","AGE","DIS",
"RAD","TAX","PTRATIO","B","LSTAT","MEDV")
names(housing) <- names
lf <- LMTrainer$new(family = 'gaussian', alpha=1)
lf$cv_model(X = housing, y = 'MEDV', nfolds = 5, parallel = FALSE)
predictions <- lf$cv_predict(df = housing)
coefs <- lf$get_importance()
}
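Once a coefficient matrix is available, one common way to read it as feature importance is to rank the non-zero coefficients by absolute value. The sketch below uses a made-up one-column matrix with row names; that this matches the exact shape returned by get_importance() is an assumption, and the numbers are hypothetical rather than results from the housing data.

```r
# Hypothetical one-column coefficient matrix with feature names as row names.
# The values are invented for illustration, not fitted results.
coefs <- matrix(c(34.5, -0.02, 0, 3.8, -0.52),
                dimnames = list(c("(Intercept)", "CRIM", "ZN", "RM", "LSTAT"), "s0"))

# Drop the intercept and any coefficients shrunk to zero, then rank by magnitude
keep <- rownames(coefs) != "(Intercept)" & coefs[, 1] != 0
kept <- coefs[keep, , drop = FALSE]
ranked <- kept[order(abs(kept[, 1]), decreasing = TRUE), , drop = FALSE]
```

Coefficient magnitudes are only comparable across features when the features are on a common scale, so a ranking like this is most meaningful for standardized inputs.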
Method clone()
The objects of this class are cloneable with this method.
Usage
LMTrainer$clone(deep = FALSE)
Arguments
deep: Whether to make a deep clone.