predict.cv.grpnet {grpnet}    R Documentation

Predict Method for cv.grpnet Fits

Description

Obtain predictions from a cross-validated group elastic net regularized GLM (cv.grpnet) object.

Usage

## S3 method for class 'cv.grpnet'
predict(object, 
        newx,
        newdata,
        s = c("lambda.min", "lambda.1se"),
        type = c("link", "response", "class", "terms", 
                 "importance", "coefficients", "nonzero", "groups", 
                 "ncoefs", "ngroups", "norm", "znorm"),
        ...)

Arguments

object

Object of class "cv.grpnet"

newx

Matrix of new x scores for prediction (default S3 method). Must have p columns arranged in the same order as the x matrix used to fit the model.

newdata

Data frame of new data scores for prediction (S3 "formula" method). Must contain all variables in the formula used to fit the model.

s

Lambda value(s) at which predictions should be obtained. Can be a character string ("lambda.min" or "lambda.1se") or a numeric vector. The default, "lambda.min", uses the lambda value that minimizes the mean cross-validated error.
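As an illustration of how the two character options for s are commonly defined, the base R sketch below picks both values from a made-up cross-validation curve. The vectors lambda, cvm, and cvsd are hypothetical stand-ins, not output from cv.grpnet. The one-standard-error rule used for "lambda.1se" (the largest lambda whose mean error is within one standard error of the minimum) is the usual glmnet-style convention and is assumed here rather than taken from this page.

```r
# hypothetical cross-validation curve (not produced by cv.grpnet)
lambda <- c(1.0, 0.5, 0.25, 0.125, 0.0625)   # candidate lambda values
cvm    <- c(3.0, 2.2, 2.0, 1.9, 1.95)        # mean cross-validated error
cvsd   <- c(0.20, 0.15, 0.15, 0.15, 0.15)    # standard error of cvm

# "lambda.min": the lambda that minimizes the mean cross-validated error
imin <- which.min(cvm)
lambda.min <- lambda[imin]

# "lambda.1se" (assumed one-standard-error rule): the largest lambda whose
# mean error is no more than one standard error above the minimum
lambda.1se <- max(lambda[cvm <= cvm[imin] + cvsd[imin]])

c(lambda.min = lambda.min, lambda.1se = lambda.1se)
```

Under this rule lambda.1se is never smaller than lambda.min, so it selects a solution that is at least as heavily penalized.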

type

Type of prediction to return. "link" gives predictions on the link scale (\eta). "response" gives predictions on the mean scale (\mu). "class" gives predicted class labels (for "binomial" and "multinomial" families). "terms" gives the predictions for each term (group) in the model (\eta_k). "importance" gives the variable importance index for each term (group) in the model. "coefficients" returns the coefficients used for predictions. "nonzero" returns a list giving the indices of non-zero coefficients for each s. "groups" returns a list giving the labels of non-zero groups for each s. "ncoefs" returns the number of non-zero coefficients for each s. "ngroups" returns the number of non-zero groups for each s. "norm" returns the L2 norm of each group's (raw) coefficients for each s. "znorm" returns the L2 norm of each group's standardized coefficients for each s.
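To make several of these quantities concrete, the base R sketch below computes them by hand for a small made-up binomial model. Everything in it is hypothetical: the matrix x, the intercept beta0, the coefficients beta, and the group labels are invented for illustration. The 0.5 cutoff for class labels and the exclusion of the intercept from the non-zero counts are simplifying assumptions, not documented behavior of grpnet.

```r
# hypothetical inputs: 3 observations, 3 predictors in 2 groups
x <- cbind(x1 = c(-1, 0, 2), x2 = c(0.5, 1, -1), x3 = c(1, 0, 1))
beta0 <- -2             # intercept
beta  <- c(0, 3, 4)     # slopes (the group 1 coefficient is zero)
group <- c(1, 2, 2)     # group label for each column of x

# "link": linear predictor eta
eta <- drop(beta0 + x %*% beta)

# "response": mean scale mu, using the inverse logit link
mu <- plogis(eta)

# "class": assumed rule, label 1 when mu exceeds 0.5
cls <- ifelse(mu > 0.5, 1, 0)

# "terms": contribution of each group to eta (intercept excluded)
eta_terms <- sapply(split(seq_along(beta), group),
                    function(j) drop(x[, j, drop = FALSE] %*% beta[j]))

# "nonzero" and "ncoefs": indices and count of non-zero slopes
nonzero <- which(beta != 0)
ncoefs  <- length(nonzero)

# "norm": L2 norm of each group's coefficients
gnorm <- sqrt(tapply(beta^2, group, sum))

# "ngroups": number of groups with a non-zero norm
ngroups <- sum(gnorm > 0)
```

Adding beta0 to the row sums of eta_terms recovers eta, which is the sense in which the "terms" predictions decompose the "link" predictions by group.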

...

Additional arguments (ignored)

Details

Predictions are calculated from the grpnet object fit to the full sample of data, which is stored as object$grpnet.fit.

See predict.grpnet for further details on the calculation of the different types of predictions.

Value

Depends on three factors:
1. the exponential family distribution
2. the length of the input s
3. the type of prediction requested

See predict.grpnet for details.

Note

Syntax is inspired by the predict.cv.glmnet function in the glmnet package (Friedman, Hastie, & Tibshirani, 2010).

Author(s)

Nathaniel E. Helwig <helwig@umn.edu>

References

Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 33(1), 1-22. doi:10.18637/jss.v033.i01

See Also

cv.grpnet for k-fold cross-validation of lambda

predict.grpnet for predicting from grpnet objects

Examples

######***######   family = "gaussian"   ######***######

# load data
data(auto)

# 10-fold cv (formula method, response = mpg)
set.seed(1)
mod <- cv.grpnet(mpg ~ ., data = auto, alpha = 1)

# get fitted values at "lambda.min"
fit.min <- predict(mod, newdata = auto)

# get fitted values at "lambda.1se"
fit.1se <- predict(mod, newdata = auto, s = "lambda.1se")

# compare rmse for two solutions
sqrt(mean((auto$mpg - fit.min)^2))
sqrt(mean((auto$mpg - fit.1se)^2))




######***######   family = "binomial"   ######***######

# load data
data(auto)

# define response (1 = American, 0 = other)
y <- ifelse(auto$origin == "American", 1, 0)

# define predictors
x <- rk.model.matrix(~ 0 + ., data = auto[,1:7])

# define group
g <- attr(x, "assign")

# 10-fold cv (default method, response = y)
set.seed(1)
mod <- cv.grpnet(x, y, g, family = "binomial", alpha = 1)

# get fitted values at "lambda.min"
fit.min <- predict(mod, newx = x, type = "response")

# get fitted values at "lambda.1se"
fit.1se <- predict(mod, newx = x, type = "response", s = "lambda.1se")

# compare rmse for two solutions
sqrt(mean((y - fit.min)^2))
sqrt(mean((y - fit.1se)^2))

# get predicted classes at "lambda.min"
fit.min <- predict(mod, newx = x, type = "class")

# get predicted classes at "lambda.1se"
fit.1se <- predict(mod, newx = x, type = "class", s = "lambda.1se")

# compare misclassification rate for two solutions
1 - mean(y == fit.min)
1 - mean(y == fit.1se)



######***######   family = "poisson"   ######***######

# load data
data(auto)

# 10-fold cv (formula method, response = horsepower)
set.seed(1)
mod <- cv.grpnet(horsepower ~ ., data = auto, family = "poisson", alpha = 1)

# get fitted values at "lambda.min"
fit.min <- predict(mod, newdata = auto, type = "response")

# get fitted values at "lambda.1se"
fit.1se <- predict(mod, newdata = auto, type = "response", s = "lambda.1se")

# compare rmse for two solutions
sqrt(mean((auto$horsepower - fit.min)^2))
sqrt(mean((auto$horsepower - fit.1se)^2))



######***######   family = "negative.binomial"   ######***######

# load data
data(auto)

# 10-fold cv (formula method, response = horsepower)
set.seed(1)
mod <- cv.grpnet(horsepower ~ ., data = auto, family = "negative.binomial", 
                 alpha = 1, theta = 100)

# get fitted values at "lambda.min"
fit.min <- predict(mod, newdata = auto, type = "response")

# get fitted values at "lambda.1se"
fit.1se <- predict(mod, newdata = auto, type = "response", s = "lambda.1se")

# compare rmse for two solutions
sqrt(mean((auto$horsepower - fit.min)^2))
sqrt(mean((auto$horsepower - fit.1se)^2))



######***######   family = "multinomial"   ######***######

# load data
data(auto)

# 10-fold cv (formula method, response = origin)
set.seed(1)
mod <- cv.grpnet(origin ~ ., data = auto, family = "multinomial", alpha = 1)

# get predicted classes at "lambda.min"
fit.min <- predict(mod, newdata = auto, type = "class")

# get predicted classes at "lambda.1se"
fit.1se <- predict(mod, newdata = auto, type = "class", s = "lambda.1se")

# compare misclassification rate for two solutions
1 - mean(auto$origin == fit.min)
1 - mean(auto$origin == fit.1se)



######***######   family = "Gamma"   ######***######

# load data
data(auto)

# 10-fold cv (formula method, response = mpg)
set.seed(1)
mod <- cv.grpnet(mpg ~ ., data = auto, family = "Gamma", alpha = 1)

# get fitted values at "lambda.min"
fit.min <- predict(mod, newdata = auto, type = "response")

# get fitted values at "lambda.1se"
fit.1se <- predict(mod, newdata = auto, type = "response", s = "lambda.1se")

# compare rmse for two solutions
sqrt(mean((auto$mpg - fit.min)^2))
sqrt(mean((auto$mpg - fit.1se)^2))



######***######   family = "inverse.gaussian"   ######***######

# load data
data(auto)

# 10-fold cv (formula method, response = mpg)
set.seed(1)
mod <- cv.grpnet(mpg ~ ., data = auto, family = "inverse.gaussian", alpha = 1)

# get fitted values at "lambda.min"
fit.min <- predict(mod, newdata = auto, type = "response")

# get fitted values at "lambda.1se"
fit.1se <- predict(mod, newdata = auto, type = "response", s = "lambda.1se")

# compare rmse for two solutions
sqrt(mean((auto$mpg - fit.min)^2))
sqrt(mean((auto$mpg - fit.1se)^2))


[Package grpnet version 0.3 Index]