methods {gamboostLSS}R Documentation

Methods for mboostLSS

Description

Methods for GAMLSS models fitted by boosting algorithms.

Usage

### print model
## S3 method for class 'mboostLSS'
print(x, ...)

### summarize model
## S3 method for class 'mboostLSS'
summary(object, ...)

### extract coefficients
## S3 method for class 'glmboostLSS'
coef(object, which = NULL,
     aggregate = c("sum", "cumsum", "none"),
     off2int = FALSE, parameter = names(object), ...)
## S3 method for class 'mboostLSS'
coef(object, which = NULL,
     aggregate = c("sum", "cumsum", "none"),
     parameter = names(object), ...)

### plot partial effects
## S3 method for class 'glmboostLSS'
plot(x, main = names(x), parameter = names(x),
     off2int = FALSE, ...)
## S3 method for class 'gamboostLSS'
plot(x, main = names(x), parameter = names(x), ...)

### extract and plot marginal prediction intervals
predint(x, which, pi = 0.9, newdata = NULL, ...)
PI(x, which, pi = 0.9, newdata = NULL, ...)
## S3 method for class 'predint'
plot(x, main = "Marginal Prediction Interval(s)",
     xlab = NULL, ylab = NULL, lty = c("solid", "dashed"),
     lcol = c("black", "black"), log = "", ...)

### extract mstop
## S3 method for class 'mboostLSS'
mstop(object, parameter = names(object), ...)
## S3 method for class 'oobag'
mstop(object, parameter = names(object), ...)
## S3 method for class 'cvriskLSS'
mstop(object, parameter = NULL, ...)

### set mstop
## S3 method for class 'mboostLSS'
x[i, return = TRUE, ...]

### extract risk
## S3 method for class 'mboostLSS'
risk(object, merge = FALSE, parameter = names(object), ...)

### extract selected base-learners
## S3 method for class 'mboostLSS'
selected(object, merge = FALSE, parameter = names(object), ...)

### extract fitted values
## S3 method for class 'mboostLSS'
fitted(object, parameter = names(object), ...)

### make predictions
## S3 method for class 'mboostLSS'
predict(object, newdata = NULL,
        type = c("link", "response", "class"), which = NULL,
        aggregate = c("sum", "cumsum", "none"),
        parameter = names(object), ...)

### update weights of the fitted model
## S3 method for class 'mboostLSS'
update(object, weights, oobweights = NULL,
       risk = NULL, trace = NULL, mstop = NULL, ...)

### extract model weights
## S3 method for class 'mboostLSS'
model.weights(x, ...)

Arguments

x, object

an object of the appropriate class (see usage).

which

a subset of base-learners to take into account when computing predictions or coefficients. If which is given (as an integer vector or characters corresponding to base-learners), a list or matrix is returned. In plot_PI the argument which must be specified and it must be given as a character string containing the name of the variable.

aggregate

a character specifying how to aggregate predictions or coefficients of single base-learners. The default returns the prediction or coefficient for the final number of boosting iterations. "cumsum" returns a matrix with the predictions for all iterations simultaneously (in columns). "none" returns a list with matrices where the jth columns of the respective matrix contains the predictions of the base-learner of the jth boosting iteration (and zero if the base-learner is not selected in this iteration).

parameter

This can be either a vector of indices or a vector of parameter names which should be processed. See expamles for details. Per default all distribution parameters of the GAMLSS family are returned.

off2int

logical indicating whether the offset should be added to the intercept (if there is any) or if the offset is neglected for plotting (default).

merge

logical. Should the risk vectors of the single components be merged to one risk vector for the model in total? Per default (merge = FALSE) a (named) list of risk vectors is returned.

i

integer. Index specifying the model to extract. If i is smaller than the initial mstop, a subset is used. If i is larger than the initial mstop, additional boosting steps are performed until step i is reached. One can specify a scalar, a (possibly named) vector or a (possibly named) list with separate values for each component. See the details section of mboostLSS for more information.

return

a logical indicating whether the changed object is returned.

main

a title for the plots.

xlab, ylab

x- and y axis labels for the plots.

pi

the level(s) of the prediction interval(s); Per default a 90% prediction interval is used.

lty

(vector) of line types to be used for plotting the prediction intervals. The vector should contain length(pi) + 1 elements. If less elements are specified, the last element is recycled. The first value lty[1] is used for the marginal median, the second value lty[2] is used for the pi[1] prediction interval, etc.

lcol

(vector) of (line) colors to be used for plotting the prediction intervals. The vector should contain length(pi) + 1 elements. If less elements are specified, the last element is recycled. The first value lcol[1] is used for the marginal median, the second value lcol[2] is used for the pi[1] prediction interval, etc.

log

a character string which determines if and if so which axis should be logarithmic. See plot.default for details.

newdata

optional; A data frame in which to look for variables with which to predict or with which to plot the marginal prediction intervals.

type

the type of prediction required. The default is on the scale of the predictors; the alternative "response" is on the scale of the response variable. Thus for a binomial model the default predictions are on the log-odds scale (probabilities on logit scale) and type = "response" gives the predicted probabilities. The "class" option returns predicted classes.

weights

a numeric vector of weights for the model

oobweights

an additional vector of out-of-bag weights (used internally by cvrisk. For details see there.).

risk

a character indicating how the empirical risk should be computed for each boosting iteration. Per default risk is set to the risk type specified for model fitting via boost_control. For details and alternatives see there.

trace

a logical triggering printout of status information during the fitting process.

mstop

number of boosting iterations.

...

Further arguments to the functions.

Details

These functions can be used to extract details from fitted models. For a tutorial with worked examples see Hofner et al. (2016).

print shows a dense representation of the model fit.

The function coef extracts the regression coefficients of linear predictors fitted using the glmboostLSS function or additive predictors fitted using gamboostLSS. Per default, only coefficients of selected base-learners are returned for all distribution parameters. However, any desired coefficient can be extracted using the which argument. Furhtermore, one can extract only coefficients for a single distribution parameter via the parameter argument (see examples for details).

Analogical, the function plot per default displays the coefficient paths for the complete GAMLSS but can be restricted to single distribution parameters or covariates (or subsets) using the parameter or which arguments, respectively.

The function predint (or PI which is just an alias) computes marginal prediction intervals and returns a data frame with the predictors used for the marginal prediction interval, the computed median prediction and the marginal prediction intervals. A plot function (plot.predint) for the resulting object exists. Note that marginal predictions from AFT models (i.e., families LogLogLSS, LogNormalLSS, and WeibullLSS) represent the predicted “true” survival time and not the observed survival time which is possible subject to censoring. Hence, comparing observed survival times with the marginal prediction interval is only sensible for uncensored observations.

The predict function can be used for predictions for the distribution parameters depending on new observations whereas fitted extracts the regression fits for the observations in the learning sample. For predict, newdata can be specified – otherwise the fitted values are returned. If which is specified, marginal effects of the corresponding base-learner(s) are returned. The argument type can be used to make predictions on the scale of the link (i.e., the linear predictor X * beta), the response (i.e. h(X * beta), where h is the response function) or the class (in case of classification).

The function update updates models fit with gamboostLSS and is primarily used within cvrisk. It updates the weights and refits the model to the altered data. Furthermore, the type of risk, the trace and the number of boosting iterations mstop can be modified.

The function model.weights is a generic version of the same function provided by package stats, which is required to make model.weights work with mboostLSS models.

Warning

The [.mboostLSS function changes the original object, i.e., LSSmodel[10] changes LSSmodel directly!

References

B. Hofner, A. Mayr, M. Schmid (2016). gamboostLSS: An R Package for Model Building and Variable Selection in the GAMLSS Framework. Journal of Statistical Software, 74(1), 1-31.

Available as vignette("gamboostLSS_Tutorial").

Mayr, A., Fenske, N., Hofner, B., Kneib, T. and Schmid, M. (2012): Generalized additive models for location, scale and shape for high-dimensional data - a flexible approach based on boosting. Journal of the Royal Statistical Society, Series C (Applied Statistics) 61(3): 403-427.

Buehlmann, P. and Hothorn, T. (2007), Boosting algorithms: regularization, prediction and model fitting. Statistical Science, 22(4), 477–505.

Rigby, R. A. and D. M. Stasinopoulos (2005). Generalized additive models for location, scale and shape (with discussion). Journal of the Royal Statistical Society, Series C (Applied Statistics), 54, 507-554.

See Also

glmboostLSS, gamboostLSS and blackboostLSS for fitting of GAMLSS.

Available distributions (families) are documented here: Families.

See methods in the mboost package for the corresponding methods for mboost objects.

Examples


### generate data
set.seed(1907)
x1 <- rnorm(1000)
x2 <- rnorm(1000)
x3 <- rnorm(1000)
x4 <- rnorm(1000)
x5 <- rnorm(1000)
x6 <- rnorm(1000)
mu    <- exp(1.5 + x1^2 +0.5 * x2 - 3 * sin(x3) -1 * x4)
sigma <- exp(-0.2 * x4 +0.2 * x5 +0.4 * x6)
y <- numeric(1000)
for( i in 1:1000)
    y[i] <- rnbinom(1, size = sigma[i], mu = mu[i])
dat <- data.frame(x1, x2, x3, x4, x5, x6, y)

### fit a model
model <- gamboostLSS(y ~ ., families = NBinomialLSS(), data = dat,
                     control = boost_control(mstop = 100))

### Do not test the following line per default on CRAN as it takes some time to run:
### use a model with more iterations for a better fit
mstop(model) <- 400

### extract coefficients
coef(model)

### only for distribution parameter mu
coef(model, parameter = "mu")

### only for covariate x1
coef(model, which = "x1")


### plot complete model
par(mfrow = c(4, 3))
plot(model)
### plot first parameter only
par(mfrow = c(2, 3))
plot(model, parameter = "mu")
### now plot only effect of x3 of both parameters
par(mfrow = c(1, 2))
plot(model, which = "x3")
### first component second parameter (sigma)
par(mfrow = c(1, 1))
plot(model, which = 1, parameter = 2)

### Do not test the following code per default on CRAN as it takes some time to run:
### plot marginal prediction interval
pi <- predint(model, pi = 0.9, which = "x1")
pi <- predint(model, pi = c(0.8, 0.9), which = "x1")
plot(pi, log = "y")  # warning as some y values are below 0
## here it would be better to plot x1 against
## sqrt(y) and sqrt(pi)

### set model to mstop = 300 (one-dimensional)
mstop(model) <- 300
### END (don't test automatically)


par(mfrow = c(2, 2))
plot(risk(model, parameter = "mu")[[1]])
plot(risk(model, parameter = "sigma")[[1]])

### Do not test the following code per default on CRAN as it takes some time to run:
### get back to orignal fit
mstop(model) <- 400
plot(risk(model, parameter = "mu")[[1]])
plot(risk(model, parameter = "sigma")[[1]])

### use different mstop values for the components
mstop(model) <- c(100, 200)
## same as
  mstop(model) <- c(mu = 100, sigma = 200)
## or
  mstop(model) <- list(mu = 100, sigma = 200)
## or
  mstop(model) <- list(100, 200)

plot(risk(model, parameter = "mu")[[1]])
plot(risk(model, parameter = "sigma")[[1]])
### END (don't test automatically)


[Package gamboostLSS version 2.0-7 Index]