predictBoostMLR {BoostMLR}        R Documentation

Prediction for the multivariate longitudinal response

Description

Returns predicted values for the response. If the response is also provided, the function returns the test set performance, the optimal boosting iteration, and variable importance (VIMP).

Usage

predictBoostMLR(Object,
                x,
                tm,
                id,
                y,
                M,
                importance = FALSE,
                eps = 1e-5,
                setting_seed = FALSE,
                seed_value = 100L,
                ...)

Arguments

Object

A boosting object obtained using the function BoostMLR on the training data.

x

Data frame (or matrix) containing the test set x-values (covariates). Covariates can be time-varying or time-invariant. If x was unspecified when growing the Object, it should be left unspecified here as well.

tm

Vector of test set time values. If tm was unspecified when growing the Object, it should be left unspecified here as well.

id

Vector of test set subject identifiers. If id was unspecified when growing the Object, it should be left unspecified here as well.

y

Data frame (or matrix) containing the test set y-values (response) for a multivariate response, or a vector of y-values for a univariate response. If y is unspecified, predicted values corresponding to x and tm can still be obtained, but performance measures such as test set error and VIMP are not returned.

M

Number of boosting iterations. The value should be less than or equal to the value specified in the Object. If unspecified, the value from the Object is used.

importance

Logical. Whether to calculate standardized variable importance (VIMP) for each covariate. Default is FALSE.

eps

Tolerance value used for determining the optimal M.

setting_seed

Set setting_seed = TRUE if you intend to reproduce the result.

seed_value

Seed value used when setting_seed = TRUE.

...

Further arguments passed to or from other methods.

Details

The predicted response and performance values for the test data are obtained from the Object grown with the function BoostMLR on the training data.

Value

Data

A list with elements x, tm, id and y. Additionally, the list includes the mean and standard deviation of x and y.

x_Names

Variable names of x.

y_Names

Variable names of y.

mu

Estimate of conditional expectation of y corresponding to the last boosting iteration.

mu_Mopt

Estimate of conditional expectation of y corresponding to the optimal boosting iteration.

Error_Rate

Test set error rate for each response across the boosting iterations.

Mopt

The optimal number of boosting iterations.

nu

Regularization parameter.

rmse

Test set standardized root mean square error (sRMSE) at Mopt.

vimp

Standardized VIMP for each covariate. This is a list whose length equals the number of responses. Each element of the list is a matrix with one row per covariate and a number of columns equal to the number of overlapping time intervals + 1, where the first column contains covariate main effects and all other columns contain covariate-time interaction effects.

Pred_Object

Object used for internal calculations.

Author(s)

Amol Pande and Hemant Ishwaran

References

Pande A., Ishwaran H., Blackstone E.H. (2020). Boosting for multivariate longitudinal response.

Pande A., Li L., Rajeswaran J., Ehrlinger J., Kogalur U.B., Blackstone E.H., Ishwaran H. (2017). Boosted multivariate trees for longitudinal data, Machine Learning, 106(2): 277–305.

Pande A. (2017). Boosting for longitudinal data. Ph.D. Dissertation, Miller School of Medicine, University of Miami.

See Also

BoostMLR, updateBoostMLR, simLong

Examples


##-----------------------------------------------------------------
## Multivariate Longitudinal Response
##-----------------------------------------------------------------

# Simulated data involving 3 responses and 4 covariates

dta <- simLong(n = 100, ntest = 100, N = 5, rho = 0.80, model = 1, q_x = 0,
               q_y = 0, type = "corCompSym")
dtaL <- dta$dtaL
trn <- dta$trn
# Boosting call: Raw values of covariates, B-spline for time, 
# no shrinkage, no estimate of rho and phi

boost.grow <- BoostMLR(x = dtaL$features[trn,], tm = dtaL$time[trn], 
                      id = dtaL$id[trn], y = dtaL$y[trn,], M = 100, VarFlag = FALSE)

boost.pred <- predictBoostMLR(Object = boost.grow, x = dtaL$features[-trn,], 
                               tm = dtaL$time[-trn], id = dtaL$id[-trn], 
                               y = dtaL$y[-trn,], importance = TRUE)
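
# A minimal sketch (not part of the package's own examples) of how the
# elements listed under Value might be inspected. It assumes the calls
# above have run and that y was supplied, so that the performance
# measures are available.
boost.pred$Mopt    # optimal number of boosting iterations
boost.pred$rmse    # test set sRMSE at Mopt

# vimp is documented as a list with one matrix per response: rows are
# covariates, the first column holds main effects, and the remaining
# columns hold covariate-time interaction effects.
length(boost.pred$vimp)
boost.pred$vimp[[1]]
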
# Plot test set error
plotBoostMLR(boost.pred$Error_Rate, xlab = "m", ylab = "Test Set Error",
             legend_fraction_x = 0.2)
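
# A hedged sketch of prediction when the test set response is unavailable.
# Per the description of the y argument, predicted values can still be
# obtained, but no test set error or VIMP. The name boost.pred.noy is
# hypothetical, and M = 50 is an arbitrary illustrative value chosen only
# so that it does not exceed the M = 100 used to grow boost.grow.
boost.pred.noy <- predictBoostMLR(Object = boost.grow,
                                  x = dtaL$features[-trn, ],
                                  tm = dtaL$time[-trn],
                                  id = dtaL$id[-trn],
                                  M = 50)

# Estimate of the conditional expectation of y at the last iteration used
str(boost.pred.noy$mu, max.level = 1)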


[Package BoostMLR version 1.0.3 Index]