R: Prediction for the multivariate longitudinal response

predictBoostMLR {BoostMLR}

R Documentation

Prediction for the multivariate longitudinal response

Description

Returns predicted values for the response. If the response is also provided, the function returns test set performance, the optimal boosting iteration, and variable importance (VIMP).

Usage

predictBoostMLR(Object,
                x,
                tm,
                id,
                y,
                M,
                importance = FALSE,
                eps = 1e-5,
                setting_seed = FALSE,
                seed_value = 100L,
                ...)

Arguments

`Object`	A boosting object obtained using the function `BoostMLR` on the training data.
`x`	Data frame (or matrix) containing the test set x-values (covariates). Covariates can be time-varying or time-invariant. If `x` was unspecified when growing the `Object`, it should be unspecified here as well.
`tm`	Vector of test set time values. If `tm` was unspecified when growing the `Object`, it should be unspecified here as well.
`id`	Vector of test set subject identifiers. If `id` was unspecified when growing the `Object`, it should be unspecified here as well.
`y`	Data frame (or matrix) containing the test set y-values (response) for a multivariate response, or a vector of y-values for a univariate response. If `y` is unspecified, predicted values corresponding to `x` and `tm` are still returned, but performance measures such as test set error and VIMP are not.
`M`	Number of boosting iterations. Must be less than or equal to the value specified in the `Object`. If unspecified, the value from the `Object` is used.
`importance`	Logical; whether to calculate standardized variable importance (VIMP) for each covariate.
`eps`	Tolerance value used for determining the optimal `M`.
`setting_seed`	Set `setting_seed = TRUE` to make the result reproducible.
`seed_value`	Seed value used when `setting_seed = TRUE`.
`...`	Further arguments passed to or from other methods.
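
The role of a tolerance such as `eps` can be illustrated with a small base-R sketch. The rule below (take the earliest iteration whose error is within `eps` of the minimum) is an assumption made for illustration only; the package's actual rule is not stated here. The helper `pick_mopt` and the error values are hypothetical.

```r
# Hypothetical helper, NOT part of BoostMLR: return the earliest iteration
# whose error lies within `eps` of the smallest error on the curve.
pick_mopt <- function(err, eps = 1e-5) {
  min(which(err - min(err) <= eps))
}

# Made-up test set error curve over 5 boosting iterations
err <- c(0.90, 0.50, 0.300004, 0.30, 0.31)

m_tol   <- pick_mopt(err)            # iteration 3 is within 1e-5 of the minimum
m_exact <- pick_mopt(err, eps = 0)   # exact minimum is reached at iteration 4
```

Under this assumed rule, a positive tolerance favors an earlier iteration when further iterations improve the error only negligibly.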

Details

The predicted response and performance values are obtained for the test data using the `Object` grown by `BoostMLR` on the training data.

Value

`Data`	A list with elements `x`, `tm`, `id` and `y`. Additionally, the list includes the mean and standard deviation of `x` and `y`.
`x_Names`	Variable names of `x`.
`y_Names`	Variable names of `y`.
`mu`	Estimate of conditional expectation of `y` corresponding to the last boosting iteration.
`mu_Mopt`	Estimate of conditional expectation of `y` corresponding to the optimal boosting iteration.
`Error_Rate`	Test set error rate for each multivariate response across the boosting iterations.
`Mopt`	The optimal number of boosting iterations.
`nu`	Regularization parameter.
`rmse`	Test set standardized root mean square error (sRMSE) at `Mopt`.
`vimp`	Standardized VIMP for each covariate, returned as a list with one element per response. Each element is a matrix with one row per covariate and as many columns as the number of overlapping time intervals plus 1; the first column contains covariate main effects and the remaining columns contain covariate-time interaction effects.
`Pred_Object`	Useful for internal calculation.
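
The layout described for `vimp` can be pictured with a mock object. The sketch below builds a stand-in list by hand, so the values, the row and column labels, and the number of time intervals are all made up; only the shape (one matrix per response, covariates in rows, main effect in the first column) follows the description above.

```r
# Mock stand-in for `vimp`; NOT produced by predictBoostMLR. Values are made up.
x_names <- c("x1", "x2", "x3", "x4")   # 4 covariates (hypothetical labels)
y_names <- c("y1", "y2", "y3")         # 3 responses (hypothetical labels)
n_intervals <- 2                       # assumed number of overlapping time intervals

set.seed(1)
vimp <- lapply(y_names, function(nm) {
  matrix(round(runif(length(x_names) * (n_intervals + 1)), 2),
         nrow = length(x_names),
         dimnames = list(x_names,
                         c("main", paste0("time_", seq_len(n_intervals)))))
})
names(vimp) <- y_names

# Covariate main effects: first column of each matrix, one column per response
main_effects <- sapply(vimp, function(m) m[, 1])

# Covariate-time interaction effects for the first response
interactions_y1 <- vimp[[1]][, -1, drop = FALSE]
```

With a real prediction object, the same indexing would apply to `boost.pred$vimp`, provided `importance = TRUE` and `y` were supplied.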

Author(s)

Amol Pande and Hemant Ishwaran

References

Pande A., Ishwaran H., Blackstone E.H. (2020). Boosting for multivariate longitudinal response.

Pande A., Li L., Rajeswaran J., Ehrlinger J., Kogalur U.B., Blackstone E.H., Ishwaran H. (2017). Boosted multivariate trees for longitudinal data, Machine Learning, 106(2): 277–305.

Pande A. (2017). Boosting for longitudinal data. Ph.D. Dissertation, Miller School of Medicine, University of Miami.

Examples


##-----------------------------------------------------------------
## Multivariate Longitudinal Response
##-----------------------------------------------------------------

# Simulate data with 3 responses and 4 covariates

dta <- simLong(n = 100, ntest = 100, N = 5, rho = 0.80, model = 1, q_x = 0,
               q_y = 0, type = "corCompSym")
dtaL <- dta$dtaL
trn <- dta$trn
# Boosting call: Raw values of covariates, B-spline for time, 
# no shrinkage, no estimate of rho and phi

boost.grow <- BoostMLR(x = dtaL$features[trn,], tm = dtaL$time[trn], 
                      id = dtaL$id[trn], y = dtaL$y[trn,], M = 100, VarFlag = FALSE)

boost.pred <- predictBoostMLR(Object = boost.grow, x = dtaL$features[-trn,], 
                               tm = dtaL$time[-trn], id = dtaL$id[-trn], 
                               y = dtaL$y[-trn,], importance = TRUE)
# Plot test set error
plotBoostMLR(boost.pred$Error_Rate, xlab = "m", ylab = "Test Set Error",
             legend_fraction_x = 0.2)

[Package BoostMLR version 1.0.3 Index]