asmbPLS.cv {asmbPLS}    R Documentation

Cross-validation for asmbPLS to find the best combinations of quantiles for prediction

Description

Finds the best combinations of quantiles for prediction via cross-validation. It should usually be run before asmbPLS.fit, which takes the selected quantile combinations as input.

Usage

asmbPLS.cv(
  X.matrix,
  Y.matrix,
  PLS.comp,
  X.dim,
  quantile.comb.table,
  Y.indicator,
  k = 5,
  ncv = 5,
  only.observe = TRUE,
  expected.measure.decrease = 0.05,
  center = TRUE,
  scale = TRUE,
  maxiter = 100
)

Arguments

X.matrix

Predictors matrix. Samples in rows, variables in columns.

Y.matrix

Outcome matrix. Samples in rows; the matrix has a single column holding a continuous variable. The outcome can be imputed survival time or another type of continuous outcome. For right-censored survival data (survival time plus event indicator), the right-censored times can be imputed with meanimp.

PLS.comp

Number of PLS components in asmbPLS.

X.dim

A vector containing the number of predictors in each block (ordered).
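As an illustration of how X.dim relates to X.matrix, the sketch below builds a toy two-block predictor matrix. The data and the names block1, block2, and block.cols are hypothetical; the sketch also assumes that blocks sit side by side in X.matrix in the same order as X.dim lists them.

```r
# Hypothetical toy data: 6 samples, two blocks with 3 and 2 predictors.
set.seed(1)
block1 <- matrix(rnorm(6 * 3), nrow = 6)
block2 <- matrix(rnorm(6 * 2), nrow = 6)

# Assumption: blocks are bound column-wise in the order given by X.dim.
X.matrix <- cbind(block1, block2)
X.dim <- c(ncol(block1), ncol(block2))

# Sanity check: the block sizes must account for every column of X.matrix.
stopifnot(sum(X.dim) == ncol(X.matrix))

# Column indices belonging to each block, derived from X.dim.
block.end <- cumsum(X.dim)
block.start <- block.end - X.dim + 1
block.cols <- Map(seq, block.start, block.end)
```

A check such as sum(X.dim) == ncol(X.matrix) is a cheap way to catch a mis-specified block layout before running CV.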

quantile.comb.table

A matrix of user-defined quantile combinations to evaluate in CV. Its number of columns equals the number of blocks.
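One possible way to build such a table is to enumerate every pairing of per-block candidate quantiles with base R's expand.grid. The candidate values and the names block1.quantiles and block2.quantiles are hypothetical, and the sketch assumes each row is read as one combination.

```r
# Hypothetical candidate quantiles for a two-block problem.
block1.quantiles <- c(0.9, 0.95, 0.99)
block2.quantiles <- c(0.8, 0.9)

# expand.grid enumerates every pairing; as.matrix converts the resulting
# data frame into a matrix with one column per block.
quantile.comb.table <- as.matrix(expand.grid(block1 = block1.quantiles,
                                             block2 = block2.quantiles))

# 3 x 2 = 6 candidate combinations, 2 columns (one per block).
dim(quantile.comb.table)
```

A full grid grows multiplicatively with the number of blocks, so a hand-picked subset of rows may be preferable when CV time matters.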

Y.indicator

A vector containing the event indicator for each sample; its length equals the number of samples. It is used to keep the ratio of observed to unobserved samples the same in the training set and the validation set. Observed = 1 and unobserved = 0. If the outcome is not a survival outcome, use a vector with all elements equal to 1.
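The idea of keeping the observed/unobserved ratio equal across folds can be sketched as a stratified fold assignment. This is not taken from the package internals; it is a minimal illustration with hypothetical data that assigns fold labels separately within each indicator group.

```r
set.seed(123)

# Hypothetical indicator: 12 observed (1) and 8 unobserved (0) samples.
Y.indicator <- c(rep(1, 12), rep(0, 8))
k <- 4

# Assign fold labels within each group so that every fold receives the
# same share of observed and unobserved samples.
fold <- integer(length(Y.indicator))
for (status in unique(Y.indicator)) {
  idx <- which(Y.indicator == status)
  fold[idx] <- sample(rep_len(seq_len(k), length(idx)))
}

# Cross-tabulate folds against the indicator to inspect the balance.
table(fold, Y.indicator)
```

For a non-survival outcome, the all-ones vector described above can be created with rep(1, nrow(Y.matrix)), in which case the stratification has no effect.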

k

The number of folds in the CV procedure. The default is 5.

ncv

The number of repetitions of CV. The default is 5.

only.observe

Whether only the observed samples in the validation set are used to calculate the MSE for CV. The default is TRUE.
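The effect of this option on the validation MSE can be illustrated with a small hand-made example. The values and the names y.true, y.pred, indicator, and keep are hypothetical; the sketch only shows the arithmetic of restricting the MSE to observed samples and is not the package's own code.

```r
# Hypothetical validation-set outcomes, predictions, and event indicator.
y.true <- c(2.0, 3.5, 1.0, 4.0)
y.pred <- c(2.5, 3.0, 2.0, 4.0)
indicator <- c(1, 0, 1, 1)
only.observe <- TRUE

# Keep only observed samples (indicator == 1) when only.observe is TRUE;
# otherwise keep every sample in the validation set.
keep <- if (only.observe) indicator == 1 else rep(TRUE, length(y.true))

# Mean squared error over the retained samples.
mse <- mean((y.true[keep] - y.pred[keep])^2)
mse
```

Here the second sample is unobserved, so its squared error is left out of the average.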

expected.measure.decrease

The relative decrease in the CV measure that you expect from including one more PLS component; it affects the selection of the optimal number of PLS components. The default is 0.05 (5%).
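One plausible reading of this rule is sketched below: keep adding components as long as each one lowers the CV measure by at least the expected relative amount. This is an assumption about the selection logic, not a copy of the package source, and the CV measures and the names cv.measure and optimal.nPLS are hypothetical.

```r
# Hypothetical CV measures (e.g. MSE) for models with 1 to 4 PLS components.
cv.measure <- c(1.00, 0.90, 0.88, 0.70)
expected.measure.decrease <- 0.05

# Start with one component and accept the next one only while the relative
# decrease meets the threshold; stop at the first component that does not.
optimal.nPLS <- 1
for (i in seq_len(length(cv.measure) - 1)) {
  relative.decrease <- (cv.measure[i] - cv.measure[i + 1]) / cv.measure[i]
  if (relative.decrease >= expected.measure.decrease) {
    optimal.nPLS <- i + 1
  } else {
    break
  }
}
optimal.nPLS
```

Under this reading, a larger threshold favours fewer components, because each additional component has to earn a bigger improvement to be kept.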

center

A logical value indicating whether mean centering should be applied to X.matrix and Y.matrix. The default is TRUE.

scale

A logical value indicating whether scaling should be applied to X.matrix and Y.matrix. The default is TRUE.

maxiter

An integer indicating the maximum number of iterations. The default is 100.

Value

asmbPLS.cv returns a list containing the following components:

quantile_table_CV

A matrix containing the selected quantile combination and the corresponding measures of CV for each PLS component.

optimal_nPLS

Optimal number of PLS components.

Examples

## Use the example dataset
data(asmbPLS.example)
X.matrix = asmbPLS.example$X.matrix
Y.matrix = asmbPLS.example$Y.matrix
PLS.comp = asmbPLS.example$PLS.comp
X.dim = asmbPLS.example$X.dim
quantile.comb.table.cv = asmbPLS.example$quantile.comb.table.cv
Y.indicator = asmbPLS.example$Y.indicator

## cv to find the best quantile combinations for model fitting
cv.results <- asmbPLS.cv(X.matrix = X.matrix, 
                         Y.matrix = Y.matrix, 
                         PLS.comp = PLS.comp, 
                         X.dim = X.dim, 
                         quantile.comb.table = quantile.comb.table.cv, 
                         Y.indicator = Y.indicator,
                         k = 5,
                         ncv = 3)
quantile.comb <- cv.results$quantile_table_CV[,1:length(X.dim)]
n.PLS <- cv.results$optimal_nPLS
 
## asmbPLS fit
asmbPLS.results <- asmbPLS.fit(X.matrix = X.matrix, 
                               Y.matrix = Y.matrix, 
                               PLS.comp = n.PLS, 
                               X.dim = X.dim, 
                               quantile.comb = quantile.comb)


[Package asmbPLS version 1.0.0 Index]