asmbPLS.cv {asmbPLS} | R Documentation |
Cross-validation for asmbPLS to find the best combinations of quantiles for prediction
Description
Finds the best combinations of quantiles for prediction via
cross-validation. It should usually be run before
asmbPLS.fit
to obtain the quantile combinations.
Usage
asmbPLS.cv(
X.matrix,
Y.matrix,
PLS.comp,
X.dim,
quantile.comb.table,
Y.indicator,
k = 5,
ncv = 5,
only.observe = TRUE,
expected.measure.decrease = 0.05,
center = TRUE,
scale = TRUE,
maxiter = 100
)
Arguments
X.matrix |
Predictors matrix. Samples in rows, variables in columns. |
Y.matrix |
Outcome matrix with samples in rows and a single column holding a
continuous variable. The outcome can be an imputed survival time or another
type of continuous outcome. For right-censored survival data with an event
indicator, the right-censored times can be imputed
by |
PLS.comp |
Number of PLS components in asmbPLS. |
X.dim |
A vector containing the number of predictors in each block (ordered). |
quantile.comb.table |
A matrix containing the user-defined quantile combinations used for CV. Its number of columns equals the number of blocks. |
Y.indicator |
A vector containing the event indicator for each sample, with length equal to the number of samples. It is used to keep the ratio of observed to unobserved samples the same in the training set and the validation set. Observed = 1, unobserved = 0. If the outcome is not a survival outcome, use a vector with all elements equal to 1. |
k |
The number of folds in the CV procedure. The default is 5. |
ncv |
The number of repetitions of CV. The default is 5. |
only.observe |
Whether only observed samples in the validation set should be used for calculating the MSE for CV. The default is TRUE. |
expected.measure.decrease |
The proportion by which you expect the CV measure to decrease after including one more PLS component. This affects the selection of the optimal number of PLS components. The default is 0.05 (5%). |
center |
A logical value indicating whether mean centering should be applied to X.matrix and Y.matrix. The default is TRUE. |
scale |
A logical value indicating whether scaling should be applied to X.matrix and Y.matrix. The default is TRUE. |
maxiter |
An integer indicating the maximum number of iterations. The default is 100. |
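The argument descriptions above imply a few consistency conditions between X.matrix, X.dim, quantile.comb.table and Y.indicator. The following is a minimal base R sketch of how such inputs might be prepared and checked. The toy data, the candidate quantile values, and the helper stratified_folds are illustrative assumptions; in particular, stratified_folds only shows the idea of keeping the observed/unobserved ratio similar across folds and is not the package's internal fold assignment.

```r
# Toy data: 20 samples, two blocks with 4 and 3 predictors (all values made up).
set.seed(1)
X.dim <- c(4, 3)
X.matrix <- matrix(rnorm(20 * sum(X.dim)), nrow = 20)
Y.matrix <- matrix(rnorm(20), ncol = 1)

# X.dim must account for every column of X.matrix, in block order.
stopifnot(sum(X.dim) == ncol(X.matrix))

# Column indices belonging to each block, derived from X.dim.
block.id <- rep(seq_along(X.dim), times = X.dim)
block.cols <- split(seq_len(ncol(X.matrix)), block.id)

# Candidate quantiles per block (hypothetical values), crossed into a table
# with one column per block and one row per combination.
quantile.comb.table <- as.matrix(expand.grid(block1 = c(0.7, 0.8, 0.9),
                                             block2 = c(0.8, 0.9)))
stopifnot(ncol(quantile.comb.table) == length(X.dim))

# Non-survival outcome: mark every sample as observed.
Y.indicator <- rep(1, nrow(X.matrix))

# Hypothetical helper: assign fold labels separately within each indicator
# group so that every fold has a similar observed/unobserved ratio.
stratified_folds <- function(indicator, k) {
  folds <- integer(length(indicator))
  for (g in unique(indicator)) {
    idx <- which(indicator == g)
    folds[idx] <- sample(rep_len(seq_len(k), length(idx)))
  }
  folds
}

# Example with 15 observed and 5 unobserved samples split into 5 folds.
folds <- stratified_folds(c(rep(1, 15), rep(0, 5)), k = 5)
```

With these toy inputs, each of the 5 folds receives 3 observed samples and 1 unobserved sample, so the ratio is identical in every training/validation split.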
Value
asmbPLS.cv
returns a list containing the following components:
quantile_table_CV |
A matrix containing the selected quantile combination and the corresponding measures of CV for each PLS component. |
optimal_nPLS |
Optimal number of PLS components. |
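To make the role of expected.measure.decrease in choosing optimal_nPLS more concrete, here is a small sketch of one plausible selection rule: keep adding components as long as each additional component reduces the CV measure by at least the expected proportion. The function pick_n_comp and the MSE values are hypothetical, and the rule itself is an assumption based on the argument description, not a confirmed account of what asmbPLS.cv does internally.

```r
# Hypothetical helper, not part of asmbPLS: keep adding components while each
# extra component lowers the CV measure by at least the expected fraction.
pick_n_comp <- function(measure, expected.measure.decrease = 0.05) {
  n <- 1
  while (n < length(measure) &&
         measure[n + 1] <= measure[n] * (1 - expected.measure.decrease)) {
    n <- n + 1
  }
  n
}

# Made-up CV MSE values for 1, 2 and 3 components.
mse <- c(1.00, 0.90, 0.88)
pick_n_comp(mse)        # 2: the third component improves the MSE by only about 2%
pick_n_comp(mse, 0.01)  # 3: a 1% threshold accepts the third component
```

Under this reading, a larger expected.measure.decrease favours fewer components, because each extra component has to earn its place with a bigger improvement.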
Examples
## Use the example dataset
data(asmbPLS.example)
X.matrix = asmbPLS.example$X.matrix
Y.matrix = asmbPLS.example$Y.matrix
PLS.comp = asmbPLS.example$PLS.comp
X.dim = asmbPLS.example$X.dim
quantile.comb.table.cv = asmbPLS.example$quantile.comb.table.cv
Y.indicator = asmbPLS.example$Y.indicator
## cv to find the best quantile combinations for model fitting
cv.results <- asmbPLS.cv(X.matrix = X.matrix,
                         Y.matrix = Y.matrix,
                         PLS.comp = PLS.comp,
                         X.dim = X.dim,
                         quantile.comb.table = quantile.comb.table.cv,
                         Y.indicator = Y.indicator,
                         k = 5,
                         ncv = 3)
quantile.comb <- cv.results$quantile_table_CV[,1:length(X.dim)]
n.PLS <- cv.results$optimal_nPLS
## asmbPLS fit
asmbPLS.results <- asmbPLS.fit(X.matrix = X.matrix,
                               Y.matrix = Y.matrix,
                               PLS.comp = n.PLS,
                               X.dim = X.dim,
                               quantile.comb = quantile.comb)