splsicox {Coxmos} | R Documentation |
sPLS-ICOX
Description
This function performs sparse partial least squares individual Cox regression (sPLS-ICOX), based on the plsRcox R package. It returns a Coxmos model whose model attribute is "sPLS-ICOX".
Usage
splsicox(
X,
Y,
n.comp = 4,
penalty = 0,
x.center = TRUE,
x.scale = FALSE,
remove_near_zero_variance = TRUE,
remove_zero_variance = FALSE,
toKeep.zv = NULL,
remove_non_significant = FALSE,
alpha = 0.05,
MIN_EPV = 5,
returnData = TRUE,
verbose = FALSE
)
Arguments
X |
Numeric matrix or data.frame. Explanatory variables. Qualitative variables must be transformed into binary variables. |
Y |
Numeric matrix or data.frame. Response variables. The object must have two columns named "time" and "event". For the event column, accepted values are 0/1 or FALSE/TRUE, for censored and event observations respectively. |
n.comp |
Numeric. Number of latent components to compute for the (s)PLS model (default: 4). |
penalty |
Numeric. Penalty for variable selection in the individual Cox models. Variables with a P-Value lower than 1 - penalty in the individual Cox analysis will be kept for the sPLS-ICOX approach (default: 0). |
x.center |
Logical. If x.center = TRUE, X matrix is centered to zero means (default: TRUE). |
x.scale |
Logical. If x.scale = TRUE, X matrix is scaled to unit variances (default: FALSE). |
remove_near_zero_variance |
Logical. If remove_near_zero_variance = TRUE, near zero variance variables will be removed (default: TRUE). |
remove_zero_variance |
Logical. If remove_zero_variance = TRUE, zero variance variables will be removed (default: FALSE). |
toKeep.zv |
Character vector. Names of variables in X that must not be deleted by (near) zero variance filtering (default: NULL). |
remove_non_significant |
Logical. If remove_non_significant = TRUE, non-significant variables/components in the final Cox model will be removed until all variables are significant by forward selection (default: FALSE). |
alpha |
Numeric. P-Values are regarded as significant if they fall below this threshold (default: 0.05). |
MIN_EPV |
Numeric. Minimum number of Events Per Variable (EPV) to reach in the final Cox model. Used to restrict the number of variables/components that can be computed in final Cox models. If the minimum is not met, the model cannot be computed (default: 5). |
returnData |
Logical. Return original and normalized X and Y matrices (default: TRUE). |
verbose |
Logical. If verbose = TRUE, extra messages may be displayed (default: FALSE). |
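The MIN_EPV argument can be illustrated with simple arithmetic. The sketch below assumes the usual definition of EPV (number of observed events divided by the number of predictors in the final Cox model); the exact check performed inside Coxmos is not shown in this page, and all numbers are hypothetical.

```r
# Hypothetical values, for illustration only
n_events <- 40   # observations with event == 1 in Y
MIN_EPV  <- 5    # minimum events per variable requested
n.comp   <- 4    # components entering the final Cox model

# Assumed definition: events divided by number of predictors
epv <- n_events / n.comp

# Largest number of components compatible with MIN_EPV under that definition
max_predictors <- floor(n_events / MIN_EPV)

epv >= MIN_EPV
```

Under this assumption, 40 events allow at most 8 components when MIN_EPV = 5.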
Details
The sPLS-ICOX function is designed for the analysis of high-dimensional survival data. It combines sparse partial least squares (sPLS) regression with individual Cox regression, providing both dimension reduction and variable selection for survival analysis.
Based on the methods of the plsRcox R package, the function introduces sparsity through the penalty parameter. This parameter sets the criterion for variable retention: only variables whose P-Value in the individual Cox analysis is lower than 1 - penalty are included in the sPLS-ICOX model.
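The retention rule can be sketched with the survival package. This is a minimal illustration, not the Coxmos implementation: it assumes that the "individual Cox analysis" means one univariate Cox model per variable, and it uses simulated data with hypothetical variable names.

```r
library(survival)

set.seed(1)
n <- 100
# Simulated explanatory variables (hypothetical names v1..v5)
X <- matrix(rnorm(n * 5), n, 5, dimnames = list(NULL, paste0("v", 1:5)))
# Survival times depend on v1 only; events are censored at random
time  <- rexp(n, rate = exp(0.8 * X[, 1]))
event <- rbinom(n, 1, 0.7)

penalty <- 0.5

# One univariate Cox model per column; keep the Wald P-Value
pvals <- apply(X, 2, function(x) {
  fit <- coxph(Surv(time, event) ~ x)
  summary(fit)$coefficients[1, "Pr(>|z|)"]
})

# Retain variables with a P-Value lower than 1 - penalty
kept <- names(pvals)[pvals < 1 - penalty]
```

With penalty = 0 the threshold is 1, so essentially every variable is retained; larger penalties make the filter stricter.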
The n.comp parameter sets the number of latent components to be computed for the sPLS model. These latent components summarize the main patterns in the data and are then used as predictors in the Cox regression. Data preprocessing is important, especially for qualitative variables, which must be transformed into binary variables before being passed to the function. The function also provides options for centering and scaling the data, which can significantly influence model performance.
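The preprocessing described above can be sketched in base R. The data frame and its column names are hypothetical, and reference (treatment) coding is only one possible way to obtain binary variables.

```r
# Hypothetical data with one qualitative variable
df <- data.frame(
  age   = c(50, 61, 47, 70),
  stage = factor(c("I", "II", "III", "II"))
)

# Dummy-code the factor into binary columns and drop the intercept column
X <- model.matrix(~ ., data = df)[, -1]

# Center each column to zero mean without scaling,
# analogous to x.center = TRUE and x.scale = FALSE
X_centered <- scale(X, center = TRUE, scale = FALSE)

colnames(X)
```

The resulting numeric matrix X is the kind of input the X argument expects; centering and scaling can be left to the function through x.center and x.scale.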
The function is designed for right-censored survival data. The response variable Y must have two columns: "time", which records the survival time, and "event", which records whether or not the event of interest occurred.
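A response object with this layout can be built as follows. The values are hypothetical, and the check at the end is an illustrative sanity test, not part of Coxmos.

```r
# Hypothetical survival outcomes: 1 = event observed, 0 = censored
Y <- data.frame(
  time  = c(5.2, 8.1, 3.4, 10.0),
  event = c(1, 0, 1, 0)
)

# Illustrative check of the layout expected for Y
valid_Y <- all(c("time", "event") %in% colnames(Y)) &&
  is.numeric(Y$time) &&
  all(Y$event %in% c(0, 1))
valid_Y
```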
The function returns a list with the elements of the sPLS-ICOX model, including the normalized data matrices, sPLS weight vectors, loadings, scores, and survival model metrics.
Value
Instance of class "Coxmos" and model "sPLS-ICOX". The class contains the following elements:
X: List of normalized X data information.
- (data): normalized X matrix
- (weightings): sPLS weights
- (weightings_norm): sPLS normalized weights
- (W.star): sPLS W* vector
- (loadings): sPLS loadings
- (scores): sPLS scores/variates
- (E): error matrices
- (x.mean): mean values for X matrix
- (x.sd): standard deviation for X matrix
Y: List of normalized Y data information.
- (data): normalized Y matrix
- (y.mean): mean values for Y matrix
- (y.sd): standard deviation for Y matrix
survival_model: List of survival model information.
- fit: coxph object.
- AIC: AIC of cox model.
- BIC: BIC of cox model.
- lp: linear predictors for train data.
- coef: Coefficients for cox model.
- YChapeau: Y Chapeau residuals.
- Yresidus: Y residuals.
n.comp: Number of components selected.
var_by_component: Variables selected by each component.
call: call function
X_input: X input matrix
Y_input: Y input matrix
alpha: alpha value selected
nsv: Variables removed by cox alpha cutoff.
nzv: Variables removed by remove_near_zero_variance or remove_zero_variance.
nz_coeffvar: Variables removed by coefficient variation near zero.
class: Model class.
time: time consumed for running the cox analysis.
Author(s)
Pedro Salguero Garcia. Maintainer: pedsalga@upv.edu.es
References
Bastien P, Vinzi VE, Tenenhaus M (2005). “PLS generalised linear regression.” Computational Statistics & Data Analysis. https://www.sciencedirect.com/science/article/abs/pii/S0167947304000271?via%3Dihub.
Examples
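library(Coxmos)

# Load the example proteomic data sets
data("X_proteomic")
data("Y_proteomic")
# Use the first 50 variables to keep the example fast
X <- X_proteomic[,1:50]
Y <- Y_proteomic
splsicox(X, Y, n.comp = 2, penalty = 0.5, x.center = TRUE, x.scale = TRUE)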