splsicox {Coxmos}    R Documentation

sPLS-ICOX

Description

This function fits a sparse partial least squares individual Cox (sPLS-ICOX) model, based on the plsRcox R package. It returns a Coxmos model whose model attribute is "sPLS-ICOX".

Usage

splsicox(
  X,
  Y,
  n.comp = 4,
  penalty = 0,
  x.center = TRUE,
  x.scale = FALSE,
  remove_near_zero_variance = TRUE,
  remove_zero_variance = FALSE,
  toKeep.zv = NULL,
  remove_non_significant = FALSE,
  alpha = 0.05,
  MIN_EPV = 5,
  returnData = TRUE,
  verbose = FALSE
)

Arguments

X

Numeric matrix or data.frame. Explanatory variables. Qualitative variables must be transformed into binary variables.
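A minimal sketch of one way to turn a qualitative variable into binary columns before calling the function, using base R's model.matrix(). The data frame and its column names are hypothetical, and whether the reference level should also get its own column is not stated in this page; here it is dropped, as model.matrix() does by default.

```r
# Hypothetical data: one numeric and one qualitative (factor) variable.
df <- data.frame(
  age   = c(61, 47, 70, 55),
  stage = factor(c("I", "II", "III", "II"))
)

# model.matrix() expands factors into 0/1 indicator columns.
# The first column is the intercept, which is not an explanatory variable.
X_bin <- model.matrix(~ ., data = df)[, -1, drop = FALSE]

# X_bin is a numeric matrix with columns: age, stageII, stageIII
colnames(X_bin)
```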

Y

Numeric matrix or data.frame. Response variables. The object must have two columns named "time" and "event". For the event column, accepted values are 0/1 or FALSE/TRUE for censored and event observations, respectively.
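As an illustration of the expected layout, the following sketch builds a response object with the two required column names. The follow-up values are made up for the example.

```r
# Hypothetical follow-up data
time  <- c(12.5, 30.0, 7.2, 45.1)
event <- c(TRUE, FALSE, TRUE, FALSE)  # TRUE = event observed, FALSE = censored

# Two columns, named exactly "time" and "event"; event coded as 0/1
Y <- data.frame(time = time, event = as.integer(event))
```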

n.comp

Numeric. Number of latent components to compute for the (s)PLS model (default: 4).

penalty

Numeric. Penalty for variable selection in the individual Cox models. Variables with a P-value lower than 1 - penalty in the individual Cox analysis are kept for the sPLS-ICOX approach (default: 0).
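The selection rule can be sketched outside the package as a univariate Cox screen. This is an illustration under assumptions, not the package's internal code: it uses survival::coxph() on simulated data and takes the Wald P-value of each coefficient, which may differ from the statistic Coxmos actually uses.

```r
library(survival)

set.seed(1)
n <- 100
X <- matrix(rnorm(n * 5), ncol = 5, dimnames = list(NULL, paste0("v", 1:5)))

# Simulated (hypothetical) survival data in which only v1 affects the hazard
time  <- rexp(n, rate = exp(0.8 * X[, "v1"]))
event <- rbinom(n, 1, 0.8)

# Fit one Cox model per variable and extract the Wald P-value of its coefficient
p_values <- apply(X, 2, function(x) {
  fit <- coxph(Surv(time, event) ~ x)
  summary(fit)$coefficients[1, "Pr(>|z|)"]
})

# Keep variables whose individual P-value is lower than 1 - penalty
penalty <- 0.5
kept <- names(p_values)[p_values < 1 - penalty]
```

Under this rule, penalty = 0 sets the threshold to 1, so the screen removes essentially nothing, while values closer to 1 make the selection stricter.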

x.center

Logical. If x.center = TRUE, X matrix is centered to zero means (default: TRUE).

x.scale

Logical. If x.scale = TRUE, X matrix is scaled to unit variances (default: FALSE).
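The effect of the two options can be reproduced with base R's scale(). This is only a sketch of the transformations described above; whether the package calls scale() internally is an assumption.

```r
# Hypothetical 4 x 2 matrix with very different column magnitudes
X <- matrix(c(1, 2, 3, 4,
              10, 20, 30, 40),
            ncol = 2, dimnames = list(NULL, c("a", "b")))

# Analogue of x.center = TRUE, x.scale = FALSE: column means become zero
X_centered <- scale(X, center = TRUE, scale = FALSE)

# Analogue of x.center = TRUE, x.scale = TRUE: columns also get unit variance
X_scaled <- scale(X, center = TRUE, scale = TRUE)
```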

remove_near_zero_variance

Logical. If remove_near_zero_variance = TRUE, near zero variance variables will be removed (default: TRUE).

remove_zero_variance

Logical. If remove_zero_variance = TRUE, zero variance variables will be removed (default: FALSE).

toKeep.zv

Character vector. Name of variables in X to not be deleted by (near) zero variance filtering (default: NULL).
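A minimal sketch of zero-variance filtering with a protected variable, in base R. The names to_keep and removed are hypothetical stand-ins, and the sketch covers only exact zero variance; the criterion the package uses for near-zero variance is not described here.

```r
# Hypothetical data: columns b and c are constant
X <- data.frame(
  a = c(1, 2, 3, 4),
  b = c(5, 5, 5, 5),
  c = c(0, 0, 0, 0)
)
to_keep <- "c"  # hypothetical analogue of toKeep.zv

# Flag columns whose sample variance is exactly zero
variances <- vapply(X, var, numeric(1))
is_zv <- variances == 0

# Remove flagged columns unless they are protected
removed <- names(X)[is_zv & !(names(X) %in% to_keep)]
X_filtered <- X[, !(names(X) %in% removed), drop = FALSE]
```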

remove_non_significant

Logical. If remove_non_significant = TRUE, non-significant variables/components in the final Cox model are removed, by forward selection, until all remaining variables are significant (default: FALSE).

alpha

Numeric. Significance threshold: P-values below this value are regarded as significant (default: 0.05).

MIN_EPV

Numeric. Minimum number of Events Per Variable (EPV) to reach in the final Cox model. Used to restrict the number of variables/components that can be included in the final Cox model. If the minimum is not met, the model cannot be computed (default: 5).
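The arithmetic behind an EPV limit can be sketched as follows. EPV is taken here as the number of observed events divided by the number of predictors in the final model; exactly how the package enforces the limit is an assumption, and the event vector is made up.

```r
# Hypothetical event indicator: 8 events among 12 observations
event <- c(1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0)
MIN_EPV <- 5

n_events <- sum(event)

# Largest number of variables/components that still gives EPV >= MIN_EPV
max_predictors <- floor(n_events / MIN_EPV)
```

With these values only one predictor satisfies the limit; with fewer than MIN_EPV events, max_predictors would be 0 and no model could meet the requirement.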

returnData

Logical. Return original and normalized X and Y matrices (default: TRUE).

verbose

Logical. If verbose = TRUE, extra messages may be displayed (default: FALSE).

Details

The sPLS-ICOX function is designed for high-dimensional survival data. It combines sparse partial least squares (sPLS) regression with individual Cox regression, providing both dimension reduction and variable selection for survival analysis. Based on the methodology of the plsRcox R package, it introduces sparsity through the penalty parameter: only variables whose P-value in the individual Cox analysis is lower than 1 - penalty are included in the sPLS-ICOX model. The n.comp parameter sets the number of latent components computed for the sPLS model; these components summarize the main patterns in the data and are then used in the Cox regression.

Data preprocessing requires care. Qualitative variables must be transformed into binary variables before being passed to the function. The function also provides options for centering and scaling the data, which can significantly influence model performance. It is designed for right-censored survival data and requires the response variable Y to have two columns: "time", the survival time, and "event", which records whether the event of interest occurred.

The function returns a list with the elements of the sPLS-ICOX model, including the normalized data matrices, sPLS weight vectors, loadings, scores, and survival model metrics.

Value

Instance of class "Coxmos" and model "sPLS-ICOX". The class contains the following elements:

X: List of normalized X data information.

Y: List of normalized Y data information.

survival_model: List of survival model information.

n.comp: Number of components selected.

var_by_component: Variables selected by each component.

call: Function call.

X_input: X input matrix.

Y_input: Y input matrix.

alpha: Alpha value selected.

nsv: Variables removed by the Cox alpha cutoff.

nzv: Variables removed by remove_near_zero_variance or remove_zero_variance.

nz_coeffvar: Variables removed because their coefficient of variation is near zero.

class: Model class.

time: Time taken to run the Cox analysis.

Author(s)

Pedro Salguero Garcia. Maintainer: pedsalga@upv.edu.es

References

Bastien P, Vinzi VE, Tenenhaus M (2005). “PLS generalised linear regression.” Computational Statistics & Data Analysis. https://www.sciencedirect.com/science/article/abs/pii/S0167947304000271?via%3Dihub.

Examples

data("X_proteomic")
data("Y_proteomic")
X <- X_proteomic[,1:50]
Y <- Y_proteomic
splsicox(X, Y, n.comp = 2, penalty = 0.5, x.center = TRUE, x.scale = TRUE)

[Package Coxmos version 1.0.2 Index]