ssvdEN_sol_path {MOSS}    R Documentation

'Solution path' for sparse Singular Value Decomposition via Elastic Net.

Description

This function lets the user explore values on the solution path of the sparse singular value decomposition (SVD) problem. The goal is to tune the degree of sparsity of subjects, features, or both. The function performs a penalized SVD that imposes sparsity/smoothing on both left and right singular vectors. The penalties at both levels are Elastic Net-like, and the trade-off between ridge-like and Lasso-like penalties is controlled by two 'alpha' parameters. The proportion of explained variance (PEV) is the criterion used to choose the optimal degrees of sparsity.

Usage

ssvdEN_sol_path(
  O,
  center = TRUE,
  scale = TRUE,
  dg.grid.right = seq_len(ncol(O)) - 1,
  dg.grid.left = NULL,
  n.PC = 1,
  svd.0 = NULL,
  alpha.f = 1,
  alpha.s = 1,
  maxit = 500,
  tol = 0.001,
  approx = FALSE,
  plot = FALSE,
  ncores = 1,
  verbose = TRUE,
  lib.thresh = TRUE,
  left.lab = "Subjects",
  right.lab = "Features",
  exact.dg = FALSE
)

Arguments

O

Numeric matrix of n subjects (rows) and p features (columns). The only supported object classes are 'matrix' and 'FBM'.

center

Should we center? Logical. Defaults to TRUE.

scale

Should we scale? Logical. Defaults to TRUE.

dg.grid.right

Grid with degrees of sparsity at the features level. Numeric. Default is the entire solution path for features (i.e. 0 : (ncol(O) - 1), as given by seq_len(ncol(O)) - 1).

dg.grid.left

Grid with degrees of sparsity at the subjects level. Numeric. Defaults to NULL, which corresponds to dg.grid.left = nrow(O).

n.PC

Number of desired principal axes. Numeric. Defaults to 1.

svd.0

Initial SVD (i.e. least squares solution). Defaults to NULL.

alpha.f

Elastic net mixture parameter at the features level. Measures the compromise between lasso (alpha = 1) and ridge (alpha = 0) types of penalty. Numeric. Defaults to 1.

alpha.s

Elastic net mixture parameter at the subjects level. Defaults to alpha.s = 1.

maxit

Maximum number of iterations. Defaults to 500.

tol

Convergence is declared when ||U_j - U_(j-1)||_F < tol, where U_j is the matrix of estimated left regularized singular vectors at iteration j. Numeric. Defaults to 0.001.

approx

Should we use standard SVD or random approximations? Logical. Defaults to FALSE. If TRUE and is(O, "matrix") == TRUE, irlba is called. If TRUE and is(O, "FBM") == TRUE, big_randomSVD is called.

plot

Should we plot the solution path? Logical. Defaults to FALSE.

ncores

Number of cores used by big_randomSVD. Defaults to 1 (no parallelism). Ignored when is(O, "FBM") == FALSE, since big_randomSVD is only called for 'FBM' objects.

verbose

Should we print messages? Logical. Defaults to TRUE.

lib.thresh

Should we use a liberal (TRUE) or conservative (FALSE) threshold to tune degrees of sparsity? Logical. Defaults to TRUE.

left.lab

Label for the subjects level. Character. Defaults to "Subjects".

right.lab

Label for the features level. Character. Defaults to "Features".

exact.dg

Should we compute exact degrees of sparsity? Logical. Defaults to FALSE. Only relevant when alpha.s or alpha.f lies in the (0, 1) interval.
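The default features grid from the Usage section can be inspected, and a coarser grid built, with base R alone. This is a small sketch on a toy matrix; the object names O, full_path and coarse_grid are only illustrative.

```r
# Toy data: 50 subjects (rows) and 8 features (columns).
set.seed(1)
O <- matrix(rnorm(50 * 8), nrow = 50, ncol = 8)

# The default value of dg.grid.right shown in Usage: 0, 1, ..., ncol(O) - 1.
full_path <- seq_len(ncol(O)) - 1

# A coarser grid that visits every second degree of sparsity.
coarse_grid <- seq(1, ncol(O), by = 2)

print(full_path)
print(coarse_grid)
```

A coarser grid trades resolution along the solution path for fewer fits.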

Details

The function returns the degree of sparsity for which the change in the proportion of explained variance (PEV) is the steepest ('liberal' option), or for which the change in PEV stabilizes ('conservative' option). These heuristics relax the need to tune parameters on a testing set.
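To make the two heuristics concrete, here is a minimal base R sketch on made-up PEV values. The exact rules used by MOSS are not stated on this page, so both selection rules below are assumptions: 'liberal' is read as the grid point reached by the largest finite-difference slope, and 'conservative' as the first grid point after which the slope drops below 10% of its maximum (the 10% cut-off is arbitrary).

```r
# Hypothetical PEV values along a grid of degrees of sparsity (not real output).
dg  <- c(1, 21, 41, 61, 81, 101)
pev <- c(0.10, 0.18, 0.40, 0.46, 0.48, 0.485)

# Finite-difference slope of PEV between consecutive grid points.
slope <- diff(pev) / diff(dg)

# Assumed 'liberal' rule: the grid point reached by the steepest change.
liberal <- dg[which.max(abs(slope)) + 1]

# Assumed 'conservative' rule: the first grid point from which the slope
# falls below 10% of the steepest slope (arbitrary cut-off).
flat <- which(abs(slope) < 0.1 * max(abs(slope)))
conservative <- dg[min(flat)]

print(c(liberal = liberal, conservative = conservative))
```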

For one PC (rank-1 case), the algorithm finds vectors u and w that minimize ||x - u w'||_F^2 + lambda_w (alpha_w ||w||_1 + (1 - alpha_w) ||w||_F^2) + lambda_u (alpha_u ||u||_1 + (1 - alpha_u) ||u||_F^2) subject to ||u|| = 1, where alpha_w and alpha_u correspond to alpha.f and alpha.s. The right singular vector is obtained as v = w / ||w||, and the corresponding singular value is u^T x v. The penalties lambda_u and lambda_w are mapped from the desired degrees of sparsity for subjects and features (supplied here through dg.grid.left and dg.grid.right).
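As a rough illustration of the rank-1 problem above, the following base R sketch covers only the Lasso case on the features side (alpha_w = 1, lambda_u = 0) and chooses the threshold so that exactly dg elements of w stay non-zero. The function names soft_threshold and rank1_ssvd_sketch are hypothetical, and this is not the MOSS implementation; it also monitors the change in the single vector u rather than in a matrix U.

```r
# Soft-thresholding operator: shrinks z towards zero by thr.
soft_threshold <- function(z, thr) sign(z) * pmax(abs(z) - thr, 0)

# Sketch: alternate between a thresholded update of w and a normalized
# update of u. Assumes 1 <= dg <= ncol(x) and no ties in |x'u|.
rank1_ssvd_sketch <- function(x, dg, maxit = 500, tol = 0.001) {
  stopifnot(is.matrix(x), dg >= 1, dg <= ncol(x))
  u <- svd(x, nu = 1, nv = 1)$u[, 1]          # least squares starting point
  for (it in seq_len(maxit)) {
    z <- drop(crossprod(x, u))                # x'u
    a <- sort(abs(z), decreasing = TRUE)
    thr <- if (dg < length(z)) a[dg + 1] else 0
    w <- soft_threshold(z, thr)               # keeps the dg largest |z|
    u_new <- drop(x %*% w)
    u_new <- u_new / sqrt(sum(u_new^2))       # enforce ||u|| = 1
    delta <- sqrt(sum((u_new - u)^2))
    u <- u_new
    if (delta < tol) break
  }
  v <- w / sqrt(sum(w^2))                     # v = w / ||w||
  d <- drop(crossprod(u, x %*% v))            # u' x v
  list(u = u, v = v, d = d, iterations = it)
}

set.seed(1)
x <- scale(matrix(rnorm(20 * 10), nrow = 20))
fit <- rank1_ssvd_sketch(x, dg = 3)
print(sum(fit$v != 0))
```

Choosing the threshold as the (dg + 1)-th largest absolute value of x'u is one simple way to map a degree of sparsity onto a penalty; other mappings are possible.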

Value

A list with the results of the (sparse) SVD and (if argument 'plot'=TRUE) the corresponding graphical displays.

Note

Although the degree of sparsity maps onto the number of features/subjects for Lasso, the user needs to be aware that this conceptual correspondence is lost for full EN (alpha in the (0, 1) interval): for example, the number of features selected with alpha < 1 will eventually be larger than the specified degree of sparsity. This makes it possible to rapidly increase the number of non-zero elements when tuning the degrees of sparsity. To get exact values for the degrees of sparsity at the subjects or features levels, the user needs to change the 'exact.dg' parameter from 'FALSE' (the default) to 'TRUE'.
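One way to see why the count grows under Elastic Net: for a fixed unit-norm u, minimizing the rank-1 objective in Details over w gives w = soft(z, lambda_w * alpha_w / 2) / (1 + lambda_w * (1 - alpha_w)), with z = x'u. The sketch below assumes lambda_w is held at the value that yields two non-zero elements under Lasso; how MOSS handles this internally is not stated here, and the names soft_threshold and en_update are hypothetical.

```r
# Soft-thresholding operator.
soft_threshold <- function(z, thr) sign(z) * pmax(abs(z) - thr, 0)

# Elastic Net update of w for fixed u, derived from the rank-1 objective.
en_update <- function(z, lambda, alpha) {
  soft_threshold(z, lambda * alpha / 2) / (1 + lambda * (1 - alpha))
}

z <- c(3, -2, 1.5, 0.5, -0.2)   # toy values standing in for x'u
lambda <- 3.2                   # Lasso threshold lambda / 2 = 1.6

n_lasso <- sum(en_update(z, lambda, alpha = 1) != 0)    # threshold 1.6
n_en    <- sum(en_update(z, lambda, alpha = 0.5) != 0)  # threshold 0.8

print(c(lasso = n_lasso, elastic_net = n_en))
```

With the same lambda_w, lowering alpha shrinks the effective threshold, so more elements survive than the Lasso degree of sparsity would suggest.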

Examples

library("MOSS")

# Extracting simulated omic blocks.
sim_blocks <- simulate_data()$sim_blocks
X <- sim_blocks$`Block 3`

# Tuning sparsity degree for features (increments of 20 units).
out <- ssvdEN_sol_path(X, dg.grid.right = seq(1, 1000, by = 20))

[Package MOSS version 0.2.2 Index]