ssvdEN_sol_path_par {MOSS}    R Documentation

'Solution path' for sparse Singular Value Decomposition via Elastic Net using parallel computing.

Description

This function is a copy of 'ssvdEN_sol_path' meant to be used in combination with the future.apply package to allow for parallel computing of the optimal degrees of sparsity by subjects and/or features.
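
As a rough illustration of the future.apply workflow mentioned above, the sketch below evaluates a stand-in task over a small grid under a parallel plan. The `task` function and `dg_grid` are hypothetical placeholders, not part of MOSS, and whether ssvdEN_sol_path_par honors the active plan in exactly this way is an assumption.

```r
# Hypothetical stand-in for the work done at one degree of sparsity.
task <- function(dg) dg^2
dg_grid <- 1:4

if (requireNamespace("future.apply", quietly = TRUE)) {
  # Two background R sessions; plan() controls how future_* calls are resolved.
  old_plan <- future::plan(future::multisession, workers = 2)
  res <- future.apply::future_sapply(dg_grid, task)
  future::plan(old_plan)  # restore the previous plan
} else {
  # Fallback so the sketch still runs without future.apply installed.
  res <- sapply(dg_grid, task)
}
print(res)
```

Setting the plan once, before the call, keeps the choice of backend (sequential, multisession, cluster) outside the function being parallelized.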

Usage

ssvdEN_sol_path_par(
  O,
  center = TRUE,
  scale = TRUE,
  dg.grid.right = seq_len(ncol(O)) - 1,
  dg.grid.left = NULL,
  n.PC = 1,
  svd.0 = NULL,
  alpha.f = 1,
  alpha.s = 1,
  maxit = 500,
  tol = 0.001,
  approx = FALSE,
  plot = FALSE,
  ncores = 1,
  verbose = TRUE,
  lib.thresh = TRUE,
  left.lab = "Subjects",
  right.lab = "Features",
  exact.dg = FALSE
)

Arguments

O

Numeric matrix of n subjects (rows) and p features (columns). Only 'matrix' and 'FBM' objects are supported.

center

Should we center? Logical. Defaults to TRUE.

scale

Should we scale? Logical. Defaults to TRUE.

dg.grid.right

Grid with degrees of sparsity at the features level. Numeric. Default is the entire solution path for features (i.e. seq_len(ncol(O)) - 1, which is 0 : (ncol(O) - 1)).
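
The default from the Usage section, seq_len(ncol(O)) - 1, can be evaluated on a small matrix to see which degrees it contains. The matrix below is a hypothetical stand-in for O; only its number of columns matters.

```r
# Hypothetical 4 x 6 matrix standing in for O (values are irrelevant here).
O <- matrix(0, nrow = 4, ncol = 6)

# Default feature-level grid, exactly as written in the Usage section.
dg_grid_right <- seq_len(ncol(O)) - 1
print(dg_grid_right)
```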

dg.grid.left

Grid with degrees of sparsity at the subjects level. Numeric. Defaults to NULL, which is treated as dg.grid.left = nrow(O).

n.PC

Number of desired principal axes. Numeric. Defaults to 1.

svd.0

Initial SVD (i.e. least squares solution). Defaults to NULL.

alpha.f

Elastic net mixture parameter at the features level. Measures the compromise between lasso (alpha = 1) and ridge (alpha = 0) types of sparsity. Numeric. Defaults to 1.

alpha.s

Elastic net mixture parameter at the subjects level. Defaults to alpha.s = 1.

maxit

Maximum number of iterations. Defaults to 500.

tol

Convergence is declared when ||U_j - U_(j-1)||_F < tol, where U_j is the matrix of estimated left regularized singular vectors at iteration j. Numeric. Defaults to 0.001.
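
A minimal sketch of this stopping rule in base R, using the Frobenius norm from norm(). The two matrices are hypothetical successive estimates, not output from MOSS.

```r
# Hypothetical left singular vector estimates from two successive iterations.
set.seed(1)
U_prev <- qr.Q(qr(matrix(rnorm(20), nrow = 10, ncol = 2)))
U_curr <- U_prev + 1e-5  # tiny perturbation of every entry

tol <- 0.001

# Frobenius norm of the change between iterations.
delta <- norm(U_curr - U_prev, type = "F")
converged <- delta < tol
print(c(delta = delta, converged = converged))
```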

approx

Should we use standard SVD or random approximations? Logical. Defaults to FALSE. If TRUE and is(O, 'matrix') == TRUE, irlba is called. If TRUE and is(O, "FBM") == TRUE, big_randomSVD is called.

plot

Should we plot the solution path? Logical. Defaults to FALSE.

ncores

Number of cores used by big_randomSVD. Default does not use parallelism. Ignored when is(O, "FBM") == FALSE.

verbose

Should we print messages? Logical. Defaults to TRUE.

lib.thresh

Should we use a liberal or conservative threshold to tune degrees of sparsity? Logical. Defaults to TRUE.

left.lab

Label for the subjects level. Character. Defaults to 'Subjects'.

right.lab

Label for the features level. Character. Defaults to 'Features'.

exact.dg

Should we compute exact degrees of sparsity? Logical. Defaults to FALSE. Only relevant when alpha.s or alpha.f is in the (0, 1) interval and exact.dg = TRUE.

Note

Although the degree of sparsity maps onto the number of features/subjects for Lasso, the user needs to be aware that this correspondence is lost for full EN (alpha in (0, 1)); e.g. the number of features selected with alpha < 1 will eventually be larger than the optimal degree of sparsity. This allows the number of non-zero elements to increase rapidly when tuning the degrees of sparsity. To get exact values for the degrees of sparsity at the subjects or features level, the user needs to change the 'exact.dg' parameter from 'FALSE' (the default) to 'TRUE'.
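
The effect described in this note can be illustrated with a small sketch. It assumes a glmnet-style elastic net shrinkage, soft(x, lambda * alpha) / (1 + lambda * (1 - alpha)), and a threshold taken from the ordered absolute loadings; both are illustrative assumptions and not necessarily how MOSS computes its solution path. The vector `v` and the helper names are hypothetical.

```r
# Soft-thresholding operator: shrinks toward zero and zeroes out small entries.
soft <- function(x, thr) sign(x) * pmax(abs(x) - thr, 0)

# glmnet-style elastic net shrinkage (an assumed form, not taken from MOSS).
en_shrink <- function(x, lambda, alpha) {
  soft(x, lambda * alpha) / (1 + lambda * (1 - alpha))
}

# Hypothetical loadings for 8 features.
v <- c(0.9, -0.7, 0.5, -0.4, 0.3, 0.15, -0.1, 0.05)

# Pick lambda so that the lasso keeps exactly dg features:
# the (dg + 1)-th largest absolute loading.
dg <- 3
lambda <- sort(abs(v), decreasing = TRUE)[dg + 1]

# Count the non-zero loadings under lasso and under a full elastic net.
n_lasso <- sum(en_shrink(v, lambda, alpha = 1) != 0)
n_en <- sum(en_shrink(v, lambda, alpha = 0.5) != 0)
print(c(lasso = n_lasso, elastic_net = n_en))
```

With alpha = 1 the count of non-zero loadings equals dg; with alpha = 0.5 the effective threshold lambda * alpha is smaller, so more loadings survive for the same nominal degree of sparsity.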

Examples


library("MOSS")

# Extracting simulated omic blocks.
sim_blocks <- simulate_data()$sim_blocks
X <- sim_blocks$`Block 3`

# Comparing ssvdEN_sol_path_par and ssvdEN_sol_path.
t1 <- proc.time()
out1 <- ssvdEN_sol_path(X, dg.grid.right = 1:1000, dg.grid.left = 1:500)
t1 <- proc.time() - t1

t2 <- proc.time()
out2 <- ssvdEN_sol_path_par(X, dg.grid.right = 1:1000, dg.grid.left = 1:500)
t2 <- proc.time() - t2


[Package MOSS version 0.2.2 Index]