SparsePLS {sharp}    R Documentation

Sparse Partial Least Squares

Description

Runs a sparse Partial Least Squares (sPLS) model using the implementation from the sgPLS package. This function does not use stability selection.

Usage

SparsePLS(
  xdata,
  ydata,
  Lambda,
  family = "gaussian",
  ncomp = 1,
  scale = TRUE,
  keepX_previous = NULL,
  keepY = NULL,
  ...
)

Arguments

xdata

matrix of predictors with observations as rows and variables as columns.

ydata

optional vector or matrix of outcome(s). If family is set to "binomial" or "multinomial", ydata can be a vector with character/numeric values or a factor.

Lambda

matrix of parameters controlling the number of selected predictors in the current component, i.e. the component with index ncomp.

family

type of PLS model. If family="gaussian", a sparse PLS model as defined in sPLS is run (for continuous outcomes). If family="binomial", a PLS-DA model as defined in sPLSda is run (for categorical outcomes).

ncomp

number of components.

scale

logical indicating if the data should be scaled (i.e. transformed so that all variables have a standard deviation of one). Only used if family="gaussian".

keepX_previous

number of selected predictors in previous components. Only used if ncomp > 1. The argument keepX in sPLS is obtained by concatenating keepX_previous and Lambda.

keepY

number of selected outcome variables. This argument is defined as in sPLS. Only used if family="gaussian".

...

additional arguments to be passed to sPLS or sPLSda.
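
The relationship between keepX_previous, Lambda and the keepX argument of sPLS described above can be sketched in base R. This is only an illustration of the concatenation, not the package's internal code, and the numeric values are hypothetical.

```r
# Hypothetical values chosen for illustration only
ncomp <- 3
keepX_previous <- c(5, 3) # predictors kept in components 1 and 2
Lambda <- 2               # candidate number of predictors for component 3

# keepX as described in the documentation: keepX_previous followed by Lambda
keepX <- c(keepX_previous, Lambda)

# One entry per component is expected
stopifnot(length(keepX) == ncomp)
print(keepX)
```

With these values, keepX holds one number of selected predictors per component, the last of which is the value being explored through Lambda.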

Value

A list with:

selected

matrix of binary selection status. Rows correspond to different model parameters. Columns correspond to predictors.

beta_full

array of model coefficients. Rows correspond to different model parameters. Columns correspond to predictor (names starting with "X") or outcome (names starting with "Y") variables for different components (denoted by "PC").
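
A minimal sketch of how the two elements of the returned list might be inspected. It reuses the sPLS call from the Examples section and assumes that the sharp and sgPLS packages are installed and that SimulateRegression() is available once sharp is attached. Outputs are not shown because they depend on the simulated data.

```r
if (requireNamespace("sharp", quietly = TRUE) &&
  requireNamespace("sgPLS", quietly = TRUE)) {
  library(sharp)

  # Data simulation (same settings as in the Examples section)
  set.seed(1)
  simul <- SimulateRegression(n = 100, pk = 20, q = 3, family = "gaussian")

  # Sparse PLS with 2 X-variables and 1 Y-variable
  mypls <- SparsePLS(
    xdata = simul$xdata, ydata = simul$ydata,
    Lambda = 2, family = "gaussian", keepY = 1
  )

  # Binary selection status: rows are parameter values, columns are predictors
  print(dim(mypls$selected))

  # Names of the predictors flagged as selected for the first parameter value
  print(colnames(mypls$selected)[mypls$selected[1, ] == 1])

  # Dimensions of the coefficient array
  print(dim(mypls$beta_full))
}
```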

References

Lê Cao KA, Rossouw D, Robert-Granié C, Besse P (2008). “A sparse PLS for variable selection when integrating omics data.” Stat Appl Genet Mol Biol, 7(1), Article 35. ISSN 1544-6115, doi:10.2202/1544-6115.1390.

See Also

VariableSelection, BiSelection

Other penalised dimensionality reduction functions: GroupPLS(), SparseGroupPLS(), SparsePCA()

Examples

if (requireNamespace("sgPLS", quietly = TRUE)) {
  ## Sparse PLS

  # Data simulation
  set.seed(1)
  simul <- SimulateRegression(n = 100, pk = 20, q = 3, family = "gaussian")
  x <- simul$xdata
  y <- simul$ydata

  # Running sPLS with 2 X-variables and 1 Y-variable
  mypls <- SparsePLS(xdata = x, ydata = y, Lambda = 2, family = "gaussian", keepY = 1)


  ## Sparse PLS-DA

  # Data simulation
  set.seed(1)
  simul <- SimulateRegression(n = 100, pk = 20, family = "binomial")

  # Running sPLS-DA with 2 X-variables (keepY is only used if family = "gaussian")
  mypls <- SparsePLS(xdata = simul$xdata, ydata = simul$ydata, Lambda = 2, family = "binomial")
}

[Package sharp version 1.4.6 Index]