GroupPLS {sharp}    R Documentation

Group Partial Least Squares

Description

Runs a group Partial Least Squares (PLS) model using the implementation from the sgPLS package. This function fits a single model and does not use stability selection.

Usage

GroupPLS(
  xdata,
  ydata,
  family = "gaussian",
  group_x,
  group_y = NULL,
  Lambda,
  keepX_previous = NULL,
  keepY = NULL,
  ncomp = 1,
  scale = TRUE,
  ...
)

Arguments

xdata

matrix of predictors with observations as rows and variables as columns.

ydata

optional vector or matrix of outcome(s). If family is set to "binomial" or "multinomial", ydata can be a vector with character/numeric values or a factor.

family

type of PLS model. If family="gaussian", a group PLS model as defined in gPLS is run (for continuous outcomes). If family="binomial", a PLS-DA model as defined in gPLSda is run (for categorical outcomes).

group_x

vector encoding the grouping structure among predictors. This argument indicates the number of variables in each group.
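
As an illustration of this encoding, the base R sketch below expands a vector of group sizes into one group label per predictor. It is not code from the package: the names group_sizes and membership are hypothetical, and the sketch assumes that groups are made of consecutive columns of xdata, in the order given. The sizes mirror those used in the Examples section.

```r
# Group sizes as they would be passed to group_x:
# three groups of 10, 15 and 25 predictors
group_sizes <- c(10, 15, 25)

# Expand into one group label per predictor
# (assumption: groups are consecutive columns of xdata)
membership <- rep(seq_along(group_sizes), times = group_sizes)

# The sizes should add up to the number of columns in xdata
sum(group_sizes)

# Number of predictors per group
table(membership)

# Column indices of the predictors in the second group
which(membership == 2)
```

A quick check that sum(group_x) equals ncol(xdata) before fitting can catch a mis-specified grouping early.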

group_y

optional vector encoding the grouping structure among outcomes. This argument indicates the number of variables in each group.

Lambda

matrix of parameters controlling the number of selected groups for the current component, i.e. the component indexed by ncomp.

keepX_previous

number of selected groups in previous components. Only used if ncomp > 1. The argument keepX in sgPLS is obtained by concatenating keepX_previous and Lambda.
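
The concatenation described above can be sketched in base R as follows. The values and the names lambda_current and keepX are hypothetical and only illustrate the shape of the resulting vector for a single candidate value of Lambda; this is not the package's internal code.

```r
# Hypothetical settings for a model with ncomp = 3
keepX_previous <- c(2, 1) # groups kept in components 1 and 2
lambda_current <- 3       # candidate number of groups for component 3

# One entry per component: previous components first, current one last
keepX <- c(keepX_previous, lambda_current)
keepX

# The length of the result matches ncomp
length(keepX)
```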

keepY

number of selected groups of outcome variables. This argument is defined as in sgPLS. Only used if family="gaussian".

ncomp

number of components.

scale

logical indicating if the data should be scaled (i.e. transformed so that all variables have a standard deviation of one). Only used if family="gaussian".
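
To show what a standard deviation of one means in practice, the sketch below uses base R's scale() on simulated columns with very different spreads. This is an assumption-laden illustration: scale() also centres each column by default, and the exact transformation applied internally by the package is not stated here.

```r
set.seed(1)

# Three columns with standard deviations of roughly 1, 5 and 10
x <- cbind(
  rnorm(20, sd = 1),
  rnorm(20, sd = 5),
  rnorm(20, sd = 10)
)

# Column-wise standard deviations before scaling
apply(x, 2, sd)

# Base R scaling: centres each column and divides by its standard deviation
x_scaled <- scale(x)

# After scaling, every column has a standard deviation of one
apply(x_scaled, 2, sd)
```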

...

additional arguments to be passed to gPLS or gPLSda.

Value

A list with:

selected

matrix of binary selection status. Rows correspond to different model parameters. Columns correspond to predictors.

beta_full

array of model coefficients. Rows correspond to different model parameters. Columns correspond to predictor (starting with "X") or outcome (starting with "Y") variables for different components (denoted by "PC").

References

Liquet B, de Micheaux PL, Hejblum BP, Thiébaut R (2016). “Group and sparse group partial least square approaches applied in genomics context.” Bioinformatics, 32(1), 35-42. ISSN 1367-4803, doi:10.1093/bioinformatics/btv535.

See Also

VariableSelection, BiSelection

Other penalised dimensionality reduction functions: SparseGroupPLS(), SparsePCA(), SparsePLS()

Examples

if (requireNamespace("sgPLS", quietly = TRUE)) {
  ## Group PLS
  # Data simulation
  set.seed(1)
  simul <- SimulateRegression(n = 100, pk = 50, q = 3, family = "gaussian")
  x <- simul$xdata
  y <- simul$ydata

  # Running gPLS with 1 selected group of predictors
  mypls <- GroupPLS(
    xdata = x, ydata = y, Lambda = 1, family = "gaussian",
    group_x = c(10, 15, 25)
  )

  # Running gPLS with groups on outcomes
  mypls <- GroupPLS(
    xdata = x, ydata = y, Lambda = 1, family = "gaussian",
    group_x = c(10, 15, 25),
    group_y = c(2, 1), keepY = 1
  )
}

[Package sharp version 1.4.6 Index]