ddsPLS {ddsPLS}    R Documentation

Data-Driven Sparse Partial Least Squares

Description

The main function of the package. It runs the ddsPLS algorithm using bootstrap analysis and automatically estimates both the number of components and the regularization coefficients. Only one regularization parameter per component is needed to select variables in both x and y. It builds the optimal model, an object of class ddsPLS. Among the parameters, lambdas is the vector of regularization values tested by the algorithm along each component for each bootstrap sample. The total number of bootstrap samples is fixed by the parameter n_B; larger values are preferable, at the price of a longer computation time. The returned object gives access to three S3 methods (summary.ddsPLS, plot.ddsPLS and predict.ddsPLS).
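To make the role of n_B concrete, the following is a minimal sketch of bootstrap resampling in base R. It assumes a standard nonparametric bootstrap in which rows are drawn with replacement and the rows never drawn are kept aside as an out-of-bag set; this page does not describe the internal resampling scheme of ddsPLS, so the names boot_samples, in_bag and out_of_bag are illustrative only.

```r
# Illustrative sketch only: not the package's internal code.
set.seed(1)
n   <- 100   # number of observations
n_B <- 50    # number of bootstrap samples, same value as the ddsPLS() default

# Each bootstrap sample draws n row indices with replacement ("in-bag");
# the rows that were never drawn form the "out-of-bag" set (assumption).
boot_samples <- lapply(seq_len(n_B), function(b) {
  in_bag <- sample.int(n, size = n, replace = TRUE)
  list(in_bag = in_bag, out_of_bag = setdiff(seq_len(n), in_bag))
})

# Fraction of observations left out of each bootstrap sample
oob_fraction <- vapply(boot_samples,
                       function(s) length(s$out_of_bag) / n,
                       numeric(1))
mean(oob_fraction)
```

Increasing n_B averages any bootstrap-based criterion over more resamples, which is why larger values cost more computation time.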

Usage

ddsPLS(
  X,
  Y,
  criterion = "diffR2Q2",
  doBoot = TRUE,
  LD = FALSE,
  lambdas = NULL,
  n_B = 50,
  n_lambdas = 100,
  lambda_roof = NULL,
  lowQ2 = 0,
  NCORES = 1,
  errorMin = 1e-09,
  verbose = FALSE
)

Arguments

X

matrix, the covariate matrix (n,p).

Y

matrix, the response matrix (n,q).

criterion

character, the criterion to optimize: either "diffR2Q2", to be minimized (default), or "Q2", to be maximized.

doBoot

logical, whether to perform bootstrap operations. Default to TRUE. If FALSE, a model is built directly from the values in lambdas, and the number of components is the length of that vector; in that case, the parameter n_B is ignored. If TRUE, the ddsPLS algorithm is run with bootstrap validation, using lambdas as a grid and n_B as the total number of bootstrap samples to simulate per component.

LD

logical, whether or not to treat the dataset as low-dimensional. Default to FALSE.

lambdas

vector, the values of lambda to be tested. Each value of lambda can be interpreted in terms of the correlation allowed in the model. More precisely, a covariate 'x[j]' is not selected if its empirical correlation with all the response variables 'y[1..q]' is below lambda. A response variable 'y[k]' is not selected if its empirical correlation with all the covariates 'x[1..p]' is below lambda. Default to NULL, in which case n_lambdas values are tested (see n_lambdas).

n_B

integer, the number of bootstrap samples to simulate. Default to 50.

n_lambdas

integer, the number of lambda values. Taken into account only if lambdas is NULL. Default to 100.

lambda_roof

limit value of lambda to be considered in the optimization. Default to NULL.

lowQ2

real, the minimum value of Q^2_B to accept the current lambda value. Default to 0.0.

NCORES

integer, the number of cores used. Default to 1.

errorMin

real, not to be used.

verbose

logical, whether to print current results. Default to FALSE.
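The correlation interpretation given for the lambdas argument can be illustrated with a short base R sketch. It only applies the selection rule as stated above to a raw correlation matrix and is not the estimation procedure of the package; the use of absolute correlations, the threshold value 0.5, and the names C, x_selected and y_selected are assumptions made for the illustration.

```r
# Illustrative sketch only: applies the stated thresholding rule by hand.
set.seed(42)
n <- 200; p <- 5; q <- 2
X <- matrix(rnorm(n * p), n, p)
# Only the first covariate carries signal, and only for the first response
Y <- cbind(X[, 1] + 0.1 * rnorm(n), rnorm(n))

C <- abs(cor(X, Y))   # p x q matrix of absolute empirical correlations (assumption)
lambda <- 0.5         # hypothetical threshold chosen for the illustration

# A covariate is kept if at least one of its correlations with a response reaches lambda
x_selected <- apply(C, 1, max) >= lambda
# A response is kept if at least one of its correlations with a covariate reaches lambda
y_selected <- apply(C, 2, max) >= lambda

which(x_selected)
which(y_selected)
```

With lambda = 0 every variable passes the rule, while values close to 1 discard almost everything, which is why a grid of values between these extremes is tested.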

Value

A list of class ddsPLS containing the outputs that describe the built model.

See Also

summary.ddsPLS, plot.ddsPLS, predict.ddsPLS

Examples

# n <- 100 ; d <- 2 ; p <- 20 ; q <- 2
# phi <- matrix(rnorm(n*d),n,d)
# a <- rep(1,p/4) ; b <- rep(1,p/2)
# X <- phi%*%matrix(c(1*a,0*a,0*b,
#                     1*a,3*b,0*a),nrow = d,byrow = TRUE) + matrix(rnorm(n*p),n,p)
# Y <- phi%*%matrix(c(1,0,
#                     0,0),nrow = d,byrow = TRUE) + matrix(rnorm(n*q),n,q)
# model_ddsPLS <- ddsPLS(X,Y,verbose=TRUE)
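As a complement, here is a hedged sketch of the doBoot = FALSE path described in the Arguments section, where no bootstrap is run and one component is built per value in lambdas. The simulated data, the value 0.5 and the name model_fixed are illustrative assumptions, and the call is only executed when the ddsPLS package is installed.

```r
# Illustrative sketch only: simulated data with a single latent variable.
set.seed(123)
n <- 100; p <- 20; q <- 2
phi <- matrix(rnorm(n), n, 1)
# The first p/4 covariates depend on phi, the others are pure noise
X <- phi %*% matrix(rep(c(1, 0), c(p / 4, 3 * p / 4)), nrow = 1) +
  matrix(rnorm(n * p), n, p)
# Only the first response depends on phi
Y <- phi %*% matrix(c(1, 0), nrow = 1) + matrix(rnorm(n * q), n, q)

if (requireNamespace("ddsPLS", quietly = TRUE)) {
  # doBoot = FALSE: n_B is ignored and the number of components
  # equals length(lambdas), here one component (0.5 is a hypothetical value)
  model_fixed <- ddsPLS::ddsPLS(X, Y, lambdas = c(0.5), doBoot = FALSE)
  class(model_fixed)
}
```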


[Package ddsPLS version 1.2.1 Index]