fullProcess {MLGL}    R Documentation

Full process of MLGL

Description

Run hierarchical clustering followed by a group-lasso on all the different partitions, then a hierarchical testing procedure. Only for linear regression problems.

Usage

fullProcess(X, ...)

## Default S3 method:
fullProcess(
  X,
  y,
  control = c("FWER", "FDR"),
  alpha = 0.05,
  test = partialFtest,
  hc = NULL,
  fractionSampleMLGL = 1/2,
  BHclust = 50,
  nCore = NULL,
  addRoot = FALSE,
  Shaffer = FALSE,
  ...
)

## S3 method for class 'formula'
fullProcess(
  formula,
  data,
  control = c("FWER", "FDR"),
  alpha = 0.05,
  test = partialFtest,
  hc = NULL,
  fractionSampleMLGL = 1/2,
  BHclust = 50,
  nCore = NULL,
  addRoot = FALSE,
  Shaffer = FALSE,
  ...
)

Arguments

X

matrix of size n*p

...

Other parameters for MLGL

y

vector of size n.

control

either "FDR" or "FWER"

alpha

control level for testing procedure

test

test used in the testing procedure. Default is partialFtest

hc

output of the hclust function. If not provided, hclust is run with the ward.D2 method. The user can also provide the name of the desired method: "single", "complete", "average", "mcquitty", "ward.D", "ward.D2", "centroid", "median".

fractionSampleMLGL

a real number between 0 and 1: the fraction of individuals to use in the sample for MLGL (see Details).

BHclust

number of replicates for computing the distance matrix for the hierarchical clustering tree

nCore

number of cores used for distance computation. All cores are used by default.

addRoot

If TRUE, add a common root containing all the groups

Shaffer

If TRUE, a Shaffer correction is performed (only if control = "FWER")

formula

an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted.

data

an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula).
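As an illustration of the hc argument, a clustering tree can be built beforehand with base R. This is a minimal sketch: the data are simulated here, and the use of Euclidean distances between the columns of X is an assumption, not necessarily what the package does internally.

```r
# Illustrative data: 50 individuals, 12 variables (simulated, not from the package).
set.seed(42)
X <- matrix(rnorm(50 * 12), nrow = 50, ncol = 12)

# The variables are clustered, so distances are taken between columns of X.
# Assumption: Euclidean distance; another dissimilarity could be used instead.
d <- dist(t(X))
hc <- hclust(d, method = "average")

# The tree should have one leaf per variable.
stopifnot(length(hc$order) == ncol(X))

# The tree could then be supplied as: fullProcess(X, y, hc = hc)
```

According to the description of hc above, passing the method name directly (for example hc = "average") is an alternative to building the tree by hand.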

Details

The n individuals are divided into two samples, and the following three steps are performed: 1) bootstrap hierarchical clustering of the variables of X; 2) MLGL on the second sample of individuals; 3) hierarchical testing procedure on the first sample of individuals.
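The split of individuals can be pictured with the base R sketch below. It is only an illustration: the random draw, the rounding with floor, and the names indMLGL and indTest are assumptions, not the package's internal code.

```r
# Illustrative data: n individuals, p variables (simulated here).
set.seed(42)
n <- 50
p <- 12
X <- matrix(rnorm(n * p), nrow = n, ncol = p)

fractionSampleMLGL <- 1/2

# Hypothetical split: draw the individuals used for MLGL at random and
# keep the remaining ones for the hierarchical testing procedure.
nMLGL <- floor(n * fractionSampleMLGL)
indMLGL <- sort(sample(n, nMLGL))
indTest <- setdiff(seq_len(n), indMLGL)

# Step 2 (MLGL) would use the rows in indMLGL,
# step 3 (hierarchical testing) the rows in indTest.
XMLGL <- X[indMLGL, , drop = FALSE]
XTest <- X[indTest, , drop = FALSE]
```

Keeping the two samples disjoint means the groups are selected and tested on different individuals.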

Value

a list containing:

res

output of MLGL function

lambdaOpt

lambda values maximizing the number of rejections

var

A vector containing the indices of the selected variables for the first lambdaOpt value

group

A vector containing the indices of the selected groups for the first lambdaOpt value

selectedGroups

Selected groups for the first lambdaOpt value

reject

Selected groups for all lambda values

alpha

Control level

test

Test used in the testing procedure

control

"FDR" or "FWER"

time

Elapsed time
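The elements listed above can be read from the returned list with the usual $ operator. The sketch below reuses the simulated data from the Examples section and assumes the MLGL package is installed; the requireNamespace guard simply skips the code otherwise.

```r
if (requireNamespace("MLGL", quietly = TRUE)) {
  library(MLGL)

  # Same simulated data as in the Examples section.
  set.seed(42)
  X <- simuBlockGaussian(50, 12, 5, 0.7)
  y <- X[, c(2, 7, 12)] %*% c(2, 2, -2) + rnorm(50, 0, 0.5)

  res <- fullProcess(X, y)

  # Element names are taken from the Value section above.
  print(res$lambdaOpt)       # lambda value(s) with the most rejections
  print(res$var)             # selected variables for the first lambdaOpt value
  print(res$group)           # associated group indices
  print(res$selectedGroups)  # selected groups for the first lambdaOpt value
  print(res$time)            # elapsed time
}
```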

Author(s)

Quentin Grimonprez

See Also

MLGL, hierarchicalFDR, hierarchicalFWER, selFDR, selFWER

Examples

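# least squares loss
set.seed(42)
# 50 individuals, 12 blocks of 5 variables, within-block correlation 0.7
X <- simuBlockGaussian(50, 12, 5, 0.7)
# the response depends on variables 2, 7 and 12 only, plus Gaussian noise
y <- X[, c(2, 7, 12)] %*% c(2, 2, -2) + rnorm(50, 0, 0.5)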
res <- fullProcess(X, y)
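The formula method documented in Usage could be called along the following lines. This is a hedged sketch: the data frame dat, the formula y ~ . and the choice control = "FDR" are illustrative assumptions, and the code only runs if the MLGL package is installed.

```r
if (requireNamespace("MLGL", quietly = TRUE)) {
  library(MLGL)

  # Same simulated data as in the example above.
  set.seed(42)
  X <- simuBlockGaussian(50, 12, 5, 0.7)
  y <- X[, c(2, 7, 12)] %*% c(2, 2, -2) + rnorm(50, 0, 0.5)

  # Put the response and the predictors in one data frame (illustrative name).
  dat <- data.frame(y = as.numeric(y), X)

  # Formula interface, controlling the FDR instead of the default FWER.
  resFDR <- fullProcess(y ~ ., data = dat, control = "FDR", alpha = 0.05)
  print(resFDR$control)
  print(resFDR$selectedGroups)
}
```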

[Package MLGL version 1.0.0 Index]