LambdaGridRegression {sharp}R Documentation

Grid of penalty parameters (regression model)

Description

Generates a relevant grid of penalty parameter values for penalised regression using the implementation in glmnet.

Usage

LambdaGridRegression(
  xdata,
  ydata,
  tau = 0.5,
  seed = 1,
  family = "gaussian",
  resampling = "subsampling",
  Lambda_cardinal = 100,
  check_input = TRUE,
  ...
)

Arguments

xdata

matrix of predictors with observations as rows and variables as columns.

ydata

optional vector or matrix of outcome(s). If family is set to "binomial" or "multinomial", ydata can be a vector with character/numeric values or a factor.

tau

subsample size. Only used if resampling="subsampling" and cpss=FALSE.

seed

value of the seed to initialise the random number generator and ensure reproducibility of the results (see set.seed).

family

type of regression model. This argument is defined as in glmnet. Possible values include "gaussian" (linear regression), "binomial" (logistic regression), "multinomial" (multinomial regression), and "cox" (survival analysis).

resampling

resampling approach. Possible values are: "subsampling" for sampling without replacement of a proportion tau of the observations, or "bootstrap" for sampling with replacement generating a resampled dataset with as many observations as in the full sample. Alternatively, this argument can be a function to use for resampling. This function must use arguments named data and tau and return the IDs of observations to be included in the resampled dataset.

Lambda_cardinal

number of values in the grid of parameters controlling the level of sparsity in the underlying algorithm.

check_input

logical indicating if input values should be checked (recommended).

...

additional parameters passed to the functions provided in implementation or resampling.

Value

A matrix of lambda values with one column and as many rows as indicated in Lambda_cardinal.

See Also

Other lambda grid functions: LambdaGridGraphical(), LambdaSequence()

Examples

# Data simulation
set.seed(1)
simul <- SimulateRegression(n = 100, pk = 50, family = "gaussian") # simulated data

# Lambda grid for linear regression
Lambda <- LambdaGridRegression(
  xdata = simul$xdata, ydata = simul$ydata,
  family = "gaussian", Lambda_cardinal = 20
)

# Grid can be used in VariableSelection()
stab <- VariableSelection(
  xdata = simul$xdata, ydata = simul$ydata,
  family = "gaussian", Lambda = Lambda
)
print(SelectedVariables(stab))

[Package sharp version 1.4.6 Index]