model.fit {inferCSN}    R Documentation

Fit a sparse regression model

Description

Computes the regularization path for the specified loss function and penalty function.

Usage

model.fit(
  x,
  y,
  penalty = "L0",
  algorithm = "CD",
  regulators_num = NULL,
  cross_validation = FALSE,
  n_folds = 10,
  seed = 1,
  loss = "SquaredError",
  nLambda = 100,
  nGamma = 5,
  gammaMax = 10,
  gammaMin = 1e-04,
  partialSort = TRUE,
  maxIters = 200,
  rtol = 1e-06,
  atol = 1e-09,
  activeSet = TRUE,
  activeSetNum = 3,
  maxSwaps = 100,
  scaleDownFactor = 0.8,
  screenSize = 1000,
  autoLambda = NULL,
  lambdaGrid = list(),
  excludeFirstK = 0,
  intercept = TRUE,
  lows = -Inf,
  highs = Inf,
  ...
)

Arguments

x

The data matrix

y

The response vector

penalty

The type of regularization, either "L0" or "L0L2". For high-dimensional, sparse data such as single-cell sequencing data, L0L2 is more effective.
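
As a rough illustration of what the two penalties measure, the sketch below evaluates a penalized squared-error objective for a fixed coefficient vector. The helper penalized_objective is hypothetical and not part of inferCSN, and the 0.5 scaling of the loss is an assumption; the package's internal objective may be scaled differently.

```r
# Hypothetical helper, not part of inferCSN: evaluates an L0- or
# L0L2-penalized squared-error objective for a given coefficient vector.
penalized_objective <- function(x, y, beta, lambda, gamma = 0, intercept = 0) {
  residual <- y - intercept - as.vector(x %*% beta)
  loss <- 0.5 * sum(residual^2)   # squared-error loss (0.5 scaling is an assumption)
  l0 <- lambda * sum(beta != 0)   # L0 term: counts the non-zero coefficients
  l2 <- gamma * sum(beta^2)       # L2 term; gamma = 0 leaves a pure L0 penalty
  loss + l0 + l2
}

x <- matrix(c(1, 0,
              0, 1,
              1, 1), nrow = 3, byrow = TRUE)
y <- c(1, 2, 3)
beta <- c(1, 0)

obj_l0 <- penalized_objective(x, y, beta, lambda = 0.5)
obj_l0l2 <- penalized_objective(x, y, beta, lambda = 0.5, gamma = 0.1)
```

With gamma = 0 the L2 term vanishes, so the same helper covers both penalty choices.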

algorithm

The algorithm used to minimize the objective function. "CD" and "CDPSI" are currently supported. CDPSI may yield better results at the cost of a longer running time.

regulators_num

The maximum number of non-zero coefficients, that is, the maximum support size at which the regularization path terminates. This value affects the final performance. A small fraction of min(n, p), e.g. 0.05 * min(n, p), is recommended, as L0 regularization typically selects a small portion of non-zeros.
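
The rule of thumb above can be computed directly from the dimensions of x. This is a minimal sketch with a placeholder matrix; rounding down and enforcing a minimum of 1 are assumptions made here so that the result is a usable integer.

```r
# Placeholder data matrix: 200 samples (rows) and 50 candidate regulators (columns).
x <- matrix(0, nrow = 200, ncol = 50)
n <- nrow(x)
p <- ncol(x)

# 5% of min(n, p), rounded down, but never below 1 (both choices are assumptions).
regulators_num <- max(1, floor(0.05 * min(n, p)))
regulators_num
```

The resulting value could then be passed as the regulators_num argument of model.fit.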

cross_validation

If TRUE, cross-validation is used.

n_folds

The number of folds for cross-validation.

seed

The seed used in randomly shuffling the data for cross-validation.
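
To show why a fixed seed makes the fold assignment reproducible, here is a minimal sketch of seeded shuffling in base R. The helper assign_folds is hypothetical and only illustrates the idea; it is not the code the package runs internally.

```r
# Hypothetical helper: assign each of n observations to one of n_folds folds.
assign_folds <- function(n, n_folds = 10, seed = 1) {
  set.seed(seed)                                       # note: resets the global RNG state
  labels <- rep(seq_len(n_folds), length.out = n)      # near-equal fold sizes
  sample(labels)                                       # shuffle the fold labels
}

folds_a <- assign_folds(25, n_folds = 5, seed = 1)
folds_b <- assign_folds(25, n_folds = 5, seed = 1)
identical(folds_a, folds_b)  # same seed, same assignment
table(folds_a)               # each fold holds 5 observations
```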

loss

The loss function. The default is "SquaredError".

nLambda

The number of Lambda values to select

nGamma

The number of Gamma values to select

gammaMax

The maximum value of Gamma when using the L0L2 penalty

gammaMin

The minimum value of Gamma when using the L0L2 penalty
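
The three Gamma arguments together describe a grid of nGamma values between gammaMax and gammaMin. One common way to build such a grid is logarithmic spacing, sketched below with the default values; that the package spaces its grid this way is an assumption, not something this page states.

```r
nGamma <- 5
gammaMax <- 10
gammaMin <- 1e-04

# Log-spaced grid from gammaMax down to gammaMin (the spacing is an assumption).
gamma_grid <- exp(seq(log(gammaMax), log(gammaMin), length.out = nGamma))
gamma_grid
```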

partialSort

If TRUE, partial sorting will be used for sorting the coordinates to do greedy cycling. Otherwise, full sorting is used

maxIters

The maximum number of iterations (full cycles) for CD per grid point

rtol

The relative tolerance which decides when to terminate optimization (based on the relative change in the objective between iterations)

atol

The absolute tolerance which decides when to terminate optimization (based on the absolute L2 norm of the residuals)
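
The two tolerances can be read as a pair of stopping tests. The sketch below is a hypothetical helper, not the package's internal check: it stops when either the relative change in the objective falls below rtol or the L2 norm of the residuals falls below atol. How the package combines the two tests is an assumption.

```r
# Hypothetical stopping test built from the rtol and atol descriptions.
has_converged <- function(obj_prev, obj_curr, residuals,
                          rtol = 1e-06, atol = 1e-09) {
  # Relative change in the objective; the denominator is guarded against zero.
  rel_change <- abs(obj_prev - obj_curr) / max(abs(obj_prev), .Machine$double.eps)
  # Absolute L2 norm of the residuals.
  resid_norm <- sqrt(sum(residuals^2))
  rel_change <= rtol || resid_norm <= atol
}

tiny_change <- has_converged(10, 10 - 1e-08, residuals = c(1, 1))  # stops on rtol
large_change <- has_converged(10, 9, residuals = c(1, 1))          # keeps iterating
zero_resid <- has_converged(10, 9, residuals = c(0, 0))            # stops on atol
```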

activeSet

If TRUE, performs active set updates

activeSetNum

The number of consecutive times a support should appear before declaring support stabilization
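
A minimal sketch of this stabilization rule, assuming the support (the set of non-zero coefficient indices) is recorded after every pass. The helper support_stabilized is hypothetical and only illustrates the counting logic.

```r
# Hypothetical helper: TRUE once the last activeSetNum supports are identical.
support_stabilized <- function(supports, activeSetNum = 3) {
  # supports: list of integer index vectors, one per pass, most recent last
  if (length(supports) < activeSetNum) {
    return(FALSE)
  }
  recent <- tail(supports, activeSetNum)
  all(vapply(recent, function(s) setequal(s, recent[[1]]), logical(1)))
}

history <- list(c(1, 2), c(1, 3), c(1, 3))
not_yet <- support_stabilized(history)                        # only two repeats so far
stable <- support_stabilized(c(history, list(c(1, 3))))       # three in a row
```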

maxSwaps

The maximum number of swaps used by CDPSI for each grid point

scaleDownFactor

This parameter decides how close the selected Lambda values are

screenSize

The number of coordinates to cycle over when performing initial correlation screening

autoLambda

Ignored parameter. Kept for backwards compatibility

lambdaGrid

A grid of Lambda values to use in computing the regularization path

excludeFirstK

Must be a non-negative integer.

intercept

If FALSE, no intercept term is included in the model

lows

Lower bounds for coefficients

highs

Upper bounds for coefficients
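
To illustrate what the box constraints mean, the sketch below clips a coefficient vector to the interval [lows, highs]. The helper clip_to_bounds is hypothetical; it shows the constraint itself, not how the solver enforces it during optimization.

```r
# Hypothetical helper: force every coefficient into [lows, highs].
clip_to_bounds <- function(beta, lows = -Inf, highs = Inf) {
  pmin(pmax(beta, lows), highs)
}

beta <- c(-2, 0.5, 3)
unbounded <- clip_to_bounds(beta)                    # defaults leave beta unchanged
bounded <- clip_to_bounds(beta, lows = 0, highs = 1) # values pushed into [0, 1]
```

Setting lows = 0, for example, restricts the fit to non-negative coefficients.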

...

Parameters for other methods.

Value

An S3 object describing the regularization path

Examples

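library(inferCSN)
data("example_matrix")
# Regress the first column (response) on all remaining columns (predictors)
fit <- model.fit(
  x = example_matrix[, -1],
  y = example_matrix[, 1]
)
# Inspect the first fitted coefficients
head(coef(fit))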

[Package inferCSN version 1.0.5 Index]