R: Fit a sparse regression model

model.fit {inferCSN}

R Documentation

Fit a sparse regression model

Description

Computes the regularization path for the specified loss function and penalty function.

Usage

model.fit(
  x,
  y,
  penalty = "L0",
  algorithm = "CD",
  regulators_num = NULL,
  cross_validation = FALSE,
  n_folds = 10,
  seed = 1,
  loss = "SquaredError",
  nLambda = 100,
  nGamma = 5,
  gammaMax = 10,
  gammaMin = 1e-04,
  partialSort = TRUE,
  maxIters = 200,
  rtol = 1e-06,
  atol = 1e-09,
  activeSet = TRUE,
  activeSetNum = 3,
  maxSwaps = 100,
  scaleDownFactor = 0.8,
  screenSize = 1000,
  autoLambda = NULL,
  lambdaGrid = list(),
  excludeFirstK = 0,
  intercept = TRUE,
  lows = -Inf,
  highs = Inf,
  ...
)

Arguments

`x`	The data matrix
`y`	The response vector
`penalty`	The type of regularization, either `L0` or `L0L2`. For high-dimensional, sparse data such as single-cell sequencing data, `L0L2` is more effective.
`algorithm`	The type of algorithm used to minimize the objective function. Currently `CD` and `CDPSI` are supported. The `CDPSI` algorithm may yield better results, but it also increases running time.
`regulators_num`	The number of non-zero coefficients, i.e. the maximum support size at which to terminate the regularization path. This value affects the final performance. Setting it to a small fraction of min(n,p) (e.g. 0.05 * min(n,p)) is recommended, as L0 regularization typically selects a small portion of non-zeros.
`cross_validation`	If TRUE, cross-validation is used.
`n_folds`	The number of folds for cross-validation.
`seed`	The seed used in randomly shuffling the data for cross-validation.
`loss`	The loss function
`nLambda`	The number of Lambda values to select
`nGamma`	The number of Gamma values to select
`gammaMax`	The maximum value of Gamma when using the L0L2 penalty
`gammaMin`	The minimum value of Gamma when using the L0L2 penalty
`partialSort`	If TRUE, partial sorting will be used for sorting the coordinates to do greedy cycling. Otherwise, full sorting is used
`maxIters`	The maximum number of iterations (full cycles) for CD per grid point
`rtol`	The relative tolerance which decides when to terminate optimization (based on the relative change in the objective between iterations)
`atol`	The absolute tolerance which decides when to terminate optimization (based on the absolute L2 norm of the residuals)
`activeSet`	If TRUE, performs active set updates
`activeSetNum`	The number of consecutive times a support should appear before declaring support stabilization
`maxSwaps`	The maximum number of swaps used by CDPSI for each grid point
`scaleDownFactor`	This parameter decides how close the selected Lambda values are
`screenSize`	The number of coordinates to cycle over when performing initial correlation screening
`autoLambda`	Ignored parameter. Kept for backwards compatibility
`lambdaGrid`	A grid of Lambda values to use in computing the regularization path
`excludeFirstK`	This parameter takes non-negative integers
`intercept`	If FALSE, no intercept term is included in the model
`lows`	Lower bounds for coefficients
`highs`	Upper bounds for coefficients
`...`	Parameters for other methods.
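
The recommendation for `regulators_num` (a small fraction of min(n,p)) can be computed from the dimensions of `x` before fitting. The following is a minimal sketch on simulated data. The helper `suggest_regulators_num`, its `fraction` argument, and the simulated matrix are illustrative assumptions and not part of inferCSN. The call to `model.fit` is guarded so that the snippet still runs where the package is not installed.

```r
# Hypothetical helper (not part of inferCSN): derive a support-size cap
# as a small fraction of min(n, p), never smaller than 1.
suggest_regulators_num <- function(x, fraction = 0.05) {
  stopifnot(is.matrix(x), fraction > 0, fraction <= 1)
  max(1L, as.integer(floor(fraction * min(nrow(x), ncol(x)))))
}

# Simulated data (assumption): 200 samples, 60 candidate regulators
set.seed(1)
n <- 200
p <- 60
x <- matrix(rnorm(n * p), nrow = n, ncol = p)

# The response depends on the first three columns only
y <- as.numeric(x[, 1:3] %*% c(2, -1.5, 1) + rnorm(n, sd = 0.1))

# floor(0.05 * min(200, 60)) gives a cap of 3 non-zero coefficients
k <- suggest_regulators_num(x)

# Fit only when inferCSN is available
if (requireNamespace("inferCSN", quietly = TRUE)) {
  library(inferCSN)
  fit <- model.fit(
    x, y,
    penalty = "L0L2",
    algorithm = "CD",
    regulators_num = k
  )
}
```

Rounding down and enforcing a minimum of 1 is a design choice of this sketch. It keeps the cap a valid positive integer even when min(n,p) is small.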

Value

An S3 object describing the regularization path

Examples

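library(inferCSN)

# Load the example matrix provided by the package
data("example_matrix")

# First column is the response; the remaining columns are the predictors
fit <- model.fit(
  x = example_matrix[, -1],
  y = example_matrix[, 1]
)
head(coef(fit))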
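
The cross-validation arguments (`cross_validation`, `n_folds`, `seed`) can be combined with the `L0L2` penalty and the `CDPSI` algorithm. The following is a sketch under stated assumptions: the simulated data, the column names, and the chosen values (5 folds, a support cap of 5) are illustrative only. The structure of the returned object is not documented on this page, so the snippet only prints its class.

```r
# Simulated data (assumption): 100 samples, 20 candidate regulators
set.seed(42)
n <- 100
p <- 20
x <- matrix(rnorm(n * p), nrow = n, ncol = p)
colnames(x) <- paste0("g", seq_len(p))

# The response depends on the first two columns only
y <- as.numeric(x[, 1] - 2 * x[, 2] + rnorm(n, sd = 0.1))

# Fit only when inferCSN is available
if (requireNamespace("inferCSN", quietly = TRUE)) {
  library(inferCSN)
  cv_fit <- model.fit(
    x, y,
    penalty = "L0L2",
    algorithm = "CDPSI",
    regulators_num = 5,
    cross_validation = TRUE,
    n_folds = 5,
    seed = 1
  )
  print(class(cv_fit))
}
```

As the `algorithm` entry notes, `CDPSI` may yield better results at the cost of longer running time, so a small `n_folds` keeps this sketch quick.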

[Package inferCSN version 1.0.5 Index]