R: Corrected Lasso

corrected_lasso {hdme}

R Documentation

Corrected Lasso

Description

Lasso (L1-regularization) for generalized linear models with measurement error.

Usage

corrected_lasso(
  W,
  y,
  sigmaUU,
  family = c("gaussian", "binomial", "poisson"),
  radii = NULL,
  no_radii = NULL,
  alpha = 0.1,
  maxits = 5000,
  tol = 1e-12
)

Arguments

`W`	Design matrix, measured with error. Must be a numeric matrix.
`y`	Vector of responses.
`sigmaUU`	Covariance matrix of the measurement error.
`family`	Response type. Character string of length 1. Possible values are "gaussian", "binomial" and "poisson".
`radii`	Vector containing the set of radii of the l1-ball onto which the solution is projected. If not provided, the algorithm will select an evenly spaced vector of 20 radii.
`no_radii`	Length of vector radii, i.e., the number of regularization parameters to fit the corrected lasso for.
`alpha`	Step size of the projected gradient descent algorithm. Default is 0.1.
`maxits`	Maximum number of iterations of the project gradient descent algorithm for each radius. Default is 5000.
`tol`	Iteration tolerance for change in sum of squares of beta. Defaults . to 1e-12.

Details

Corrected version of the lasso for generalized linear models. The method does require an estimate of the measurement error covariance matrix. The Poisson regression option might sensitive to numerical overflow, please file a GitHub issue in the source repository if you experience this.

Value

An object of class "corrected_lasso".

References

Loh P, Wainwright MJ (2012). “High-dimensional regression with noisy and missing data: Provable guarantees with nonconvexity.” Ann. Statist., 40(3), 1637–1664.

Sorensen O, Frigessi A, Thoresen M (2015). “Measurement error in lasso: Impact and likelihood bias correction.” Statistica Sinica, 25(2), 809-829.

Examples

# Example with linear regression
# Number of samples
n <- 100
# Number of covariates
p <- 50
# True (latent) variables
X <- matrix(rnorm(n * p), nrow = n)
# Measurement error covariance matrix
# (typically estimated by replicate measurements)
sigmaUU <- diag(x = 0.2, nrow = p, ncol = p)
# Measurement matrix (this is the one we observe)
W <- X + rnorm(n, sd = sqrt(diag(sigmaUU)))
# Coefficient
beta <- c(seq(from = 0.1, to = 1, length.out = 5), rep(0, p-5))
# Response
y <- X %*% beta + rnorm(n, sd = 1)
# Run the corrected lasso
fit <- corrected_lasso(W, y, sigmaUU, family = "gaussian")
coef(fit)
plot(fit)
plot(fit, type = "path")

# Binomial, logistic regression
# Number of samples
n <- 1000
# Number of covariates
p <- 50
# True (latent) variables
X <- matrix(rnorm(n * p), nrow = n)
# Measurement error covariance matrix
sigmaUU <- diag(x = 0.2, nrow = p, ncol = p)
# Measurement matrix (this is the one we observe)
W <- X + rnorm(n, sd = sqrt(diag(sigmaUU)))
# Response
y <- rbinom(n, size = 1, prob = plogis(X %*% c(rep(5, 5), rep(0, p-5))))
fit <- corrected_lasso(W, y, sigmaUU, family = "binomial")
plot(fit)
coef(fit)

[Package hdme version 0.6.0 Index]