irc_regression {IRCcheck} | R Documentation |
Irrepresentable Condition: Regression
Description
Check the irrepresentable condition (IRC) in multiple regression, following Equation (2) in Zhao and Yu (2006).
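For orientation, the condition can be written out as below. The notation is an assumption introduced here, not taken from this help page: C is the sample covariance (or correlation) matrix of X, partitioned into the active set (subscript 1, given by which_nonzero) and the remaining variables (subscript 2), and eta is a positive constant vector. Because irc_regression takes no coefficient signs, the returned value presumably corresponds to the sign-free bound in the second line. This is a reading of the Value section, not a statement about the package internals.

```latex
% Strong irrepresentable condition (element-wise inequality)
\left| C_{21}\, C_{11}^{-1}\, \operatorname{sign}\!\left(\beta_{(1)}\right) \right| \le \mathbf{1} - \eta

% Sign-free sufficient version: the matrix infinity norm (maximum absolute
% row sum) bounds the left-hand side for every possible sign pattern
\left\| C_{21}\, C_{11}^{-1} \right\|_{\infty} < 1
```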
Usage
irc_regression(X, which_nonzero)
Arguments
X | A matrix of dimensions n (observations) by p (variables). |
which_nonzero | Numeric vector with the locations (column indices of X) of the nonzero relations (a.k.a. the active set). |
Value
The infinity norm. A value greater than 1 indicates that the IRC is violated.
Note
It is common to report 1 minus the infinity norm, so that a negative value indicates that the IRC is violated.
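The relationship between the returned norm and the "1 minus the norm" convention can be sketched in base R. The helper irc_sketch below is hypothetical and is not the package's implementation. It assumes the check is the matrix infinity norm of C21 %*% solve(C11) computed from standardized columns, and the simulated predictors are made up for illustration.

```r
# Hypothetical helper, not part of IRCcheck: assumes the check is the matrix
# infinity norm of C21 %*% solve(C11), computed from standardized columns.
irc_sketch <- function(X, which_nonzero) {
  Z <- scale(X)                          # center and scale each column
  C <- crossprod(Z) / (nrow(Z) - 1)      # sample correlation matrix
  # active-by-active block
  C11 <- C[which_nonzero, which_nonzero, drop = FALSE]
  # inactive-by-active block
  C21 <- C[-which_nonzero, which_nonzero, drop = FALSE]
  # type = "I" is the infinity norm: the maximum absolute row sum
  norm(C21 %*% solve(C11), type = "I")
}

set.seed(1)
n <- 1000
x1 <- rnorm(n)
x2 <- rnorm(n)

# inactive predictor that is nearly a linear combination of the active ones
x3_dependent <- 0.8 * x1 + 0.8 * x2 + rnorm(n, sd = 0.3)
# inactive predictor unrelated to the active ones
x3_independent <- rnorm(n)

irc_violated <- irc_sketch(cbind(x1, x2, x3_dependent), which_nonzero = 1:2)
irc_met <- irc_sketch(cbind(x1, x2, x3_independent), which_nonzero = 1:2)

# 1 - norm: negative when the IRC is violated, positive when it is met
margin_violated <- 1 - irc_violated
margin_met <- 1 - irc_met
```

In this sketch the first design should give a norm above 1 (negative margin), because the inactive predictor is almost fully explained by the two active ones. The second design should give a norm well below 1.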
References
Zhao P, Yu B (2006). “On Model Selection Consistency of Lasso.” The Journal of Machine Learning Research, 7, 2541–2563. ISSN 15324435, doi: 10.1109/TIT.2006.883611, 1305.7477, https://doi.org/10.1109/TIT.2006.883611.
Examples
# data: block-diagonal correlation matrix with the first 10 variables active,
# so the active and inactive predictors are uncorrelated and the IRC is met
cors <- rbind(
cbind(matrix(.7, 10,10), matrix(0, 10,10)),
cbind(matrix(0, 10,10), matrix(0.7, 10,10))
)
diag(cors) <- 1
X <- MASS::mvrnorm(2500, rep(0, 20), Sigma = cors, empirical = TRUE)
# check IRC
irc_regression(X, which_nonzero = 1:10)
# generate the response from the 10 active predictors
y <- X %*% c(rep(1,10), rep(0, 10)) + rnorm(2500)
fit <- glmnet::glmnet(X, y, lambda = seq(10, 0.01, length.out = 400))
# plot
plot(fit, xvar = "lambda")
# Example (more or less) from Zhao and Yu (2006)
# section 3.3
# number of predictors
p <- 2^4
# number active (q in Zhao and Yu 2006)
n_beta <- 4/8 * p
# betas
beta <- c(rep(1, n_beta), rep(0, p - n_beta))
# storage for the infinity norm from each of the 100 replications
check <- numeric(100)
for(i in 1:100){
cors <- cov2cor(
solve(
rWishart(1, p , diag(p))[,,1]
))
# predictors
X <- MASS::mvrnorm(500, rep(0, p), Sigma = cors, empirical = TRUE)
check[i] <- irc_regression(X, which_nonzero = which(beta != 0))
}
# proportion of replications in which the IRC is met (norm less than 1)
mean(check < 1)
# equivalently, 1 - norm greater than 0
mean(1 - check > 0)
[Package IRCcheck version 1.0.0 Index]