stat.lasso_coefdiff {knockoff}    R Documentation
Importance statistics based on the lasso with cross-validation
Description
Fits a linear regression model via penalized maximum likelihood and cross-validation, then computes the difference statistic

W_j = |Z_j| - |\tilde{Z}_j|

where Z_j and \tilde{Z}_j are the coefficient estimates for the j-th variable and its knockoff, respectively. The value of the regularization parameter \lambda is selected by cross-validation and computed with glmnet.
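The following is a minimal sketch, not the package's internal code, of how this statistic can be reproduced directly with cv.glmnet from the glmnet package. It assumes X and y are defined as in the Examples below and that X_k is a matrix of knockoff copies (e.g., from create.gaussian):

library(glmnet)
# Fit the lasso on the augmented design [X, X_k] with 10-fold cross-validation
fit <- cv.glmnet(cbind(X, X_k), y, family = "gaussian", nfolds = 10)
Z <- as.vector(coef(fit, s = "lambda.min"))[-1]   # drop the intercept
p <- ncol(X)
W <- abs(Z[1:p]) - abs(Z[(p + 1):(2 * p)])        # W_j = |Z_j| - |Z~_j|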
Usage
stat.lasso_coefdiff(X, X_k, y, cores = 2, ...)
Arguments
X
    n-by-p matrix of original variables.

X_k
    n-by-p matrix of knockoff variables.

y
    vector of length n, containing the response variables. It should be numeric.

cores
    Number of cores used to compute the statistics by running cv.glmnet. If not specified, the number of cores is set to approximately half of the number of cores detected by the parallel package (see the sketch after this list).

...
    additional arguments specific to glmnet or cv.glmnet (see Details).
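As a rough illustration of that default (the exact rule used inside the package may differ), a "roughly half of the detected cores" choice can be computed with the parallel package as:

ncores <- max(1, floor(parallel::detectCores() / 2))   # illustrative default, about half of the detected cores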
Details
This function uses the glmnet package to fit the lasso path and is a wrapper around the more general stat.glmnet_coefdiff.
The statistics W_j are constructed by taking the difference between the absolute values of the coefficient estimates of the j-th variable and its knockoff.
By default, the value of the regularization parameter is chosen by 10-fold cross-validation.
The optional nlambda parameter can be used to control the granularity of the grid of \lambda values; its default value is 500.
Unless a lambda sequence is provided by the user, this function generates it on a log-linear scale before calling glmnet.
For a complete list of the available additional arguments, see cv.glmnet and glmnet.
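For intuition, one way to build a log-linearly spaced grid of \lambda values is sketched below; the endpoints (a data-driven lambda_max and a lower endpoint set to a small fraction of it) are illustrative assumptions, not necessarily the values used internally. It assumes X, X_k and y are already defined:

n <- nrow(X); p <- ncol(X)
nlambda <- 500
lambda_max <- max(abs(t(cbind(X, X_k)) %*% y)) / n   # assumed upper endpoint of the grid
lambda_min <- lambda_max / 2e3                       # assumed lower endpoint of the grid
lambda <- exp(seq(log(lambda_max), log(lambda_min), length.out = nlambda))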
Value
A vector of statistics W of length p.
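Although the thresholding step is not part of this function, statistics such as W are typically passed to knockoff.threshold to select variables; a minimal sketch (the target FDR level and offset below are arbitrary choices):

W <- stat.lasso_coefdiff(X, X_k, y)                      # X_k produced by, e.g., create.gaussian
thres <- knockoff.threshold(W, fdr = 0.10, offset = 1)
selected <- which(W >= thres)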
See Also
Other statistics: stat.forward_selection(), stat.glmnet_coefdiff(), stat.glmnet_lambdadiff(), stat.lasso_coefdiff_bin(), stat.lasso_lambdadiff_bin(), stat.lasso_lambdadiff(), stat.random_forest(), stat.sqrt_lasso(), stat.stability_selection()
Examples
library(knockoff)
set.seed(2022)
p=200; n=100; k=15
mu = rep(0,p); Sigma = diag(p)
X = matrix(rnorm(n*p),n)
nonzero = sample(p, k)
beta = 3.5 * (1:p %in% nonzero)
y = X %*% beta + rnorm(n)
knockoffs = function(X) create.gaussian(X, mu, Sigma)
# Basic usage with default arguments
result = knockoff.filter(X, y, knockoffs=knockoffs,
statistic=stat.lasso_coefdiff)
print(result$selected)
# Advanced usage with custom arguments
foo = stat.lasso_coefdiff
k_stat = function(X, X_k, y) foo(X, X_k, y, nlambda=200)
result = knockoff.filter(X, y, knockoffs=knockoffs, statistic=k_stat)
print(result$selected)