stat.lasso_lambdadiff {knockoff}

R Documentation

Importance statistics based on the lasso

Description

Fits the lasso path and computes the difference statistic

W_j = Z_j - \tilde{Z}_j

where Z_j and \tilde{Z}_j are the maximum values of the regularization parameter \lambda at which the jth variable and its knockoff enter the penalized linear regression model, respectively.
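The definition above can be illustrated with a small base-R sketch. The coefficient path below is made-up toy data, not output from glmnet, and `entry_lambda` is a hypothetical helper introduced only for this illustration. It also assumes that a variable which never enters the model gets an entry value of 0.

```r
# Toy coefficient path (made-up values): rows are variables, ordered as
# p originals followed by their p knockoffs; columns follow a decreasing
# lambda grid.
lambda <- c(1.0, 0.5, 0.25, 0.1)
p <- 2
coefs <- rbind(
  c(0, 0.3, 0.6, 0.8),  # original 1: first nonzero at lambda = 0.5
  c(0, 0,   0,   0.1),  # original 2: first nonzero at lambda = 0.1
  c(0, 0,   0.2, 0.4),  # knockoff 1: first nonzero at lambda = 0.25
  c(0, 0,   0,   0)     # knockoff 2: never enters the model
)

# Hypothetical helper: largest lambda at which each variable has a
# nonzero coefficient (assumed to be 0 if the variable never enters).
entry_lambda <- function(coefs, lambda) {
  apply(coefs, 1, function(b) {
    active <- which(b != 0)
    if (length(active) == 0) 0 else max(lambda[active])
  })
}

Z <- entry_lambda(coefs, lambda)

# W_j = Z_j - Z~_j: originals minus their knockoffs.
W <- Z[1:p] - Z[(p + 1):(2 * p)]
print(W)
```

A positive W_j means the original variable entered the path at a larger \lambda than its knockoff, which is evidence that the variable matters.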

Usage

stat.lasso_lambdadiff(X, X_k, y, ...)

Arguments

`X`	n-by-p matrix of original variables.
`X_k`	n-by-p matrix of knockoff variables.
`y`	numeric vector of length n containing the response values.
`...`	additional arguments specific to `glmnet` (see Details).

Details

This function uses glmnet to compute the lasso path on a fine grid of \lambda's and is a wrapper around the more general stat.glmnet_lambdadiff.

The nlambda parameter controls the granularity of the grid of \lambda's; its default value is 500.

Unless the user provides a lambda sequence, this function generates one on a log-linear scale before calling glmnet.
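As a rough illustration of what a log-linear grid looks like, the base-R sketch below builds `nlambda` values equally spaced on the log scale. The endpoints are assumptions made for this example: `lambda_max` is taken as the smallest penalty at which all lasso coefficients are zero for standardized predictors, and the ratio `lambda_max / lambda_min = 2000` is arbitrary. The package's own grid may differ in both respects.

```r
set.seed(1)
n <- 50; p <- 10
X <- scale(matrix(rnorm(n * p), n))  # standardized predictors (toy data)
y <- rnorm(n)

nlambda <- 500

# Assumed endpoints for the grid (illustrative choices only).
lambda_max <- max(abs(crossprod(X, y))) / n
lambda_min <- lambda_max / 2000

# Equally spaced on the log scale, from largest to smallest.
lambda <- exp(seq(log(lambda_max), log(lambda_min), length.out = nlambda))

range(lambda)
```

A sequence built this way could then be passed through the `...` argument as `lambda = lambda`, in which case no grid is generated internally.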

For a complete list of the available additional arguments, see glmnet or lars.

Value

A vector of statistics W of length p.
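The returned vector can also be used directly, without going through knockoff.filter. The sketch below assumes the knockoff package (with glmnet) is installed, simulates toy data, and applies `knockoff.threshold` with a target FDR of 0.1. The sample sizes, signal strength and FDR level are arbitrary choices for illustration.

```r
library(knockoff)

set.seed(2022)
n <- 100; p <- 50

# Toy data: the first 5 variables carry signal, the rest are null.
X <- matrix(rnorm(n * p), n)
beta <- c(rep(3.5, 5), rep(0, p - 5))
y <- as.numeric(X %*% beta + rnorm(n))

# Gaussian knockoffs under the model used to simulate X.
X_k <- create.gaussian(X, rep(0, p), diag(p))

# One statistic per original variable.
W <- stat.lasso_lambdadiff(X, X_k, y)

# Data-dependent threshold for a target FDR of 0.1, then selection.
thres <- knockoff.threshold(W, fdr = 0.1, offset = 1)
selected <- which(W >= thres)
print(selected)
```

Variables with large positive W_j are the ones selected; negative values indicate that the knockoff entered the path before the original.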

Examples

library(knockoff)

set.seed(2022)
p=200; n=100; k=15
mu = rep(0,p); Sigma = diag(p)
X = matrix(rnorm(n*p),n)
nonzero = sample(p, k)
beta = 3.5 * (1:p %in% nonzero)
y = X %*% beta + rnorm(n)
knockoffs = function(X) create.gaussian(X, mu, Sigma)

# Basic usage with default arguments
result = knockoff.filter(X, y, knockoffs=knockoffs, 
                           statistic=stat.lasso_lambdadiff)
print(result$selected)

# Advanced usage with custom arguments
foo = stat.lasso_lambdadiff
k_stat = function(X, X_k, y) foo(X, X_k, y, nlambda=200)
result = knockoff.filter(X, y, knockoffs=knockoffs, statistic=k_stat)
print(result$selected)

[Package knockoff version 0.3.6 Index]