stat.glmnet_lambdadiff {knockoff} | R Documentation |
Importance statistics based on a GLM
Description
Fits a generalized linear model via penalized maximum likelihood and computes the difference statistic
where and
are the maximum values of the
regularization parameter
at which the jth variable
and its knockoff enter the model, respectively.
Usage
stat.glmnet_lambdadiff(X, X_k, y, family = "gaussian", ...)
Arguments
X |
n-by-p matrix of original variables. |
X_k |
n-by-p matrix of knockoff variables. |
y |
vector of length n, containing the response variables. Quantitative for family="gaussian", or family="poisson" (non-negative counts). For family="binomial" should be either a factor with two levels, or a two-column matrix of counts or proportions (the second column is treated as the target class; for a factor, the last level in alphabetical order is the target class). For family="multinomial", can be a nc>=2 level factor, or a matrix with nc columns of counts or proportions. For either "binomial" or "multinomial", if y is presented as a vector, it will be coerced into a factor. For family="cox", y should be a two-column matrix with columns named 'time' and 'status'. The latter is a binary variable, with '1' indicating death, and '0' indicating right censored. The function Surv() in package survival produces such a matrix. For family="mgaussian", y is a matrix of quantitative responses. |
family |
response type (see above). |
... |
additional arguments specific to |
Details
This function uses glmnet
to compute the regularization path
on a fine grid of 's.
The nlambda
parameter can be used to control the granularity of the
grid of 's. The default value of
nlambda
is 500
.
If the family is 'binomial' and a lambda sequence is not provided by the user, this function generates it on a log-linear scale before calling 'glmnet'.
The default response family is 'gaussian', for a linear regression model. Different response families (e.g. 'binomial') can be specified by passing an optional parameter 'family'.
For a complete list of the available additional arguments, see glmnet
.
Value
A vector of statistics of length p.
See Also
Other statistics:
stat.forward_selection()
,
stat.glmnet_coefdiff()
,
stat.lasso_coefdiff_bin()
,
stat.lasso_coefdiff()
,
stat.lasso_lambdadiff_bin()
,
stat.lasso_lambdadiff()
,
stat.random_forest()
,
stat.sqrt_lasso()
,
stat.stability_selection()
Examples
set.seed(2022)
p=200; n=100; k=15
mu = rep(0,p); Sigma = diag(p)
X = matrix(rnorm(n*p),n)
nonzero = sample(p, k)
beta = 3.5 * (1:p %in% nonzero)
y = X %*% beta + rnorm(n)
knockoffs = function(X) create.gaussian(X, mu, Sigma)
# Basic usage with default arguments
result = knockoff.filter(X, y, knockoffs=knockoffs,
statistic=stat.glmnet_lambdadiff)
print(result$selected)
# Advanced usage with custom arguments
foo = stat.glmnet_lambdadiff
k_stat = function(X, X_k, y) foo(X, X_k, y, nlambda=200)
result = knockoff.filter(X, y, knockoffs=knockoffs, statistic=k_stat)
print(result$selected)