stat.forward_selection {knockoff} | R Documentation |
Importance statistics based on forward selection
Description
Computes the statistic
W_j = \max(Z_j, Z_{j+p}) \cdot \mathrm{sgn}(Z_j - Z_{j+p}),
where Z_1,\dots,Z_{2p}
give the reverse order in which the 2p
variables (the originals and the knockoffs) enter the forward selection
model.
See the Details for information about forward selection.
Usage
stat.forward_selection(X, X_k, y, omp = F)
Arguments
X |
n-by-p matrix of original variables. |
X_k |
n-by-p matrix of knockoff variables. |
y |
numeric vector of length n, containing the response variables. |
omp |
whether to use orthogonal matching pursuit (default: F). |
Details
In forward selection, the variables are chosen iteratively to maximize
the inner product with the residual from the previous step. The initial
residual is always y
. In standard forward selection
(stat.forward_selection
), the next residual is the remainder after
regressing on the selected variable; when orthogonal matching pursuit
is used, the next residual is the remainder
after regressing on all the previously selected variables.
Value
A vector of statistics W
of length p.
See Also
Other statistics:
stat.glmnet_coefdiff()
,
stat.glmnet_lambdadiff()
,
stat.lasso_coefdiff_bin()
,
stat.lasso_coefdiff()
,
stat.lasso_lambdadiff_bin()
,
stat.lasso_lambdadiff()
,
stat.random_forest()
,
stat.sqrt_lasso()
,
stat.stability_selection()
Examples
set.seed(2022)
p=100; n=100; k=15
mu = rep(0,p); Sigma = diag(p)
X = matrix(rnorm(n*p),n)
nonzero = sample(p, k)
beta = 3.5 * (1:p %in% nonzero)
y = X %*% beta + rnorm(n)
knockoffs = function(X) create.gaussian(X, mu, Sigma)
# Basic usage with default arguments
result = knockoff.filter(X, y, knockoffs=knockoffs,
statistic=stat.forward_selection)
print(result$selected)
# Advanced usage with custom arguments
foo = stat.forward_selection
k_stat = function(X, X_k, y) foo(X, X_k, y, omp=TRUE)
result = knockoff.filter(X, y, knockoffs=knockoffs, statistic=k_stat)
print(result$selected)