rk_subsample {kko} | R Documentation |
Compute selection frequency of rk_fit on subsamples
Description
The function applies rk_fit to subsamples and records the selection results.
Usage
rk_subsample(
X,
y,
X_k,
rfn,
n_stb,
cv_folds,
frac_stb = 1/2,
nCores_para,
rkernel = "laplacian",
rk_scale = 1
)
Arguments
X |
design matrix of additive model; rows are observations and columns are variables. |
y |
response of the additive model. |
X_k |
knockoff matrix of the design; the same size as X. |
rfn |
number of random features in the expansion. |
n_stb |
number of subsamples. |
cv_folds |
number of cross-validation folds for tuning the group lasso. |
frac_stb |
subsample size as a fraction of the sample size. |
nCores_para |
number of cores for parallelizing subsampling. |
rkernel |
kernel choices. Default is "laplacian". Other choices are "cauchy" and "gaussian". |
rk_scale |
scaling parameter of the sampling distribution for the random feature expansion. For the gaussian kernel, it is the standard deviation of the gaussian sampling distribution. |
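To illustrate what rfn and rk_scale control, here is a minimal sketch of a random Fourier feature expansion of a single variable for the gaussian case. It is not taken from the kko source: the helper name rff_expand, the cosine feature form, and the uniform random phases are assumptions made only for illustration, and the package may construct its features differently.

```r
# Hypothetical helper, not part of kko: expand one variable x into rfn random
# Fourier features, with frequencies drawn from N(0, rk_scale^2).
rff_expand <- function(x, rfn, rk_scale = 1) {
  w <- rnorm(rfn, mean = 0, sd = rk_scale)  # random frequencies
  b <- runif(rfn, min = 0, max = 2 * pi)    # random phases (assumed)
  phase <- matrix(b, nrow = length(x), ncol = rfn, byrow = TRUE)
  # n x rfn feature matrix with entries sqrt(2 / rfn) * cos(x_i * w_j + b_j)
  sqrt(2 / rfn) * cos(outer(x, w) + phase)
}

set.seed(1)
Z <- rff_expand(rnorm(100), rfn = 3, rk_scale = 1)
dim(Z)  # one row per observation, one column per random feature
```

Under this sketch, a larger rk_scale draws higher frequencies, so the expanded features vary more rapidly in x.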
Value
a 0/1 matrix indicating selection results. Rows are subsamples and columns are variables. The first half of the columns corresponds to the variables of the design X, and the second half to the knockoffs X_k.
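Selection frequencies can be read off the returned matrix by averaging each column over the subsamples. The sketch below uses a simulated 0/1 matrix as a stand-in for the output of rk_subsample; the object names sel, freq, freq_X and freq_Xk are illustrative and not part of the package.

```r
# Simulated stand-in for the output of rk_subsample():
# n_stb rows (subsamples) and 2 * p columns (variables, then knockoffs).
set.seed(1)
n_stb <- 10
p <- 5
sel <- matrix(rbinom(n_stb * 2 * p, size = 1, prob = 0.5), n_stb, 2 * p)

freq <- colMeans(sel)              # selection frequency of each column
freq_X <- freq[1:p]                # frequencies for the variables of X
freq_Xk <- freq[(p + 1):(2 * p)]   # frequencies for the knockoffs X_k
```

Each frequency lies between 0 and 1; how the package combines the two halves downstream is not described on this page.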
Author(s)
Xiaowu Dai, Xiang Lyu, Lexin Li
Examples
library(kko)
library(knockoff)
p=5 # number of predictors
sig_mag=100 # signal strength
n= 100 # sample size
rkernel="laplacian" # kernel choice
s=2 # sparsity, number of nonzero component functions
rk_scale=1 # scaling parameter of kernel
rfn= 3 # number of random features
cv_folds=15 # folds of cross-validation in group lasso
n_stb=10 # number of subsamples
frac_stb=1/2 # fraction of subsample
nCores_para=2 # number of cores for parallelization
X=matrix(rnorm(n*p),n,p)%*%chol(toeplitz(0.3^(0:(p-1)))) # generate design
X_k = create.second_order(X) # generate knockoff
reg_coef=c(rep(1,s),rep(0,p-s)) # regression coefficient
reg_coef=reg_coef*(2*(rnorm(p)>0)-1)*sig_mag
y=X%*% reg_coef + rnorm(n) # response
rk_subsample(X,y,X_k,rfn,n_stb,cv_folds,frac_stb,nCores_para,rkernel,rk_scale)