kko {kko}    R Documentation

Variable selection for additive models via KKO

Description

The function applies KKO to compute importance scores for the components of an additive model.

Usage

kko(
  X,
  y,
  X_k,
  rfn_range = c(2, 3, 4),
  n_stb_tune = 50,
  n_stb = 100,
  cv_folds = 10,
  frac_stb = 1/2,
  nCores_para = 4,
  rkernel = c("laplacian", "gaussian", "cauchy"),
  rk_scale = 1
)

Arguments

X

design matrix of the additive model; rows are observations and columns are variables.

y

response of the additive model.

X_k

knockoff matrix of the design; it has the same dimensions as X.

rfn_range

a vector of candidate numbers of random features to tune over.

n_stb_tune

number of subsamples used for tuning the number of random features.

n_stb

number of subsamples used for computing importance scores.

cv_folds

number of cross-validation folds for tuning the group lasso penalty.

frac_stb

size of each subsample, as a fraction of the sample size.

nCores_para

number of cores for parallelizing subsampling.

rkernel

kernel choice. The default is "laplacian"; the other choices are "gaussian" and "cauchy".

rk_scale

scale parameter of the sampling distribution for the random feature expansion. For the Gaussian kernel, it is the standard deviation of the Gaussian sampling distribution.
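To make the roles of rfn_range and rk_scale more concrete, the following is a minimal sketch of a random feature expansion for the Gaussian kernel, written in base R. The helper rff_expand and the cosine feature map are illustrative assumptions and are not taken from the kko source; the package may construct its features differently.

```r
# Hypothetical helper (not part of the kko package): expand one variable
# into `rfn` random Fourier features for a Gaussian kernel.
rff_expand <- function(x, rfn, rk_scale = 1) {
  w <- rnorm(rfn, mean = 0, sd = rk_scale)  # random frequencies; sd plays the role of rk_scale
  b <- runif(rfn, min = 0, max = 2 * pi)    # random phases
  # n x rfn matrix with entries sqrt(2 / rfn) * cos(w_j * x_i + b_j)
  sqrt(2 / rfn) * cos(outer(x, w) + matrix(b, length(x), rfn, byrow = TRUE))
}

set.seed(1)
X <- matrix(rnorm(20 * 3), 20, 3)  # toy design: 20 observations, 3 variables
rfn <- 4                           # one candidate value, as in rfn_range

# Expand every column; the rfn columns that come from one variable form
# one group, which is the unit a group lasso would select or drop.
Z <- do.call(cbind, lapply(seq_len(ncol(X)), function(j) rff_expand(X[, j], rfn)))
group <- rep(seq_len(ncol(X)), each = rfn)
dim(Z)
```

Under this sketch, a larger rfn gives each component function a richer basis at the cost of more coefficients per group, which is why the number of random features is treated as a tuning parameter.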

Value

a list of selection results.

importance_score importance scores of variables for knockoff filtering.
selection_frequency a 0/1 matrix of selection results on subsamples. Rows are subsamples, and columns are variables. The first half of the columns corresponds to the variables of the design X, and the second half to the knockoffs X_k.
rfn_tune the tuned (selected) number of random features.
rfn_range range of random feature numbers.
tune_result a list of tuning results.
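
As an illustration of how a 0/1 selection matrix laid out like selection_frequency can feed a knockoff filter, here is a small base R sketch. The matrix sel is made-up data, the contrast W (selection frequency of a variable minus that of its knockoff) is an assumed choice of statistic, and knockoff_threshold_sketch is a hypothetical helper; whether importance_score is computed this way is not stated here. In practice the knockoff package provides knockoff.threshold for the thresholding step.

```r
# Made-up selection matrix: 4 subsamples (rows); columns are p = 3 variables
# followed by their 3 knockoffs, mirroring the layout of selection_frequency.
sel <- rbind(
  c(1, 1, 0, 0, 0, 0),
  c(1, 1, 0, 0, 0, 1),
  c(1, 0, 1, 0, 0, 0),
  c(1, 1, 0, 0, 0, 0)
)
p <- ncol(sel) / 2

freq <- colMeans(sel)                   # selection frequency of each column
W <- freq[1:p] - freq[(p + 1):(2 * p)]  # assumed contrast: variable minus its knockoff

# Hypothetical helper: smallest t among the nonzero |W| such that
# (offset + #{W <= -t}) / max(1, #{W >= t}) <= fdr; Inf if none qualifies.
knockoff_threshold_sketch <- function(W, fdr = 0.2, offset = 1) {
  ts <- sort(unique(abs(W[W != 0])))
  ok <- vapply(
    ts,
    function(t) (offset + sum(W <= -t)) / max(1, sum(W >= t)) <= fdr,
    logical(1)
  )
  if (any(ok)) min(ts[ok]) else Inf
}

thr <- knockoff_threshold_sketch(W, fdr = 0.5, offset = 0)
selected <- which(W >= thr)  # indices of variables passing the filter
```

With offset = 1 the same rule gives the more conservative "knockoff+" threshold; the toy values above are chosen only to keep the arithmetic easy to follow.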

Author(s)

Xiaowu Dai, Xiang Lyu, Lexin Li

Examples

library(knockoff)
p=4 # number of predictors
sig_mag=100 # signal strength
n= 100 # sample size
rkernel="laplacian" # kernel choice
s=2  # sparsity, number of nonzero component functions
rk_scale=1  # scale parameter of kernel
rfn_range=c(2,3,4)  # number of random features
cv_folds=15  # folds of cross-validation in group lasso
n_stb=10 # number of subsamples for importance scores
n_stb_tune=5 # number of subsamples for tuning random feature number
frac_stb=1/2 # fraction of subsample
nCores_para=2 # number of cores for parallelization
X=matrix(rnorm(n*p),n,p)%*%chol(toeplitz(0.3^(0:(p-1))))   # generate design
X_k = create.second_order(X) # generate knockoff
reg_coef=c(rep(1,s),rep(0,p-s))  # regression coefficient
reg_coef=reg_coef*(2*(rnorm(p)>0)-1)*sig_mag
y=X%*% reg_coef + rnorm(n) # response

kko(X,y,X_k,rfn_range,n_stb_tune,n_stb,cv_folds,frac_stb,nCores_para,rkernel,rk_scale)




[Package kko version 1.0.1 Index]