KPCRKHS_VS {KPC}    R Documentation

Variable selection with RKHS estimator

Description

Performs forward stepwise variable selection using RKHS estimators.

Usage

KPCRKHS_VS(
  Y,
  X,
  num_features,
  ky = kernlab::rbfdot(1/(2 * stats::median(stats::dist(Y))^2)),
  kS = NULL,
  eps = 0.001,
  appro = FALSE,
  tol = 1e-05,
  numCores = parallel::detectCores(),
  verbose = FALSE
)

Arguments

Y

a matrix of responses (n by dy)

X

a matrix of predictors (n by dx)

num_features

the number of variables to select; it cannot be larger than dx.

ky

a function k(y, y') of class kernel. It can be a kernel implemented in kernlab, e.g., the Gaussian kernel rbfdot(sigma = 1) or the linear kernel vanilladot().
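As an illustration of what can be passed as ky, the following sketch builds kernel objects with kernlab. The bandwidth uses the same expression as the default in the Usage section; the simulated Y and the variable names (sigma_y, ky_gauss, ky_linear) are assumptions made for this example only.

```r
library(kernlab)

set.seed(1)
Y <- matrix(rnorm(50), ncol = 1)   # simulated responses, n = 50, dy = 1

# Bandwidth from the median pairwise distance, as in the default value of ky
sigma_y <- 1 / (2 * stats::median(stats::dist(Y))^2)
ky_gauss <- rbfdot(sigma = sigma_y)   # Gaussian kernel
ky_linear <- vanilladot()             # linear kernel

# Both objects are of class "kernel"
methods::is(ky_gauss, "kernel")
methods::is(ky_linear, "kernel")

# A kernel object can be evaluated on a pair of vectors ...
ky_gauss(Y[1, ], Y[2, ])

# ... or used to build the full n x n Gram matrix
K <- kernelMatrix(ky_gauss, Y)
dim(K)
```

Either ky_gauss or ky_linear could then be supplied as the ky argument.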

kS

a function that takes X and a subset of indices S as inputs and returns the kernel for X_S. The first argument of kS is X, and the second argument is a vector of positive integers. If kS is NULL, a Gaussian kernel with the empirical bandwidth kernlab::rbfdot(1/(2*stats::median(stats::dist(X[,S]))^2)) will be used.
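A minimal sketch of a user-supplied kS is given below. The helper name kS_median is hypothetical and not part of KPC; it reproduces the bandwidth expression described above, with drop = FALSE added so that X[, S] stays a matrix when S has length one.

```r
library(kernlab)

# Hypothetical helper (not part of KPC): Gaussian kernel on the columns in S,
# with bandwidth taken from the median pairwise distance of X[, S].
kS_median <- function(X, S) {
  XS <- X[, S, drop = FALSE]              # keep matrix shape for a single column
  med <- stats::median(stats::dist(XS))
  rbfdot(sigma = 1 / (2 * med^2))
}

set.seed(1)
X <- matrix(rnorm(100 * 4), ncol = 4)

# Kernel for the subset S = {1, 2} and its Gram matrix
k12 <- kS_median(X, c(1, 2))
K12 <- kernelMatrix(k12, X[, c(1, 2)])
dim(K12)
```

A function of this shape (X first, index vector second, kernel object returned) is what the kS argument expects.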

eps

a positive number; the regularization parameter for the RKHS estimator

appro

whether to use incomplete Cholesky decomposition for approximation

tol

tolerance used for incomplete Cholesky decomposition (inchol in package kernlab)
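To give a sense of what the approximation does, the sketch below calls kernlab::inchol directly and compares the resulting low-rank factor with the exact Gram matrix. It illustrates inchol only; how KPCRKHS_VS uses the factor internally is not shown here, and the data and bandwidth are assumptions for the example.

```r
library(kernlab)

set.seed(1)
X <- matrix(rnorm(200 * 3), ncol = 3)
k <- rbfdot(sigma = 0.5)

# Incomplete Cholesky factor Z (n rows, at most n columns) with Z Z^T ~ K
Z <- inchol(X, kernel = k, tol = 1e-5)
dim(Z)

# Reconstruct the approximate Gram matrix and compare with the exact one
K_approx <- crossprod(t(Z))
err <- max(abs(K_approx - kernelMatrix(k, X)))
err
```

A smaller tol generally yields a factor with more columns and a closer approximation, at higher computational cost.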

numCores

the number of cores used for parallel computation.
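The default uses every core reported by parallel::detectCores(), which can return NA on some platforms. One possible way to choose a more conservative value is sketched below; the variable names are hypothetical.

```r
# Number of cores reported by the system (may be NA on some platforms)
n_detected <- parallel::detectCores()

# Fall back to 1 if unknown; otherwise leave one core free for other work
num_cores <- if (is.na(n_detected)) 1L else max(1L, n_detected - 1L)
num_cores
```

The resulting value could then be passed as numCores = num_cores.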

verbose

whether to print each selected variable during the forward stepwise algorithm

Details

A forward stepwise selection of variables using KPC. At each step, it selects the X_j that maximizes \tilde{\rho^2}(Y, X_j | selected X_i), where the selected X_i are the variables chosen in earlier steps. Normalizing the features before applying the algorithm is recommended.
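The greedy structure of such a procedure can be sketched as follows. This is not the package's implementation: forward_select and r2 are hypothetical names, and the score used here (the R^2 of a linear fit on the candidate subset) is only a stand-in for the KPC estimate, chosen so that the example runs without KPC.

```r
# Generic greedy forward selection. `score(S)` is a hypothetical stand-in that
# rates a candidate subset S of column indices; it is NOT the KPC estimator.
forward_select <- function(score, p, num_features) {
  selected <- integer(0)
  for (step in seq_len(num_features)) {
    candidates <- setdiff(seq_len(p), selected)
    # Score every candidate j given the variables selected so far
    vals <- vapply(candidates, function(j) score(c(selected, j)), numeric(1))
    selected <- c(selected, candidates[which.max(vals)])
  }
  selected
}

set.seed(1)
n <- 200
p <- 6
X <- scale(matrix(rnorm(n * p), ncol = p))   # normalize the features first
Y <- 2 * X[, 1] - X[, 4] + rnorm(n, sd = 0.1)

# Stand-in score: R^2 of a linear fit of Y on the columns in S
r2 <- function(S) summary(lm(Y ~ X[, S, drop = FALSE]))$r.squared

sel <- forward_select(r2, p, num_features = 2)
sel
```

The indices are returned in the order in which they were picked, mirroring the ordering described under Value.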

Value

The algorithm returns a vector of indices from 1,...,dx of the selected variables, in the order in which they were selected. Variables nearer the front are expected to be more informative for predicting Y.

See Also

KPCgraph, KPCRKHS, KFOCI

Examples

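library(kernlab)

n = 200
p = 10
X = matrix(rnorm(n * p), ncol = p)
# Y depends only on the first three columns of X
Y = X[, 1] * X[, 2] + sin(X[, 1] * X[, 3])

# Gaussian kernel whose bandwidth parameter depends only on the size of S
kS = function(X, S) rbfdot(1 / length(S))
KPCRKHS_VS(Y, X, num_features = 3, ky = rbfdot(1), kS = kS, eps = 1e-3, appro = FALSE, numCores = 1)

# Gaussian kernel with the empirical (median-distance) bandwidth on X[, S]
kS = function(X, S) rbfdot(1 / (2 * stats::median(stats::dist(X[, S]))^2))
KPCRKHS_VS(Y, X, num_features = 3, ky = rbfdot(1), kS = kS, eps = 1e-3, appro = FALSE, numCores = 1)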

[Package KPC version 0.1.2 Index]