cdcsis {cdcsis}    R Documentation

Conditional Distance Correlation Sure Independence Screening (CDC-SIS)

Description

Performs conditional distance correlation sure independence screening (CDC-SIS).

Usage

cdcsis(x, y, z = NULL, width, threshold = nrow(y), distance = FALSE,
  index = 1, num.threads = 1)

Arguments

x

a numeric matrix, or a list of numeric matrices

y

a numeric vector, matrix, or dist object

z

a numeric vector or matrix; the variable being conditioned on. Default: z = NULL.

width

a user-specified positive value (univariate conditional variable) or vector (multivariate conditional variable) giving the Gaussian kernel bandwidth. By default, it is computed with stats::bw.nrd0 when the conditional variable is univariate, with ks::Hpi.diag when the conditional variable is multivariate with at most three dimensions, and with stats::bw.nrd in all other cases.

threshold

the number of predictors recruited by CDC-SIS. Should be less than or equal to the number of columns of x. Defaults to the sample size.

distance

if distance = TRUE, y (and only y) is treated as a distance matrix. Default: distance = FALSE

index

exponent on Euclidean distance, in (0,2]. Default: index = 1.

num.threads

number of threads. Default num.threads = 1.
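
The width and threshold arguments can also be set explicitly. The sketch below computes the rule-of-thumb bandwidth that this page names as the default for a univariate conditional variable and caps the number of recruited predictors at five. The simulated data and the names w and k are illustrative assumptions, and the cdcsis call only runs if the package is installed.

```r
set.seed(1)
num <- 100
p <- 20
x <- matrix(rnorm(num * p), nrow = num)
z <- rnorm(num)
y <- 3 * x[, 1] + 4 * z * x[, 5] + rnorm(num)

# Rule-of-thumb bandwidth for a univariate conditional variable
# (stats::bw.nrd0 is the default named in the documentation above).
w <- stats::bw.nrd0(z)

# Recruit at most 5 predictors; threshold must not exceed ncol(x).
k <- min(5, ncol(x))

# Assumption: the installed cdcsis version matches the Usage shown above.
if (requireNamespace("cdcsis", quietly = TRUE)) {
  res <- cdcsis::cdcsis(x, y, z, width = w, threshold = k)
  print(res[["ix"]])
}
```

Passing width explicitly is mainly useful when the same bandwidth should be reused across several screening runs on the same conditional variable.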

Value

ix

the vector of indices selected by CDC-SIS

cdcor

the conditional distance correlation for each univariate/multivariate variable in x
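
The relationship between the two components can be pictured with a small base R sketch. It assumes, as is usual for sure independence screening but not stated on this page, that ix holds the indices of the threshold largest entries of cdcor. The names score and threshold are hypothetical, and the values are toy numbers, not cdcsis output.

```r
# Hypothetical screening scores for five candidate predictors
# (toy values, not produced by cdcsis).
score <- c(0.10, 0.62, 0.05, 0.48, 0.31)
threshold <- 3

# Indices of the `threshold` largest scores, strongest first.
ix <- order(score, decreasing = TRUE)[seq_len(threshold)]
print(ix)

# Scores of the selected predictors, in the same order.
print(score[ix])
```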

Author(s)

Canhong Wen, Wenliang Pan, Mian Huang, and Xueqin Wang

References

Wen, C., Pan, W., Huang, M. and Wang, X., 2018. Sure independence screening adjusted for confounding covariates with ultrahigh-dimensional data. Statistica Sinica, 28, pp.293-317. URL http://www3.stat.sinica.edu.tw/statistica/J28N1/28-1.html

See Also

cdcor

Examples

## Not run: 

library(cdcsis)

########## univariate explanatory variables ##########
set.seed(1)
num <- 100
p <- 150
x <- matrix(rnorm(num * p), nrow = num)
z <- rnorm(num)
y <- 3 * x[, 1] + 1.5 * x[, 2] + 4 * z * x[, 5] + rnorm(num)
res <- cdcsis(x, y, z)
head(res[["ix"]], n = 10)

########## multivariate explanatory variables ##########
x <- as.list(as.data.frame(x))
x <- lapply(x, as.matrix)
x[[1]] <- cbind(x[[1]], x[[2]])
x[[2]] <- NULL
res <- cdcsis(x, y, z)
head(res[["ix"]], n = 10)

########## multivariate response variables ##########
num <- 100
p <- 150
x <- matrix(rnorm(num * p), nrow = num)
z <- rnorm(num)
y1 <- 3 * x[, 1] + 5 * z * x[, 4] + rnorm(num)
y2 <- 3 * x[, 2] + 5 * x[, 3] + 2 * z + rnorm(num)
y <- cbind(y1, y2)
res <- cdcsis(x, y, z)
head(res[["ix"]], n = 10)

## End(Not run)
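
The examples above do not exercise distance = TRUE. A minimal sketch of that case follows, assuming that a dist object built with stats::dist is an acceptable y, as the argument description suggests. The name dy is illustrative, threshold is passed explicitly because nrow() of a dist object is NULL, and the cdcsis call only runs if the package is installed.

```r
set.seed(1)
num <- 50
p <- 10
x <- matrix(rnorm(num * p), nrow = num)
z <- rnorm(num)
y <- 3 * x[, 1] + 4 * z * x[, 5] + rnorm(num)

# Pairwise Euclidean distances between the responses, as a dist object.
dy <- dist(y)

# Assumption: the installed cdcsis version accepts a dist object for y
# when distance = TRUE, as described in the Arguments section.
if (requireNamespace("cdcsis", quietly = TRUE)) {
  res <- cdcsis::cdcsis(x, dy, z, threshold = 5, distance = TRUE)
  print(res[["ix"]])
}
```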

[Package cdcsis version 2.0.3 Index]