CAS {MFSIS} | R Documentation |
Category-Adaptive Variable Screening for Ultra-High Dimensional Heterogeneous Categorical Data
Description
A category-adaptive screening procedure for ultra-high dimensional heterogeneous categorical data that detects category-specific important covariates. The approach is model-free, requiring no specification of a regression model, and adaptive in the sense that the set of active variables is allowed to vary across categories, which makes it more flexible in accommodating heterogeneity.
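To make the idea of category-specific active sets concrete, here is a minimal sketch on toy data. It does not use the CAS statistic: the score is a simple stand-in (absolute difference in means between one category and the rest), and the data setup, `score_for`, and `top_by_category` are hypothetical names introduced only for illustration.

```r
# Toy data: 3 categories, 6 predictors. X1 is shifted in category 1 and
# X2 is shifted in category 2 (a hypothetical setup, not GendataLGM output).
set.seed(42)
n <- 300
p <- 6
Y <- rep(1:3, each = n / 3)
X <- matrix(rnorm(n * p), n, p)
X[Y == 1, 1] <- X[Y == 1, 1] + 3
X[Y == 2, 2] <- X[Y == 2, 2] + 3

# Stand-in marginal score (NOT the CAS statistic): absolute difference in
# the mean of each predictor between category k and all other categories.
score_for <- function(k) {
  in_k <- colMeans(X[Y == k, , drop = FALSE])
  rest <- colMeans(X[Y != k, , drop = FALSE])
  abs(in_k - rest)
}

# Top-ranked predictor for each category; it can differ across categories.
top_by_category <- lapply(1:3, function(k) {
  order(score_for(k), decreasing = TRUE)[1]
})
top_by_category[1:2]
```

In this toy setup the top-ranked predictor for category 1 differs from the one for category 2, which is the kind of heterogeneity a single shared active set would miss.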
Usage
CAS(X, Y, nsis)
Arguments
X |
The design matrix of dimensions n * p. Each row is an observation vector. |
Y |
The response vector of dimension n * 1. |
nsis |
Number of predictors recruited by CAS. The default is n/log(n), where n is the number of observations. |
Value
The labels of the nsis top-ranked predictors among all p predictors, i.e. the selected active set.
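As a rough illustration of what keeping the nsis largest predictors means, the sketch below ranks hypothetical per-predictor scores and keeps the labels of the top ones. The `scores` vector is random stand-in data, not the CAS statistic, and the use of `floor()` to get an integer count is an assumption made here rather than documented behavior of CAS.

```r
# Hypothetical screening scores, one per predictor (NOT the CAS statistic).
set.seed(1)
n <- 100
p <- 200
scores <- runif(p)

# Budget of the default size n / log(n); floor() is an assumption used
# here to obtain an integer count.
nsis <- floor(n / log(n))

# Labels (column positions) of the nsis largest scores, best first.
top <- order(scores, decreasing = TRUE)[seq_len(nsis)]
length(top)
```

With n = 100 this keeps 21 of the 200 predictors.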
Author(s)
Xuewei Cheng xwcheng@hunnu.edu.cn
References
Pan, R., Wang, H., and Li, R. (2016). Ultrahigh-dimensional multiclass linear discriminant analysis by pairwise sure independence screening. Journal of the American Statistical Association, 111(513):169–179.
Examples
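library(MFSIS)

# Generate example data with GendataLGM
n <- 100
p <- 200
rho <- 0.5
data <- GendataLGM(n, p, rho)

# Combine the two returned components into one matrix; the last column is Y
data <- cbind(data[[1]], data[[2]])
colnames(data) <- c(paste0("X", 1:(ncol(data) - 1)), "Y")
data <- as.matrix(data)
X <- data[, 1:(ncol(data) - 1)]
Y <- data[, ncol(data)]

# Screen with nsis = n / log(n) and print the selected labels
A <- CAS(X, Y, n / log(n))
A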