GendataLDA {MFSIS} | R Documentation |
Generate simulation data (categorical response, based on a linear discriminant analysis model)
Description
Simulates a dataset that can be used to test feature screening for ultrahigh-dimensional discriminant analysis. The simulation is based on the balanced scenarios in Example 3.1 of Cui et al. (2015). The simulated dataset has p numerical predictors X and a categorical response Y.
Usage
GendataLDA(
n,
p,
R = 3,
error = c("gaussian", "t", "cauchy"),
style = c("balanced", "unbalanced")
)
Arguments
n |
Number of subjects in the dataset to be simulated. This also equals the number of rows in the simulated dataset, because each row is assumed to represent a different independent and identically distributed subject. |
p |
Number of predictor variables (covariates) in the simulated dataset. These covariates will be the features screened by model-free procedures. |
R |
A positive integer giving the number of outcome categories for the multinomial (categorical) outcome Y. Defaults to 3. |
error |
The distribution of the error term: "gaussian" for normally distributed errors, "t" for t-distributed errors with 2 degrees of freedom, or "cauchy" for Cauchy-distributed errors. |
style |
The balance among the categories of the categorical response, either "balanced" or "unbalanced". |
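To make the roles of these arguments concrete, the following base R sketch shows one way an LDA-style dataset of this kind could be generated. It is not the MFSIS implementation. The function name sim_lda_sketch, the mean shift of 3, the rule that only predictor r is shifted for class r, and the unbalanced class probabilities (largest class twice as likely as the smallest) are all assumptions made for illustration.

```r
# Hypothetical helper, NOT part of MFSIS: a minimal LDA-style generator.
sim_lda_sketch <- function(n, p, R = 3,
                           error = c("gaussian", "t", "cauchy"),
                           style = c("balanced", "unbalanced"),
                           shift = 3) {  # shift size is an assumption
  error <- match.arg(error)
  style <- match.arg(style)
  stopifnot(R >= 2, p >= R)

  # Class probabilities: equal when balanced; otherwise (an assumption)
  # increasing linearly so the largest is twice the smallest.
  prob <- switch(style,
    balanced   = rep(1 / R, R),
    unbalanced = 2 * (1 + (seq_len(R) - 1) / (R - 1)) / (3 * R)
  )
  Y <- sample(seq_len(R), size = n, replace = TRUE, prob = prob)

  # Error matrix with n rows and p columns.
  eps <- switch(error,
    gaussian = rnorm(n * p),
    t        = rt(n * p, df = 2),
    cauchy   = rcauchy(n * p)
  )
  X <- matrix(eps, nrow = n, ncol = p)

  # Mean shift: for a subject in class r, only predictor r is shifted,
  # so the first R columns are the informative features.
  idx <- cbind(seq_len(n), Y)
  X[idx] <- X[idx] + shift

  list(X = X, Y = Y)
}

set.seed(1)
sim <- sim_lda_sketch(n = 100, p = 200, R = 3,
                      error = "t", style = "unbalanced")
dim(sim$X)    # rows = subjects, columns = predictors
table(sim$Y)  # class counts reflect the chosen style
```

In this sketch the heavy-tailed choices ("t" and "cauchy") leave the class means unchanged and only alter the noise, which is what makes them useful for checking how robust a screening procedure is.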
Value
A list containing the simulated data.
Author(s)
Xuewei Cheng xwcheng@hunnu.edu.cn
References
Cui, H., Li, R., & Zhong, W. (2015). Model-free feature screening for ultrahigh dimensional discriminant analysis. Journal of the American Statistical Association, 110(510), 630-641.
Examples
n <- 100
p <- 200
R <- 3
data <- GendataLDA(n, p, R, error = "gaussian", style = "balanced")
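Because the Value section only states that a list is returned, a quick way to see what it contains is to inspect the result directly. The sketch below assumes the MFSIS package is installed and is skipped otherwise. The call to set.seed only makes the draw reproducible if GendataLDA relies on R's standard random number generator, which is an assumption here.

```r
# Assumes MFSIS is installed; otherwise `data` is left as NULL.
data <- NULL
if (requireNamespace("MFSIS", quietly = TRUE)) {
  set.seed(123)  # reproducible only if the generator uses R's RNG (assumed)
  data <- MFSIS::GendataLDA(n = 100, p = 200, R = 3,
                            error = "gaussian", style = "balanced")
  str(data, max.level = 1)  # show the list components and their sizes
}
```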