ds_method {HDDesign}    R Documentation
Estimate PCC by DS Method
Description
Determine the probability of correct classification (PCC) for studies that use high-dimensional features for classification. The p-value threshold for feature selection is chosen using the method proposed by Dobbin and Simon (2007).
Usage
ds_method(mu0, p, m, n, p1=0.5, lmax=1, ss=F, sampling.p)
Arguments
mu0
The effect size of the important features.
p
The total number of features.
m
The number of important features.
n
The total sample size for the two groups.
p1
The prevalence of group 1 in the population; defaults to 0.5.
lmax
The maximum eigenvalue of the variance-covariance matrix of the p features. Defaults to 1, which implies that the features are assumed i.i.d.
ss
Logical; defaults to FALSE. If TRUE, the function also computes the sensitivity and the specificity of the classifier.
sampling.p
The assumed proportion of group 1 samples in the training data; the default of 0.5 assumes the groups are equally represented in the training data, regardless of p1.
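When the features are assumed to be correlated, a value for lmax can be derived from an assumed covariance matrix. The following is a minimal sketch using base R only. The equicorrelated covariance structure and the names p_small, rho and Sigma are assumptions made for illustration and are not part of the package.

```r
# Hypothetical setting: unit-variance features with a common pairwise correlation.
p_small <- 5    # illustrative number of features (kept small for display)
rho <- 0.2      # assumed common correlation between any two features

# Build the assumed covariance matrix: rho off the diagonal, 1 on the diagonal.
Sigma <- matrix(rho, nrow = p_small, ncol = p_small)
diag(Sigma) <- 1

# eigen() returns eigenvalues in decreasing order, so the first one is the largest.
lmax <- eigen(Sigma, symmetric = TRUE, only.values = TRUE)$values[1]

# For this structure with rho > 0, the largest eigenvalue is 1 + (p - 1) * rho.
lmax_closed <- 1 + (p_small - 1) * rho
```

The resulting lmax could then be passed to ds_method in place of the default value of 1.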
Details
See Dobbin and Simon (2007) for details of the method.
Value
If ss=FALSE, the function returns the expected PCC. If ss=TRUE, the function returns a vector containing the expected PCC, sensitivity and specificity.
Author(s)
Meihua Wu <meihuawu@umich.edu>, Brisa N. Sanchez <brisa@umich.edu>, Peter X.K. Song <pxsong@umich.edu>, Raymond Luu <raluu@umich.edu>, Wen Wang <wangwen@umich.edu>
References
Dobbin, K.K., and Simon, R.M. (2007). "Sample Size Planning for Developing Classifiers Using High-dimensional DNA Microarray Data." Biostatistics 8 (1): 101-117.
Examples
ds_method(mu0=0.6, p=500, m=10, n=38, p1=0.5, lmax=1, ss=TRUE)
#[1] 0.9252471 0.9252471 0.9252471
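
For study planning, it can be useful to see how the expected PCC changes with the total sample size. The sketch below reuses the settings of the example above and varies only n. The grid n_grid and the name pcc_by_n are illustrative choices, and the call is skipped if the HDDesign package is not installed.

```r
# Illustrative total sample sizes; 38 is the value used in the example above.
n_grid <- c(20, 38, 60, 100)

if (requireNamespace("HDDesign", quietly = TRUE)) {
  # With ss = FALSE the function returns the expected PCC only (see Value),
  # so sapply() collects one number per sample size.
  pcc_by_n <- sapply(n_grid, function(n) {
    HDDesign::ds_method(mu0 = 0.6, p = 500, m = 10, n = n,
                        p1 = 0.5, lmax = 1, ss = FALSE)
  })
  print(data.frame(n = n_grid, pcc = pcc_by_n))
} else {
  message("Package 'HDDesign' is not installed; skipping the sketch.")
}
```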