samplesize {planningML} | R Documentation |
Sample size determination
Description
This function determines the optimal sample size based on the performance evaluation metric and the number of selected features.
Usage
samplesize(
features = NULL,
sample.size = seq(10, 1000, 20),
method = "HCT",
m = NULL,
effectsize = NULL,
class.prob = NULL,
totalnum_features = NULL,
threshold = 0.1,
metric = "MCC",
target = NULL
)
Arguments
features |
feature selection results from the featureselection function in the package. |
sample.size |
grid of candidate sample sizes to evaluate. Default is seq(10, 1000, 20). |
method |
method used to compute the sample-size-dependent performance metric: the HCT method ("HCT", the default) or the DS method ("DS"). |
m |
the number of features used in the sample size determination. Default is NULL, which means the number of features is determined by the featureselection results based on the iHCT method. Otherwise, users can choose the number based on their needs. A user-defined m should be smaller than the optimal number of features determined by the featureselection function. |
effectsize |
common effect size for the m features. NULL means the effect size is calculated directly from the data. Users can also provide effect sizes based on historical data. |
class.prob |
probability of the event |
totalnum_features |
total number of features |
threshold |
default = 0.1. Threshold needed to determine the sample size. |
metric |
default = "MCC". The performance metric that you want to optimize. Another choice is "AUC". |
target |
target MCC/AUC that you want to achieve |
Value
samplesize() returns the sample size needed to achieve the corresponding performance measurement.
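Examples
The following is a minimal sketch rather than an example shipped with the package. It first shows what the default sample.size grid from the Usage block contains, then outlines a call that uses only arguments listed above. The object name pilot_features is a hypothetical placeholder for the result of the featureselection function; the guarded call is skipped unless that object exists and planningML is installed.

```r
# Default grid of candidate sample sizes, as given in the Usage block.
grid <- seq(10, 1000, 20)
length(grid)      # 50 candidate sample sizes
range(grid)       # smallest is 10, largest is 990 (1010 would exceed 1000)

# 'pilot_features' is a hypothetical placeholder: it stands for the object
# returned by the package's featureselection function on your pilot data.
if (requireNamespace("planningML", quietly = TRUE) && exists("pilot_features")) {
  res <- planningML::samplesize(
    features    = pilot_features,
    sample.size = grid,
    method      = "HCT",   # or "DS"
    threshold   = 0.1,
    metric      = "MCC"    # or "AUC"
  )
}
```

Leaving m and effectsize at NULL lets the function take the number of features and the effect size from the feature selection results, as described under Arguments.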