nbldaControl {NBLDA} | R Documentation |
Control parameters for trained NBLDA model.
Description
Define control parameters to be used within the trainNBLDA function.
Usage
nbldaControl(folds = 5, repeats = 2, foldIdx = NULL, rhos = NULL,
beta = 1, prior = NULL, transform = FALSE, alpha = NULL, truephi = NULL,
target = 0, phi.epsilon = 0.15, normalize.target = FALSE, delta = NULL,
multicore = FALSE, ...)
Arguments
folds |
A positive integer. The number of folds for k-fold model validation. |
repeats |
A positive integer. The number of repeats for k-fold model validation. If NULL, 0 or negative, it is set to 1. |
foldIdx |
a list with indices of hold-out samples for each fold. It should be a list where folds are nested within repeats. If NULL, |
rhos |
A vector of tuning parameters that control the amount of soft thresholding performed. If NULL, it is automatically generated within |
beta |
A smoothing term. A Gamma(beta,beta) prior is used to fit the Poisson model. Recommendation is to just leave it at 1, the default value. See Witten (2011) and Dong et al. (2016) for details. |
prior |
A vector of prior class probabilities, with a length equal to the number of classes. If NULL, all classes are assumed to be equally likely. |
transform |
a logical. If TRUE, count data is transformed using power transformation. If |
alpha |
a numeric value within [0, 1] to be used for power transformation. |
truephi |
a vector with a length equal to the number of variables. Its elements represent the true overdispersion parameters for each variable. If a single value is given, it is recycled for all variables. If a vector whose length is not equal to the number of variables is given, only its first element is used and recycled for all variables. If NULL, estimated overdispersions are used in the classifier. See details. |
target |
a value for the shrinkage target of dispersion estimates. If NULL, then a value that is small and minimizes the average squared difference is automatically used as the target value. See |
phi.epsilon |
a positive value for controlling the number of features whose dispersions are shrunk towards 0. See details. |
normalize.target |
a logical. If TRUE and |
delta |
a weight within the interval [0, 1] that is used while shrinking dispersions towards the target. When "delta = 1", initial dispersion estimates are shrunk all the way to the target. Similarly, if "delta = 0", no shrinkage is performed on the initial estimates. |
multicore |
a logical. If TRUE and a parallel backend is loaded and available, the function runs in parallel to speed up the computations. |
... |
further arguments passed to |
Details
rhos is used to control the level of sparsity, i.e., the number of variables (or features) used in the classifier. If a variable has no contribution to the discrimination function, it should be removed from the model. By setting rhos within the interval [0, Inf], it is possible to control the number of variables that are removed from the model. As the upper bound of rhos decreases towards 0, fewer variables are removed. If rhos = 0, all variables are included in the classifier.
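The effect of rhos can be illustrated with a generic soft-thresholding operator. The following is a minimal base R sketch, not the package's internal code: the function soft_threshold and the effects vector are hypothetical names and values chosen only for illustration.

```r
# Soft-thresholding operator: shrinks x towards 0 by rho and
# sets every element with |x| <= rho to 0.
soft_threshold <- function(x, rho) sign(x) * pmax(abs(x) - rho, 0)

# Hypothetical per-variable effects (made-up values, for illustration only).
effects <- c(gene1 = 2.0, gene2 = -0.6, gene3 = 0.3, gene4 = -0.05)

# Count how many variables keep a nonzero effect for each candidate rho.
rhos <- c(0, 0.1, 0.5, 1)
n_kept <- sapply(rhos, function(rho) sum(soft_threshold(effects, rho) != 0))
names(n_kept) <- rhos
n_kept
```

With rho = 0 every variable is kept, and each larger value of rho removes at least as many variables as the previous one.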
truephi controls how the Negative Binomial model differs from the Poisson model. If overdispersion is zero, the Negative Binomial model reduces to the Poisson model. Hence, the results from trainNBLDA are identical to PLDA results from Classify when truephi = 0.
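The limiting relationship between the two distributions can be checked numerically in base R. This sketch assumes the usual mean/overdispersion parameterisation of the Negative Binomial (variance = mu + phi * mu^2), which corresponds to size = 1 / phi in stats::dnbinom; the helper nb_pmf and the chosen values of mu and phi are illustrative assumptions.

```r
x  <- 0:30
mu <- 5

# Negative Binomial pmf with mean mu and overdispersion phi.
# In stats::dnbinom this corresponds to size = 1 / phi.
nb_pmf <- function(x, mu, phi) dnbinom(x, size = 1 / phi, mu = mu)

# Largest absolute difference from the Poisson pmf for decreasing phi.
phis <- c(1, 0.1, 0.001, 1e-6)
max_diff <- sapply(phis, function(phi) {
  max(abs(nb_pmf(x, mu, phi) - dpois(x, mu)))
})
data.frame(phi = phis, max_abs_diff = max_diff)
```

The difference shrinks as phi approaches 0, which is why fixing the overdispersion at zero gives a Poisson-based classifier.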
phi.epsilon is a threshold used to shrink small estimated overdispersions to 0. The Poisson model assumes that there is no overdispersion in the observed counts. However, this is not a valid assumption in highly overdispersed count data. NBLDA performs a shrinkage on estimated overdispersions. Although the amount of shrinkage depends on several parameters such as delta, target, and truephi, some of the shrunken overdispersions might be very close to 0. By defining a threshold value for shrunken overdispersions, it is possible to shrink very small overdispersions to exactly 0. If an estimated overdispersion is below phi.epsilon, it is shrunk to 0. If phi.epsilon = NULL, the threshold value is set to 0. Hence, all the variables with very small overdispersion are included in the NBLDA model.
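The interplay between delta, target, and phi.epsilon can be sketched in a few lines of base R. This is an illustration under stated assumptions, not the package's implementation: it assumes the weighted-average form of shrinkage described by Yu et al. (2013), and the initial estimates in phi_init are made-up values.

```r
# Hypothetical initial overdispersion estimates (made-up values).
phi_init <- c(0.05, 0.20, 0.60, 1.50)

target      <- 0     # shrinkage target (default shown in Usage)
delta       <- 0.5   # shrinkage weight in [0, 1] (illustrative choice)
phi_epsilon <- 0.15  # threshold (default shown in Usage)

# Assumed weighted-average shrinkage: delta = 0 keeps the initial
# estimates, delta = 1 moves them all the way to the target.
phi_shrunk <- delta * target + (1 - delta) * phi_init

# Shrunken values below the threshold are set to exactly 0.
phi_final <- ifelse(phi_shrunk < phi_epsilon, 0, phi_shrunk)
phi_final
```

In this sketch the first two variables fall below the threshold after shrinkage and are treated as having no overdispersion, while the last two keep their shrunken values.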
Value
a list with all the control elements.
Author(s)
Dincer Goksuluk
References
Witten, D. M. (2011). Classification and clustering of sequencing data using a Poisson model. Ann. Appl. Stat. 5(4), 2493–2518. doi:10.1214/11-AOAS493.
Dong, K., Zhao, H., Tong, T., & Wan, X. (2016). NBLDA: negative binomial linear discriminant analysis for RNA-Seq data. BMC Bioinformatics, 17(1), 369. doi:10.1186/s12859-016-1208-1.
Yu, D., Huber, W., & Vitek, O. (2013). Shrinkage estimation of dispersion in Negative Binomial models for RNA-seq experiments with small sample size. Bioinformatics, 29(10), 1275–1282.
See Also
Examples
nbldaControl() # return default control parameters.
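A slightly longer usage sketch overrides a few defaults using only arguments listed under Usage. It assumes the NBLDA package is installed, so the call is guarded and skipped otherwise; the specific values are arbitrary illustrations rather than recommendations.

```r
# Run only when the NBLDA package is available.
if (requireNamespace("NBLDA", quietly = TRUE)) {
  # 10-fold validation repeated 3 times, with a lower threshold
  # for setting small overdispersions to 0 (values are illustrative).
  ctrl <- NBLDA::nbldaControl(folds = 10, repeats = 3, phi.epsilon = 0.1)

  # Inspect the returned list of control elements.
  str(ctrl)
}
```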