miRNA_classifierSetup {MiRNAQCD} | R Documentation |
Analysis of features and training of classifiers.
Description
This function carries out different tasks depending on the input parameters:
–> Analysis mode: analyzes the properties of each miRNA (possibly subtracting a normalizer) in terms of Target/Versus separation, normality, etc. A matrix of correlation coefficients between each pair of miRNAs is also computed.
–> Training mode: trains a Bayesian classifier by computing the corresponding diagnostic threshold values and the related uncertainties.
Usage
miRNA_classifierSetup(
inputDataset,
inputTargetList,
inputVersusList = character(),
inputMiRNAList = character(),
coeffList = double(),
saveOutputFile = FALSE,
outputFileBasename = "",
sep = "\t",
plotFormat = "pdf",
scorePlotAscending = TRUE,
scorePlotParameters = character(),
histogramParameters = character(),
colorComplementFlag = FALSE
)
Arguments
inputDataset |
Dataset (data frame) to be used for the analysis/training. The data frame must comply with the output format of the quality control functions (miRNA_expressionPreprocessing and miRNA_removeOutliers), and must therefore contain the columns 'Subject', 'miRNA', 'Mean', 'StdDev', 'SampleSize', 'Class'. Any other column is ignored; if any of the listed columns is missing, the function does not run. Please note that in this case the 'Class' column is mandatory. |
inputTargetList |
List of classes to use as target for the classification. The chosen target must correspond to at least one of the classes present in the 'Class' column of the inputDataset. |
inputVersusList |
List of classes to use as versus for the classification. If the argument is left empty, all classes present in the 'Class' column of the inputDataset, minus the Target classes, are used as Versus. |
inputMiRNAList |
List of miRNAs to be used by the classifier ('Training mode'). The chosen miRNAs must be present in the 'miRNA' column of the inputDataset. In 'Analysis mode', this argument has to be omitted (if no normalizer has to be used) or has to contain a single entry (corresponding to the miRNA to be used as normalizer). |
coeffList |
List of coefficients for the classifier. In 'Training mode', the number of coefficients must be the same as the number of used miRNAs and listed in the same order. In 'Analysis mode', this argument has to be omitted. |
saveOutputFile |
Boolean option setting whether results are written to file (TRUE) or not (FALSE). Default is FALSE. |
outputFileBasename |
Name of the output file where the classifier setup results ('Training mode') or the analysis results ('Analysis mode') are to be stored. If not assigned, a filename is automatically generated. File names of other files created by the function are generated by appending suitable labels to the provided "outputFileBasename". |
sep |
Field separator character for the output files; the default is tabulation. |
plotFormat |
String specifying the format of generated graphic files (plots): can either be "pdf" (default) or "png". |
scorePlotAscending |
Boolean option to set the direction in which samples are ordered: TRUE corresponds to samples ordered by ascending score, FALSE corresponds to samples ordered by descending score. Default is TRUE. This argument is meaningful only if saveOutputFile is set to TRUE and the function is running in 'Training mode'. |
scorePlotParameters |
String specifying the y-axis parameters of the score plot. If empty, the axis is configured by assessing suitable parameters from the data. This argument is meaningful only if saveOutputFile is set to TRUE and the function is running in 'Training mode'. The string has to comply with the format "yl_yu_yt", where: yl is the lower y limit; yu is the upper y limit; yt is the interval between ticks along the axis. |
histogramParameters |
String specifying the parameters used to build histograms. If empty, histograms are built by assessing suitable parameters from the data. This argument is meaningful only if saveOutputFile is set to TRUE. The string has to comply with the format "xl_xu_bw", where: xl is the lower boundary of the leftmost bin; xu is the upper boundary of the rightmost bin; bw is the bin width. |
colorComplementFlag |
Boolean option to switch between the default palette (FALSE) and its inverted version (TRUE). Default is FALSE, corresponding to target samples reported in blue and versus samples in red. This argument is meaningful only if saveOutputFile is set to TRUE. |
Beware! Cross-correlation coefficients, as well as Shapiro-Wilk tests for normality, require at least three data samples. With fewer than three samples, those tests are skipped and "NA" (not available) is reported in the corresponding output.
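The three-sample requirement mentioned above can be illustrated with base R alone. The sketch below is not the package's internal code: safeShapiroP is a hypothetical helper name, and the only assumption is that a skipped test is reported as NA, as stated in the note.

```r
# Hypothetical helper (not part of MiRNAQCD): return the Shapiro-Wilk
# p-value, or NA when fewer than three values are available.
safeShapiroP <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) < 3) {
    # stats::shapiro.test() refuses samples smaller than 3
    return(NA_real_)
  }
  stats::shapiro.test(x)$p.value
}

tooFewSamples <- safeShapiroP(c(1.2, 2.5))
enoughSamples <- safeShapiroP(c(1.2, 2.5, 3.1, 4.8))
```

Here tooFewSamples is NA, whereas enoughSamples holds an ordinary p-value.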
Details
In order to select between Analysis and Training mode, the input parameters "inputMiRNAList" and "coeffList" have to comply with the following requirements.
–> Analysis mode: "coeffList" has to be empty (i.e. omitted in the function call arguments). "inputMiRNAList" can either be empty (i.e. omitted in the function call arguments) or of length 1: in the latter case, the single entry of "inputMiRNAList" is assumed to be the normalizer.
–> Training mode: "inputMiRNAList" and "coeffList" have to be non-empty and of the same size.
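The selection rules above can be restated as a short R sketch. guessSetupMode is a hypothetical helper written only for illustration; it is not exported by MiRNAQCD, and the way the package itself reacts to an invalid combination of arguments is an assumption here (the sketch simply stops with an error).

```r
# Hypothetical helper (not part of MiRNAQCD): infer the mode from the
# lengths of the two arguments, following the rules in Details.
guessSetupMode <- function(inputMiRNAList = character(), coeffList = double()) {
  nMiRNA <- length(inputMiRNAList)
  nCoeff <- length(coeffList)
  # Analysis mode: no coefficients, and at most one miRNA (the normalizer)
  if (nCoeff == 0 && nMiRNA <= 1) {
    return("analysis")
  }
  # Training mode: both lists non-empty and of the same size
  if (nMiRNA > 0 && nMiRNA == nCoeff) {
    return("training")
  }
  stop("inputMiRNAList and coeffList match neither Analysis nor Training mode")
}

modeNoNormalizer <- guessSetupMode()
modeNormalizer   <- guessSetupMode(c("FZ"))
modeTraining     <- guessSetupMode(c("FX", "FZ"), c(1.0, -1.0))
```

A call such as guessSetupMode(c("FX", "FZ")) fits neither mode, because two miRNAs are given without coefficients.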
Value
In 'Analysis mode', a data frame containing the columns 'miRNA', 'Diagnosis', 'NumberOfSubjects', 'Mean', 'StdDev', 'NormalityTest', 't-test'.
In 'Training mode', a data frame containing the columns 'Threshold', 'DeltaThreshold', 'DPrime', 'Pc', 'ChiUp', 'DChiUp', 'ChiDown', 'DChiDown', 'Accuracy', 'DAccuracy', 'Specificity', 'Sensitivity', 'F1-score', 'DPrime', 'AUC', 'AUCDown', 'AUCUp', 't-test', 'NormalityTest-target', 'NormalityTest-versus'.
Examples
requiredFile <- system.file("extdata", "test_dataset_alpha_clean.dat",
                            package = "MiRNAQCD")
myDataFrame <- read.table(file=requiredFile, header=TRUE)
Target <- c("A")
Versus <- c("B", "C")
## Analysis mode
miRNAstats <- miRNA_classifierSetup(myDataFrame, Target, Versus)
## Analysis mode, with normalizer
miRNAstats <- miRNA_classifierSetup(myDataFrame, Target, Versus, c("FZ"))
## Training mode
mirnaToUse <- c("FX", "FZ")
coefficientsToUse <- c(1.0, -1.0)
threshold <- miRNA_classifierSetup(myDataFrame, Target, Versus,
mirnaToUse, coefficientsToUse)
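The plot-related arguments take underscore-separated strings ("yl_yu_yt" and "xl_xu_bw"). The sketch below shows one way to build such strings and pass them in 'Training mode'. The numeric values are arbitrary placeholders and not recommendations for this dataset. Writing to a basename inside tempdir() assumes that outputFileBasename accepts a path, which this page does not state.

```r
# Arbitrary placeholder values, chosen only to illustrate the string formats
scoreAxis     <- paste(-4, 4, 1, sep = "_")    # "yl_yu_yt"
histogramBins <- paste(-4, 4, 0.5, sep = "_")  # "xl_xu_bw"

# The call is attempted only if the package is installed
if (requireNamespace("MiRNAQCD", quietly = TRUE)) {
  datasetFile <- system.file("extdata", "test_dataset_alpha_clean.dat",
                             package = "MiRNAQCD")
  exampleData <- read.table(file = datasetFile, header = TRUE)
  thresholdWithPlots <- MiRNAQCD::miRNA_classifierSetup(
    exampleData, c("A"), c("B", "C"),
    c("FX", "FZ"), c(1.0, -1.0),
    saveOutputFile = TRUE,
    # Assumption: a full path is accepted as basename
    outputFileBasename = file.path(tempdir(), "setup_example"),
    plotFormat = "png",
    scorePlotParameters = scoreAxis,
    histogramParameters = histogramBins
  )
}
```

If either string is left empty, the function configures the corresponding plot from the data, as described in Arguments.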