miRNA_diagnosis {MiRNAQCD}    R Documentation

Classification of a dataset (diagnosis).

Description

This function classifies the entries of the input dataset as either target or versus, using the chosen classifier and the corresponding diagnostic threshold.

Usage

miRNA_diagnosis(
  inputDataset,
  inputMiRNAList,
  coeffList,
  inputThreshold,
  inputTargetList = character(),
  inputVersusList = character(),
  saveOutputFile = FALSE,
  outputFileBasename = "",
  sep = "\t",
  plotFormat = "pdf",
  scorePlotParameters = character(),
  scorePlotAscending = TRUE,
  colorComplementFlag = FALSE,
  histogramParameters = character()
)

Arguments

inputDataset

Dataset (data frame) to be classified. The data frame must comply with the output format of the quality control functions (miRNA_expressionPreprocessing and miRNA_removeOutliers), and must therefore contain the columns 'Subject', 'miRNA', 'Mean', 'StdDev', 'SampleSize'. Any other column is ignored, and execution stops if any of these columns is missing. If the 'Performance analysis mode' is selected (see inputTargetList), the dataset must contain the 'Class' column as well.
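As a rough illustration of the column requirement, the base R sketch below checks a data frame before it is handed to miRNA_diagnosis. The helper missingColumns and the toy values are hypothetical and not part of MiRNAQCD; the function applies its own checks.

```r
# Columns required by miRNA_diagnosis according to the argument description.
requiredColumns <- c("Subject", "miRNA", "Mean", "StdDev", "SampleSize")

# Hypothetical helper (not part of MiRNAQCD): returns the names of required
# columns that are absent from 'dataset'. 'Class' is required only in
# 'Performance analysis mode'.
missingColumns <- function(dataset, performanceMode = FALSE) {
  needed <- if (performanceMode) c(requiredColumns, "Class") else requiredColumns
  setdiff(needed, colnames(dataset))
}

# Toy data with made-up values, for illustration only.
toyDataset <- data.frame(
  Subject = c("S1", "S1"),
  miRNA = c("FX", "FZ"),
  Mean = c(25.1, 27.3),
  StdDev = c(0.2, 0.3),
  SampleSize = c(3L, 3L)
)

missingColumns(toyDataset)                          # nothing is missing
missingColumns(toyDataset, performanceMode = TRUE)  # 'Class' is missing
```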

inputMiRNAList

List of miRNAs to be used by the classifier. The chosen miRNAs must be present in the 'miRNA' column of the inputDataset.

coeffList

List of coefficients for the classifier. There must be exactly one coefficient for each miRNA in inputMiRNAList, and the coefficients must be listed in the same order as the miRNAs.
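The pairing between inputMiRNAList and coeffList can be made explicit before calling the function. The following is a minimal sketch in base R; checkClassifierInputs is a hypothetical helper, not a MiRNAQCD function, and the toy data frame holds only the column needed for the check.

```r
mirnaToUse <- c("FX", "FZ")
coefficientsToUse <- c(1.0, -1.0)

# Hypothetical helper (not part of MiRNAQCD): verifies that the two lists
# have the same length and that every miRNA occurs in the dataset, then
# returns the coefficients named by their miRNA.
checkClassifierInputs <- function(dataset, mirnaList, coeffList) {
  if (length(mirnaList) != length(coeffList)) {
    stop("one coefficient is needed for each miRNA")
  }
  absent <- setdiff(mirnaList, unique(as.character(dataset$miRNA)))
  if (length(absent) > 0) {
    stop("miRNAs not found in dataset: ", paste(absent, collapse = ", "))
  }
  names(coeffList) <- mirnaList
  coeffList
}

# Toy dataset containing only the 'miRNA' column, for illustration.
toyDataset <- data.frame(miRNA = c("FX", "FZ", "FX", "FZ"))

pairedCoefficients <- checkClassifierInputs(toyDataset, mirnaToUse,
                                            coefficientsToUse)
print(pairedCoefficients)
```

Naming the coefficients makes an ordering mistake easier to spot when the lists grow beyond two entries.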

inputThreshold

Diagnostic threshold data frame for the classifier. The data frame must comply with the output format of the classifier setup function (miRNA_classifierSetup), thus containing the columns 'Threshold', 'DeltaThreshold', 'ChiUp', 'DChiUp', 'ChiDown', 'DChiDown'. Any other column is ignored.

inputTargetList

List of classes to use as target. Providing this argument corresponds to selecting the 'Performance analysis mode'. Consequently, inputDataset is expected to contain the 'Class' column as well. The chosen target must correspond to at least one of the classes present in the 'Class' column of the inputDataset.

inputVersusList

List of classes to use as versus in 'Performance analysis mode'. If the argument is left empty, all classes present in the 'Class' column of the inputDataset, minus the Target classes, are used as Versus.
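The default described above (all classes minus the Target classes) amounts to a set difference. The sketch below reproduces that rule in base R under the assumption that classes are compared as plain strings; defaultVersusList and the class labels "A", "B", "C" are made up for illustration.

```r
# Hypothetical helper (not part of MiRNAQCD): classes that would be used as
# versus when inputVersusList is left empty.
defaultVersusList <- function(dataset, targetList) {
  classes <- unique(as.character(dataset$Class))
  # At least one target class has to be present in the 'Class' column.
  if (!any(targetList %in% classes)) {
    stop("no target class found in the 'Class' column")
  }
  setdiff(classes, targetList)
}

# Toy data with made-up class labels.
toyClasses <- data.frame(
  Subject = c("S1", "S2", "S3", "S4"),
  Class = c("A", "B", "C", "B")
)

versusClasses <- defaultVersusList(toyClasses, targetList = "A")
print(versusClasses)
```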

saveOutputFile

Boolean option setting whether results are written to file (TRUE) or not (FALSE). Default is FALSE.

outputFileBasename

Name of the output file where the diagnosis results are to be stored. If not assigned, a filename is automatically generated.

sep

Field separator character for the output file; the default is the tab character ("\t").

plotFormat

String specifying the format of generated graphic files (plots): can either be "pdf" (default) or "png".

scorePlotParameters

String specifying the y-axis parameters of the score plot. If empty, suitable axis parameters are derived from the data. This argument is meaningful only if saveOutputFile is set to TRUE. The string has to comply with the format "yl_yu_yt", where: yl is the lower y limit; yu is the upper y limit; yt is the interval between ticks along the axis.
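A string in the "yl_yu_yt" format can be assembled from numeric values rather than typed by hand. The sketch below uses arbitrary limits chosen for illustration and splits the string again only to show which tick positions it describes; how the package parses the string internally is not shown here.

```r
# Arbitrary example values: lower limit, upper limit, tick interval.
yl <- -4
yu <- 4
yt <- 1

# Join the three fields with underscores, as the format requires.
scorePlotParameters <- paste(yl, yu, yt, sep = "_")
print(scorePlotParameters)

# Split the string back into numbers to list the implied tick positions.
fields <- as.numeric(strsplit(scorePlotParameters, "_", fixed = TRUE)[[1]])
tickPositions <- seq(fields[1], fields[2], by = fields[3])
print(tickPositions)
```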

scorePlotAscending

Boolean option to set the direction in which samples are ordered: TRUE corresponds to samples ordered by ascending score, FALSE corresponds to samples ordered by descending score. Default is TRUE. This argument is meaningful only if saveOutputFile is set to TRUE.

colorComplementFlag

Boolean option to switch between the default palette (FALSE) and its inverted version (TRUE). Default is FALSE, corresponding to target samples reported in blue and versus samples in red. This argument is meaningful only if saveOutputFile is set to TRUE.

histogramParameters

Used in 'Performance analysis mode' only. String specifying the parameters used to build the histogram. If empty, suitable histogram parameters are derived from the data. This argument is meaningful only if saveOutputFile is set to TRUE. The string has to comply with the format "xl_xu_bw", where: xl is the lower boundary of the leftmost bin; xu is the upper boundary of the rightmost bin; bw is the bin width.
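To see what a given "xl_xu_bw" string implies, the bin edges can be listed with base R before the argument is passed. This is only a sketch: the limits, the bin width and the toy scores are arbitrary, and the histogram drawn by the package may differ in its details.

```r
# Arbitrary example values: leftmost edge, rightmost edge, bin width.
xl <- -6
xu <- 6
bw <- 0.5

histogramParameters <- paste(xl, xu, bw, sep = "_")
print(histogramParameters)

# Bin edges implied by the three fields.
binBreaks <- seq(xl, xu, by = bw)

# Made-up scores, counted per bin without drawing anything.
toyScores <- c(-2.3, -1.1, 0.4, 1.7, 2.2)
binCounts <- hist(toyScores, breaks = binBreaks, plot = FALSE)$counts
print(sum(binCounts))
```

Choosing xl and xu so that their difference is a multiple of bw keeps the last bin edge exactly on xu.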

Details

This function can also run in 'Performance analysis mode' to evaluate the performance of a classifier by running it on an already-classified dataset. In order to carry out performance analysis, inputDataset has to contain a 'Class' column. Moreover, a list of Target classes has to be provided to the function via the inputTargetList argument.

Value

A data frame containing the columns 'Subject', 'Diagnosis' and 'Score'.

Please refer to the user manual installed at "/path-to-library/MiRNAQCD/doc/manual.pdf" for detailed function documentation. The path "/path-to-library" can be shown from R by calling ".libPaths()".
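When the true classes of the subjects are known, the returned data frame can be compared against them with base R. The sketch below uses a made-up stand-in for the return value; the 'Diagnosis' labels "target" and "versus", the class labels and all scores are assumptions chosen for illustration, not output of the package.

```r
# Made-up stand-in for the data frame returned by miRNA_diagnosis.
# The labels "target" and "versus" are an assumption.
toyDiagnosis <- data.frame(
  Subject = c("S1", "S2", "S3", "S4"),
  Diagnosis = c("target", "versus", "versus", "target"),
  Score = c(1.8, -0.9, -2.1, 0.3)
)

# Made-up known classes; class "A" plays the role of the target class.
knownClasses <- data.frame(
  Subject = c("S1", "S2", "S3", "S4"),
  Class = c("A", "B", "B", "B")
)

# Join on 'Subject' and translate the known class into the expected label.
merged <- merge(toyDiagnosis, knownClasses, by = "Subject")
merged$Expected <- ifelse(merged$Class %in% "A", "target", "versus")

# Cross-tabulate expected against assigned labels.
confusion <- table(Expected = merged$Expected, Diagnosis = merged$Diagnosis)
print(confusion)

# Fraction of subjects whose assigned label matches the expected one.
accuracy <- mean(merged$Expected == merged$Diagnosis)
print(accuracy)
```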

Examples

## Locate and read the example dataset shipped with the package
requiredDataFile <- system.file("extdata", "test_dataset_beta_clean.dat",
		package="MiRNAQCD")
myDataFrame <- read.table(file=requiredDataFile, header=TRUE)

## Locate and read the example diagnostic threshold
requiredThresholdFile <- system.file("extdata", "test_dataset_alpha_threshold.txt",
		package="MiRNAQCD")
thresholdDataFrame <- read.table(file=requiredThresholdFile, header=TRUE)

## miRNAs used by the classifier and their coefficients, in the same order
mirnaToUse <- c("FX", "FZ")
coefficientsToUse <- c(1.0, -1.0)

## Classification
classifiedDataset <- miRNA_diagnosis(myDataFrame, mirnaToUse, coefficientsToUse,
				thresholdDataFrame)

[Package MiRNAQCD version 1.1.3 Index]