chemoDivCheck {chemodiv} | R Documentation |
Check data formatting
Description
Function to check that the datasets used by other functions in the chemodiv package are correctly formatted.
Usage
chemoDivCheck(sampleData, compoundData)
Arguments
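sampleData | Data frame with the relative concentration of each compound (column) in every sample (row). |
compoundData | Data frame with the compounds in sampleData as rows, giving the compound name, SMILES and InChIKey of each compound in three columns (see Details). |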
Details
The function performs a number of checks on the two main datasets used as input data, to make sure they are formatted in a way suitable for the other functions in the package. This should make it easier for users to construct the datasets correctly before starting analyses.
Two datasets are needed to use the full set of analyses included in
the package, and these can be checked for formatting issues.
The first dataset should contain data on the proportions
of different compounds (columns) in different samples (rows).
Note that all calculations of diversity, and most calculations of
dissimilarity, are only performed on relative, rather than absolute,
values. The second dataset should be a data frame with three
columns containing the compound name, SMILES and InChIKey of
all the compounds present in the first dataset. See chemodiv
for details on obtaining SMILES and InChIKey IDs.
Avoid including Greek letters or other special characters in the
compound names.
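The layout described above can be sketched in base R as follows. This is only an illustration under stated assumptions: the absolute amounts are invented, the column names compound, smiles and inchikey are assumed rather than taken from this page, and the three logical checks at the end are hypothetical stand-ins, not the checks that chemoDivCheck itself performs.

```r
# Hypothetical absolute amounts: 3 samples (rows) x 3 compounds (columns).
absData <- data.frame(
  limonene     = c(10,  0,  5),
  linalool     = c(30, 20,  5),
  benzaldehyde = c(60, 20, 10)
)

# Convert to proportions so that every row (sample) sums to 1.
sampleData <- absData / rowSums(absData)

# One row per compound, in the same order as the columns of sampleData.
# The column names used here are an assumption for illustration.
compoundData <- data.frame(
  compound = c("limonene", "linalool", "benzaldehyde"),
  smiles   = c("CC1=CCC(CC1)C(=C)C",
               "CC(=CCCC(C)(C=C)O)C",
               "C1=CC=C(C=C1)C=O"),
  inchikey = c("XMGQYMWWDOXHJM-UHFFFAOYSA-N",
               "CDOSHBSSFJOMGT-UHFFFAOYSA-N",
               "HUMNYLRZRPPJDN-UHFFFAOYSA-N"),
  stringsAsFactors = FALSE
)

# Illustrative sanity checks (hypothetical, not the package's own):
# 1. each sample is expressed as proportions
rowsSumToOne <- all(abs(rowSums(sampleData) - 1) < 1e-8)
# 2. compounds are listed in the same order in both datasets
namesMatch <- identical(colnames(sampleData), compoundData$compound)
# 3. compound names contain only printable ASCII characters
asciiOnly <- !any(grepl("[^ -~]", compoundData$compound, perl = TRUE))
```

Reordering the rows of compoundData, as in the second call under Examples, would make namesMatch FALSE in this sketch, which is presumably the kind of mismatch that call is meant to flag.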
Value
One or more messages pointing out problems with data formatting, or a message stating that the datasets appear to be correctly formatted.
Examples
data(minimalSampData)
data(minimalCompData)
chemoDivCheck(minimalSampData, minimalCompData) # Correct format
chemoDivCheck(minimalSampData, minimalCompData[c(2,3,1),]) # Incorrect format
data(alpinaSampData)
data(alpinaCompData)
chemoDivCheck(sampleData = alpinaSampData, compoundData = alpinaCompData)