R: Check data formatting

chemoDivCheck {chemodiv}

R Documentation

Check data formatting

Description

Function to check that the datasets used by other functions in the chemodiv package are correctly formatted.

Usage

chemoDivCheck(sampleData, compoundData)

Arguments

`sampleData`	Data frame with the relative concentration of each compound (column) in every sample (row).
`compoundData`	Data frame with the compounds in `sampleData` as rows. Should have a column named "compound" with common names of the compounds, a column named "smiles" with SMILES IDs of the compounds, and a column named "inchikey" with the InChIKey IDs for the compounds.

Details

The function performs a number of checks on the two main datasets used as input data, to make sure they are formatted in a way suitable for the other functions in the package. This should make it easier for users to construct the datasets correctly before starting analyses.

Two datasets are needed to use the full set of analyses included in the package, and both can be checked for formatting issues. The first dataset should contain the proportions of the different compounds (columns) in the different samples (rows). Note that all calculations of diversity, and most calculations of dissimilarity, are performed only on relative, rather than absolute, values. The second dataset should be a data frame with three columns containing the common name, SMILES and InChIKey ID of every compound present in the first dataset. See chemodiv for details on obtaining SMILES and InChIKey IDs. Avoid including Greek letters or other special characters in the compound names.
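
As a minimal sketch of the two datasets described above, the following base R code builds a small sample table from made-up absolute concentrations, converts it to row-wise proportions, and builds a matching compound table. The object name absData and all concentration values are hypothetical, and the SMILES and InChIKey strings are included for illustration only and should be verified against a chemical database before use. Keeping the compound rows in the same order as the sample columns is an assumption based on the reordered example under Examples, which is flagged as incorrectly formatted.

```r
# Hypothetical absolute concentrations: 3 samples (rows) x 3 compounds (columns)
absData <- data.frame(
  limonene     = c(10, 0, 5),
  linalool     = c(30, 20, 5),
  benzaldehyde = c(60, 20, 10)
)

# Convert to relative concentrations so that every row sums to 1
sampleData <- absData / rowSums(absData)

# Compound table: one row per column of sampleData, in the same order
# (identifiers are illustrative; verify them before real use)
compoundData <- data.frame(
  compound = c("limonene", "linalool", "benzaldehyde"),
  smiles   = c("CC1=CCC(CC1)C(=C)C",
               "CC(=CCCC(C)(C=C)O)C",
               "C1=CC=C(C=C1)C=O"),
  inchikey = c("XMGQYMWWDOXHJM-UHFFFAOYSA-N",
               "CDOSHBSSFJOMGT-UHFFFAOYSA-N",
               "HUMNYLRZRPPJDN-UHFFFAOYSA-N")
)

# Quick manual sanity checks before handing the data to the package
stopifnot(all(abs(rowSums(sampleData) - 1) < 1e-8))
stopifnot(identical(colnames(sampleData), compoundData$compound))

# Run the package's own check only if chemodiv is installed
if (requireNamespace("chemodiv", quietly = TRUE)) {
  chemodiv::chemoDivCheck(sampleData, compoundData)
}
```

The manual checks here are not a substitute for chemoDivCheck; they only cover the two properties stated in the text (relative values, and the same compounds in both datasets).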

Value

One or several messages pointing out problems with data formatting, or a message stating that the datasets appear to be correctly formatted.

Examples

data(minimalSampData)
data(minimalCompData)
chemoDivCheck(minimalSampData, minimalCompData) # Correct format
chemoDivCheck(minimalSampData, minimalCompData[c(2, 3, 1), ]) # Incorrect format (compound rows reordered)

data(alpinaSampData)
data(alpinaCompData)
chemoDivCheck(sampleData = alpinaSampData, compoundData = alpinaCompData)

[Package chemodiv version 0.3.0 Index]