contax.trim {microcontax} | R Documentation |
The ConTax data set
Description
The trimmed version of the ConTax data set.
Usage
data(contax.trim)
Details
contax.trim is a data.frame object containing 38 781 full-length 16S rRNA sequences. It is the trimmed version of the full data set (see below). Large taxa (those with many sequences) have been trimmed as described in Vinje et al. (2016) to obtain a data set with a more even representation of the prokaryotic taxonomy.
contax.full is the full consensus taxonomy data set as described in Vinje et al. (2016). That data set is too large for CRAN and is therefore available as a separate package, microcontax.data. See the example below for how to obtain contax.full.
The Header of every sequence starts with a unique tag, in this case the text "ConTax" and some integer. This is followed by a token describing the origin of the sequence. It is typically "Intersection=SRG", meaning the sequence is found in all three of the Silva, RDP and Greengenes data repositories. The intersection can also be SR, SG or RG if the sequence was found in only two repositories. The taxonomy information for each sequence is found in the third token. It follows a commonly used format:
"k__<...>;p__<...>;c__<...>;o__<...>;f__<...>;g__<...>;"
where <...> is the taxon name at that level. Each letter, followed by a double underscore, refers to one of the taxonomic levels Domain (Kingdom), Phylum, Class, Order, Family and Genus. Here is an example of a valid string:
"k__Bacteria;p__Firmicutes;c__Bacilli;o__Bacillales;f__Staphylococcaceae;g__Staphylococcus;"
As long as this format is used, the taxonomy information can be extracted by the supplied extractor functions getDomain, getPhylum, ..., getGenus.
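The format described above can also be parsed with base R alone. The following is a minimal sketch, not the package's own implementation: extract_rank is a hypothetical helper, the tag "ConTax1" is made up for illustration, and the tokens of the Header are assumed to be separated by single spaces.

```r
# Hypothetical helper (not part of microcontax): pull one rank out of
# Header texts containing "k__...;p__...;c__...;o__...;f__...;g__...;".
extract_rank <- function(header, prefix) {
  # "<prefix>__" must start the text or follow a non-letter (space or ";"),
  # and the taxon name runs up to the next semicolon.
  pattern <- paste0("(^|[^A-Za-z])", prefix, "__([^;]*);")
  hit <- regmatches(header, regexec(pattern, header))
  # Each hit holds: full match, the leading delimiter, the taxon name.
  vapply(hit, function(m) if (length(m) >= 3) m[3] else NA_character_,
         character(1))
}

# Made-up Header; the space separator between tokens is an assumption.
header <- paste(
  "ConTax1", "Intersection=SRG",
  "k__Bacteria;p__Firmicutes;c__Bacilli;o__Bacillales;f__Staphylococcaceae;g__Staphylococcus;"
)

ranks <- c(Domain = "k", Phylum = "p", Class = "c",
           Order = "o", Family = "f", Genus = "g")
taxonomy <- vapply(ranks, function(p) extract_rank(header, p), character(1))
print(taxonomy)
```

A Header that lacks the requested rank yields NA rather than an error, which makes it easy to spot strings that do not follow the format. For real work the supplied extractor functions are the natural choice.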
Author(s)
Hilde Vinje, Kristian Hovde Liland, Lars Snipen.
See Also
medoids, getDomain, contax.full.
Examples
data(contax.trim)
dim(contax.trim)
# Write to FASTA-file
## Not run:
writeFasta(contax.trim,out.file="ConTax_trim.fasta")
# Install microcontax.data with the BIG contax.full data set
if (!requireNamespace("microcontax.data", quietly = TRUE)) {
  install.packages("microcontax.data")
}
# Load data
data("contax.full", package = "microcontax.data")
## End(Not run)