R: TRAMPknowns Objects

TRAMPknowns {TRAMPR}

R Documentation

TRAMPknowns Objects

Description

These functions create and interact with TRAMPknowns objects (collections of known TRFLP patterns). Knowns contrast with “samples” (see TRAMPsamples) in that knowns contain identified profiles, while samples contain unidentified profiles. Knowns must have at most one peak per enzyme/primer combination (see Details).

Usage

TRAMPknowns(data, info, cluster.pars=list(), file.pat=NULL,
            warn.factors=TRUE, ...)


## S3 method for class 'TRAMPknowns'
labels(object, ...)
## S3 method for class 'TRAMPknowns'
summary(object, include.info=FALSE, ...)

Arguments

`data`	data.frame containing peak information.
`info`	data.frame describing individual knowns (see Details for definitions of both data.frames).
`cluster.pars`	Parameters used when clustering the knowns database. See Details.
`file.pat`	Optional partial filename in which to store knowns database after modification. Files `<file.pat>_info.csv` and `<file.pat>_data.csv` will be created.
`warn.factors`	Logical: Should a warning be given if any columns in `info` or `data` are converted into factors?
`object`	A `TRAMPknowns` object.
`include.info`	Logical: Should the output be augmented with the contents of the `info` component of the `TRAMPknowns` object?
`...`	`TRAMPknowns`: Additional objects to incorporate into a `TRAMPknowns` object. Other methods: Further arguments passed to or from other methods.

Details

The object has at least two components, which relate to each other (in the sense of a relational database). info holds information about the individual knowns, and data holds information about individual peaks (several of which may belong to a single known).

Column definitions:

info:

knowns.pk:
Unique positive integer, used to identify individual knowns (i.e. a “primary key”).

species:
Character, giving species name.

data:

knowns.fk:
Positive integer, indicating which known the peak belongs to, by matching against info$knowns.pk (i.e. a “foreign key”).

primer:
Character, giving the name of the primer used.

enzyme:
Character, giving the name of the restriction digest enzyme used.

size:
Numeric, giving size (in base pairs) of the peak.

In addition, TRAMPknowns will create additional columns holding clustering information (see group.knowns). Additional columns are allowed (and retained, but ignored) in both data.frames. Additional objects are allowed as part of the TRAMPknowns object, but these will not be written by write.TRAMPknowns; any extra objects passed (via ...) will be included in the final TRAMPknowns object.
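The primary key / foreign key relationship in the column definitions above can be checked before constructing an object. The following is a minimal base-R sketch on made-up data; the names toy.info and toy.data are hypothetical, and these checks are illustrative rather than the validation that TRAMPknowns itself performs.

```r
## Hypothetical toy data following the column definitions above.
toy.info <- data.frame(knowns.pk = 1:3,
                       species = c("Species a", "Species b", "Species c"),
                       stringsAsFactors = FALSE)
toy.data <- data.frame(knowns.fk = c(1, 1, 2, 3, 3),
                       primer = "ITS1F",
                       enzyme = c("BsuRI", "HpyCH4IV", "BsuRI",
                                  "BsuRI", "HpyCH4IV"),
                       size = c(120.5, 310, 98, 455.2, 602),
                       stringsAsFactors = FALSE)

## The primary key must be unique and positive.
pk.ok <- !anyDuplicated(toy.info$knowns.pk) && all(toy.info$knowns.pk > 0)

## Every foreign key must match a primary key.
fk.ok <- all(toy.data$knowns.fk %in% toy.info$knowns.pk)

## Peaks belonging to known 1 (a one-to-many lookup).
peaks.1 <- toy.data[toy.data$knowns.fk == 1, ]
peaks.1
```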

The cluster.pars argument controls how knowns will be clustered (this will happen automatically as needed). Elements of the list cluster.pars may be any of the three arguments to group.knowns (dist.method, hclust.method and cut.height), and will be used as defaults in subsequent calls to group.knowns. If not provided, the default values are dist.method="maximum", hclust.method="complete" and cut.height=2.5; if only some elements of cluster.pars are provided, the remaining elements take these defaults. To change the clustering parameters of an existing TRAMPknowns object, use group.knowns.
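The way a partial cluster.pars list falls back on the documented defaults can be pictured with base R. This sketch uses modifyList, which is an assumption made for illustration; how the package merges the values internally is not stated here.

```r
## Documented defaults for the three clustering parameters.
default.pars <- list(dist.method = "maximum",
                     hclust.method = "complete",
                     cut.height = 2.5)

## A partial specification, as might be passed via cluster.pars.
user.pars <- list(cut.height = 4)

## modifyList() keeps the default for any element the user did not supply.
effective.pars <- modifyList(default.pars, user.pars)
str(effective.pars)
```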

A known contains at most one peak per enzyme/primer combination. Where a species is known to have multiple TRFLP profiles, these should be treated as separate knowns with different, unique, knowns.pk values, but with identical species values. A sample containing either pattern will then be recorded as having that species present (see group.knowns).
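A peak table can be screened for violations of the one-peak-per-combination rule before it is passed to TRAMPknowns. The sketch below uses made-up data and hypothetical object names; the fix it shows (moving the second profile to a new known with the same species name) follows the advice in the paragraph above.

```r
## Hypothetical peak table: known 2 has two peaks for ITS1F/BsuRI.
toy.data <- data.frame(knowns.fk = c(1, 1, 2, 2, 2),
                       primer = c("ITS1F", "ITS4", "ITS1F", "ITS1F", "ITS4"),
                       enzyme = "BsuRI",
                       size = c(120, 340, 98, 105, 410),
                       stringsAsFactors = FALSE)

key.cols <- c("knowns.fk", "primer", "enzyme")

## Flag every row that shares a known/primer/enzyme combination
## with another row.
dup <- duplicated(toy.data[key.cols]) |
  duplicated(toy.data[key.cols], fromLast = TRUE)
toy.data[dup, ]

## One way to resolve it: treat the second profile as a separate known
## (knowns.pk 3 here) that carries the same species name in info.
toy.data$knowns.fk[4] <- 3
toy.info <- data.frame(knowns.pk = 1:3,
                       species = c("Species a", "Species b", "Species b"),
                       stringsAsFactors = FALSE)
```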

Value

`TRAMPknowns`	A new `TRAMPknowns` object: a list with components `info`, `data` (the provided data.frames, with clustering information added to `info`), `cluster.pars` and `file.pat`, plus any extra objects passed as `...`.
`labels.TRAMPknowns`	A sorted vector of the unique knowns present in `object` (from `info$knowns.pk`).
`summary.TRAMPknowns`	A data.frame, with the size of the peak (if present) for each enzyme/primer combination, with each known (indicated by `knowns.pk`) as rows and each combination (in the format `<primer>_<enzyme>`) as columns.
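The layout described for summary.TRAMPknowns (knowns as rows, `<primer>_<enzyme>` combinations as columns) can be approximated in base R. This is only a sketch of the shape on made-up data, not the method's implementation; in particular, using NA for an absent peak is an assumption.

```r
## Hypothetical peak table with some combinations missing.
toy.data <- data.frame(knowns.fk = c(1, 1, 2, 3),
                       primer = c("ITS1F", "ITS4", "ITS1F", "ITS4"),
                       enzyme = "BsuRI",
                       size = c(120, 340, 98, 410),
                       stringsAsFactors = FALSE)

## Column labels in the <primer>_<enzyme> format.
combo <- paste(toy.data$primer, toy.data$enzyme, sep = "_")

## With at most one peak per combination, each cell holds a single size;
## combinations with no peak are left as NA.
wide <- tapply(toy.data$size, list(toy.data$knowns.fk, combo),
               function(s) s[1])
wide <- as.data.frame(wide)
wide
```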

Note

Across a TRAMPknowns object, primer and enzyme names must be exactly the same (including case and whitespace) to be considered the same. For example "ITS4", "Its4", "ITS 4" and "ITS4 " would be considered to be four different primers.
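The effect of case and whitespace differences is easy to see in base R. The clean-up step below is one possible normalisation, offered as an assumption rather than something the package does; note that upper-casing would also alter mixed-case names such as "BsuRI", so any such rule needs to be applied consistently to every table.

```r
primers <- c("ITS4", "Its4", "ITS 4", "ITS4 ")

## All four strings are distinct, so they would count as four primers.
length(unique(primers))

## One possible normalisation: drop all whitespace, then upper-case.
clean <- toupper(gsub("[[:space:]]+", "", primers))
unique(clean)
```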

Factors will not merge correctly (with combine.TRAMPknowns or add.known). TRAMPknowns will attempt to catch factor columns and convert them into characters for the info and data data.frames. Other objects (passed as part of ...) will not be altered.
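For objects that TRAMPknowns does not convert (or to avoid the warning altogether), factor columns can be turned into character columns beforehand. A minimal base-R sketch on a made-up data.frame, with hypothetical names:

```r
## stringsAsFactors = TRUE is set explicitly to force a factor column.
toy.info <- data.frame(knowns.pk = 1:2,
                       species = c("Species a", "Species b"),
                       stringsAsFactors = TRUE)
is.factor(toy.info$species)

## Convert every factor column to character, leaving other columns alone.
factor.cols <- vapply(toy.info, is.factor, logical(1))
toy.info[factor.cols] <- lapply(toy.info[factor.cols], as.character)
str(toy.info)
```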

Examples

## This example builds a TRAMPknowns object from completely artificial
## data:

## The info data.frame:
knowns.info <-
  data.frame(knowns.pk=1:8,
             species=rep(paste("Species", letters[1:5]), length=8))
knowns.info

## The data data.frame:
knowns.data <- expand.grid(knowns.fk=1:8,
                           primer=c("ITS1F", "ITS4"),
                           enzyme=c("BsuRI", "HpyCH4IV"))
knowns.data$size <- runif(nrow(knowns.data), min=40, max=800)

## Construct the TRAMPknowns object:
demo.knowns <- TRAMPknowns(knowns.data, knowns.info, warn.factors=FALSE)

## A plot of the pretend knowns:
plot(demo.knowns, cex=1, group.clusters=TRUE)
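
The labels and summary methods documented above can be tried on the same artificial data. This sketch rebuilds the object so that it stands alone, assumes the TRAMPR package is installed (it is skipped otherwise), and uses an arbitrary seed so that the random sizes are repeatable.

```r
if (requireNamespace("TRAMPR", quietly = TRUE)) {
  library(TRAMPR)
  set.seed(1)  # arbitrary seed, chosen only for repeatability

  knowns.info <- data.frame(knowns.pk = 1:8,
                            species = rep(paste("Species", letters[1:5]),
                                          length.out = 8),
                            stringsAsFactors = FALSE)
  knowns.data <- expand.grid(knowns.fk = 1:8,
                             primer = c("ITS1F", "ITS4"),
                             enzyme = c("BsuRI", "HpyCH4IV"),
                             stringsAsFactors = FALSE)
  knowns.data$size <- runif(nrow(knowns.data), min = 40, max = 800)

  demo.knowns <- TRAMPknowns(knowns.data, knowns.info)

  ## The knowns.pk values present in the object:
  print(labels(demo.knowns))

  ## Peak sizes, one row per known and one column per primer/enzyme pair:
  print(summary(demo.knowns))

  ## The same table, augmented with the contents of info:
  print(summary(demo.knowns, include.info = TRUE))
}
```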

[Package TRAMPR version 1.0-10 Index]