R: TRFLP Analysis and Matching Program

TRAMP {TRAMPR}

R Documentation

TRFLP Analysis and Matching Program

Description

Determine if TRFLP profiles may match those in a database of knowns. The resulting object can be used to produce a presence/absence matrix of known profiles in environmental samples.

The TRAMPR package contains a vignette, which includes a worked example; type vignette("TRAMPRdemo") to view it.

Usage

TRAMP(samples, knowns, accept.error=1.5, min.comb=4, method="maximum")

Arguments

`samples`	A `TRAMPsamples` object, containing unidentified samples.
`knowns`	A `TRAMPknowns` object, containing identified TRFLP patterns.
`accept.error`	The largest acceptable difference (in base pairs) between any peak in the sample data and the knowns database (see Details; interpretation will depend on the value of `method`).
`min.comb`	Minimum number of enzyme/primer combinations required before presence will be tested. The default (4) should be reasonable in most cases. Setting `min.comb` to `NA` will require that all enzyme/primer combinations in the knowns database are present in the samples.
`method`	Method used in calculating the difference between samples and knowns; may be one of `"maximum"`, `"euclidian"` or `"manhattan"` (or any unambiguous abbreviation).
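
As a rough illustration of what "any unambiguous abbreviation" means for `method`, the base-R sketch below uses `match.arg`. It assumes TRAMP resolves abbreviations with standard partial matching, which this page does not confirm; `resolve_method` is a hypothetical helper, not part of TRAMPR.

```r
# Hypothetical helper (not a TRAMPR function): resolves an abbreviated
# method name the way base R's match.arg() does.
resolve_method <- function(method = c("maximum", "euclidian", "manhattan")) {
  match.arg(method)
}

resolve_method()        # no argument: falls back to the first choice
resolve_method("eu")    # unique prefix of "euclidian"
resolve_method("max")   # unique prefix of "maximum"
resolve_method("man")   # unique prefix of "manhattan"

# "ma" is a prefix of both "maximum" and "manhattan", so it is ambiguous
# and match.arg() signals an error.
ambiguous <- try(resolve_method("ma"), silent = TRUE)
inherits(ambiguous, "try-error")
```

Under this assumption, at least three letters are needed to tell `"maximum"` and `"manhattan"` apart.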

Details

TRAMP attempts to determine which species in the ‘knowns’ database may be present in a collection of samples.

A sample matches a known if it has a peak that is “close enough” to every peak in the known for every enzyme/primer combination that they share. By default, a match is accepted when the largest distance between a peak in the knowns database and the sample is less than accept.error base pairs (default 1.5), and when at least min.comb enzyme/primer combinations are shared between the sample and the known (default 4).

The three-dimensional array of match errors is generated by create.diffsmatrix. In the resulting array, m[i,j,k] is the difference (in base pairs) between the ith sample and the jth known for the kth enzyme/primer combination.
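
The array layout and the default matching rule described above can be sketched in base R with invented numbers. This is not output from create.diffsmatrix. The use of NA for unshared combinations, the call to abs(), and the lowered min.comb are assumptions made only for this toy example.

```r
# Toy differences array (invented values, in base pairs):
# 2 samples x 2 knowns x 3 enzyme/primer combinations.
# Assumption: NA marks a combination that a sample/known pair does not share.
m <- array(c(0.4, 3.0, 1.2, NA,    # combination c1
             0.9, 0.2, 0.5, 0.1,   # combination c2
             1.1, 0.3, NA,  0.6),  # combination c3
           dim = c(2, 2, 3),
           dimnames = list(c("s1", "s2"), c("k1", "k2"), c("c1", "c2", "c3")))

m["s2", "k1", "c1"]   # difference for sample s2, known k1, combination c1

# Number of shared combinations for each sample/known pair.
n <- apply(!is.na(m), c(1, 2), sum)

# Maximum absolute difference over the shared combinations.
err <- apply(abs(m), c(1, 2), max, na.rm = TRUE)

# Toy thresholds: min.comb is lowered to 2 because only 3 combinations exist.
accept.error <- 1.5
min.comb <- 2
presence <- err < accept.error & n >= min.comb
presence
```

Here s2 is not matched to k1 because one of its differences (3.0) exceeds the threshold, even though the other two are small.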

If p_k and q_k are the sizes of peaks for the kth enzyme/primer combination for a sample and known (respectively), then maximum distance is defined as

\max(|p_k - q_k|)

Euclidian distance is defined as

\frac{1}{n}\sqrt{\sum (p_k - q_k)^2}

and Manhattan distance is defined as

\frac{1}{n}\sum{|p_k - q_k|}

where n is the number of shared enzyme/primer combinations, since this may vary across sample/known combinations. For Euclidian and Manhattan distances, accept.error is therefore a threshold on the mean distance, rather than on the total distance.
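
The three definitions can be checked by hand for a single sample/known pair. The sketch below uses invented peak sizes and plain base R, not TRAMPR internals. The vectors p and q and the use of NA for a missing combination are assumptions for illustration.

```r
# Invented peak sizes (base pairs) for one sample (p) and one known (q)
# across four enzyme/primer combinations.
# Assumption: NA means the combination is absent from the sample.
p <- c(100.5, 249.0, NA,    311.5)
q <- c(100.0, 250.0, 180.0, 310.0)

# Only combinations present in both contribute to the distance.
shared <- !is.na(p) & !is.na(q)
d <- p[shared] - q[shared]
n <- sum(shared)

maximum   <- max(abs(d))          # largest single difference
euclidian <- sqrt(sum(d^2)) / n   # (1/n) * sqrt(sum of squared differences)
manhattan <- sum(abs(d)) / n      # (1/n) * sum of absolute differences

c(maximum = maximum, euclidian = euclidian, manhattan = manhattan)
```

With these numbers the differences are 0.5, -1 and 1.5 over n = 3 shared combinations, so the maximum distance is 1.5 and the Manhattan distance is 1.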

Value

A TRAMP object, with elements:

`presence`	Presence/absence matrix. Rows are different samples (with rownames from `labels(samples)`) and columns are different knowns (with colnames from `labels(knowns)`). Do not access the presence/absence matrix directly, but use `summary.TRAMP`, which provides options for labelling knowns, grouping knowns, and excluding “ignored” matches.
`error`	Matrix of distances between the samples and knowns, calculated by one of the methods described above. Rows correspond to different samples, and columns correspond to different knowns. The matrix dimension names are set to the values `sample.pk` and `knowns.pk` for the samples and knowns, respectively.
`n`	A two-dimensional matrix (same dimensions as `error`), recording the number of enzyme/primer combinations present for each combination of samples and knowns.
`diffsmatrix`	Three-dimensional array of output from `create.diffsmatrix`.
`enzyme.primer`	Different enzyme/primer combinations present in the data, in the order of the third dimension of `diffsmatrix` (see `create.diffsmatrix` for details).
`samples`, `knowns`, `accept.error`, `min.comb`, `method`	The input data objects and arguments, unmodified.

In addition, an element presence.ign is included to allow matches to be ignored. However, this interface is experimental and its current format should not be relied on; use remove.TRAMP.match rather than interacting directly with presence.ign.

Matching is based only on peak size (in base pairs), and does not consider peak heights.

Examples

data(demo.knowns)
data(demo.samples)

res <- TRAMP(demo.samples, demo.knowns)

## The resulting object can be interrogated with methods:

## The goodness of fit of the sample with sample.pk=101 (see
## ?plot.TRAMP).
plot(res, 101)

## Not run: 
## To see all plots (this produces many figures), one after another.
op <- par(ask=TRUE)
plot(res)
par(op)

## End(Not run)

## Produce a presence/absence matrix (see ?summary.TRAMP).
m <- summary(res)
head(m)

[Package TRAMPR version 1.0-10 Index]