R: Melting curve analysis with (iterative) Tm identification and...

meltcurve {qpcR}

R Documentation

Melting curve analysis with (iterative) Tm identification and peak area calculation/cutoff

Description

This function conducts a melting curve analysis on the melting curve data of a real-time qPCR instrument. The data must be formatted so that each column of temperature values has a corresponding column of fluorescence values. See edit(dyemelt) for a proper format. The output is a graph displaying the raw fluorescence curve (black), the first derivative curve (red) and the identified melting peaks. The original data together with the results (-\frac{\partial F}{\partial T} values, T_m values) are returned as a list. An automatic optimization procedure is also implemented: it iterates over span.smooth and span.peaks values and finds the parameter combination that yields the minimum residual sum-of-squares between the identified T_m values and known T_m values. Peak areas can be calculated for all peaks, and only those peaks with areas above a given cutoff (cut.Area) are kept. If no peak meets the cutoff, the melting curve is flagged with a 'bad' attribute. See 'Details'.
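The paired-column layout described above can be illustrated with a small synthetic data frame. This is only a sketch: the column names and the logistic-shaped fluorescence values are hypothetical and are not taken from dyemelt.

```r
# Hypothetical two-sample melt data; names and values are illustrative only.
temp <- seq(70, 90, by = 0.5)
melt <- data.frame(
  Temp1 = temp, Fluo1 = 1 / (1 + exp(temp - 80)),
  Temp2 = temp, Fluo2 = 1 / (1 + exp(temp - 84))
)

# Column pairing that is assumed when temps and fluos are left as NULL:
# odd columns hold temperatures, even columns hold fluorescence values.
temps <- seq(1, ncol(melt), by = 2)
fluos <- seq(2, ncol(melt), by = 2)
```

A data frame in this shape could then be passed as the data argument, with temps and fluos given explicitly only when the columns are arranged differently.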

Usage

meltcurve(data, temps = NULL, fluos = NULL, window = NULL, 
          norm = FALSE, span.smooth = 0.05, span.peaks = 51, 
          is.deriv = FALSE, Tm.opt = NULL, Tm.border = c(1, 1), 
          plot = TRUE, peaklines = TRUE, calc.Area = TRUE, 
          plot.Area = TRUE, cut.Area = 0,...)

Arguments

`data`	a dataframe containing the temperature and fluorescence data.
`temps`	a vector of column numbers reflecting the temperature values. If `NULL`, they are assumed to be 1, 3, 5, ... .
`fluos`	a vector of column numbers reflecting the fluorescence values. If `NULL`, they are assumed to be 2, 4, 6, ... .
`window`	a user-defined window for the temperature region to be analyzed. See 'Details'.
`norm`	logical. If `TRUE`, the fluorescence values are scaled into the interval [0, 1].
`span.smooth`	the window span for curve smoothing. Can be tweaked to optimize `T_m` identification.
`span.peaks`	the window span for peak identification. Can be tweaked to optimize `T_m` identification. Must be an odd number.
`is.deriv`	logical. Use `TRUE`, if `data` is already in first derivative transformed format.
`Tm.opt`	a possible vector of known `T_m` values to optimize `span.smooth` and `span.peaks` against. See 'Details' and 'Examples'.
`Tm.border`	for peak area calculation, a vector containing left and right border temperature values from the `T_m` values. Default is -1/+1 °C.
`plot`	logical. If `TRUE`, a plot with the raw melting curve, derivative curve and identified `T_m` values is displayed for each sample.
`peaklines`	logical. If `TRUE`, lines that show the identified peaks are plotted.
`calc.Area`	logical. If `TRUE`, all peak areas are calculated.
`plot.Area`	logical. If `TRUE`, the baselined area identified for the peaks is plotted by filling the peaks in red.
`cut.Area`	a peak area value to identify only those peaks with a higher area.
`...`	other parameters to be passed to `plot`.

Details

The melting curve analysis is conducted with the following steps:

1a) Temperature and fluorescence values are selected in a region according to window.
1b) If norm = TRUE, the fluorescence data is scaled into [0, 1] by qpcR:::rescale.
Then, the function qpcR:::TmFind conducts the following steps:
2a) A cubic spline function (splinefun) is fit to the raw fluorescence melt values.
2b) The first derivative values are calculated from the spline function for each of the temperature values.
2c) Friedman's supersmoother (supsmu) is applied to the first derivative values.
2d) Melting peak (T_m) values are identified by qpcR:::peaks.
2e) Raw melt data, first derivative data, best parameters, residual sum-of-squares and identified T_m values are returned.
Peak areas are then calculated by qpcR:::peakArea:
3a) A linear regression curve is fit from the leftmost temperature value (T_m - Tm.border[1]) to the rightmost temperature value (T_m + Tm.border[2]) by lm.
3b) A baseline curve is calculated from the regression coefficients by predict.lm.
3c) The baseline data is subtracted from the first derivative melt data (baselining).
3d) A splinefun is fit to the baselined data.
3e) The area of this spline function is integrated from the leftmost to rightmost temperature value.
4) If calculated peak areas were below cut.Area, the corresponding T_m values are removed.
Finally,
5) A matrix of xyy-plots is displayed using qpcR:::xyy.plot.
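
Steps 2a) to 2d) and 3a) to 3e) can be approximated with base R alone. The following is a minimal sketch on a synthetic curve, not the package's internal code: find_peaks is a hypothetical stand-in for qpcR:::peaks, and fitting the baseline through the two border points is an assumed reading of step 3a).

```r
# Synthetic melting curve with one transition at 80 degrees C (assumed data).
temp <- seq(70, 90, by = 0.2)
fluo <- 1 / (1 + exp(1.5 * (temp - 80)))

# 2a/2b) Cubic spline fit, then the negative first derivative -dF/dT.
sp   <- splinefun(temp, fluo)
dFdT <- -sp(temp, deriv = 1)

# 2c) Friedman's supersmoother applied to the derivative values.
sm <- supsmu(temp, dFdT, span = 0.05)$y

# 2d) Hypothetical peak finder: a point is a peak if it is the maximum
#     of a centered window of odd width 'span'.
find_peaks <- function(y, span = 11) {
  half  <- span %/% 2
  inner <- seq_along(y)
  inner <- inner[inner > half & inner <= length(y) - half]
  keep  <- vapply(inner, function(i) y[i] == max(y[(i - half):(i + half)]),
                  logical(1))
  inner[keep]
}
pk <- find_peaks(sm, span = 11)
tm <- temp[pk][which.max(sm[pk])]   # tallest peak taken as T_m

# 3a/3b) Baseline: line through the two border points (assumption), via lm.
border <- c(1, 1)
win  <- temp >= tm - border[1] & temp <= tm + border[2]
xw   <- temp[win]
yw   <- sm[win]
ends <- data.frame(x = xw[c(1, length(xw))], y = yw[c(1, length(yw))])
base_fit <- lm(y ~ x, data = ends)
baseline <- predict(base_fit, newdata = data.frame(x = xw))

# 3c-3e) Subtract the baseline, fit a spline, integrate over the window.
yb       <- yw - baseline
area_fun <- splinefun(xw, yb)
area     <- integrate(area_fun, min(xw), max(xw))$value
```

In this sketch, a peak would be dropped in step 4) if area were below the chosen cut.Area.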

is.deriv must be set to TRUE if the exported data was already transformed to -\frac{\partial F}{\partial T} by the PCR system (e.g. Stratagene MX3000P).

If values are given to Tm.opt (see 'Examples'), then meltcurve is iterated over all combinations of span.smooth = seq(0, 0.2, by = 0.01) and span.peaks = seq(11, 201, by = 10). For each iteration, T_m values are calculated and compared to the known values by computing the residual sum-of-squares between Tm.opt and the T_m values obtained during the iteration:

RSS = \sum_{i=1}^n{(Tm_i - Tm.opt_i)^2}
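
The parameter grid and the RSS criterion can be written out directly. This sketch only enumerates the combinations and evaluates the formula above; the rss helper and the example T_m values are hypothetical, and it assumes that the identified and known T_m vectors have the same length and order.

```r
# All combinations of the two span parameters named above.
grid <- expand.grid(span.smooth = seq(0, 0.2, by = 0.01),
                    span.peaks  = seq(11, 201, by = 10))
n_combinations <- nrow(grid)   # 21 * 20 combinations

# Residual sum-of-squares between identified and known T_m values.
# Assumes both vectors have equal length and matching order.
rss <- function(tm_found, tm_known) {
  stopifnot(length(tm_found) == length(tm_known))
  sum((tm_found - tm_known)^2)
}

# Hypothetical identified values compared against known ones.
example_rss <- rss(c(77.0, 80.3, 82.4, 85.0), c(77.2, 80.1, 82.4, 84.8))
```

The combination with the smallest RSS would be the one reported as optimal.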

Each item of the returned list, which contains the resulting data frame, has an attribute "quality" that is set to "bad" if none of the peaks met the cut.Area criterion, and to "good" otherwise.
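
The "quality" attribute can be collected across all curves to keep only the usable ones. The list below is a mock stand-in for meltcurve output with hypothetical sample names and values; only the attribute name is taken from the description above.

```r
# Mock result list standing in for meltcurve() output (hypothetical values).
res <- list(
  Sample1 = structure(data.frame(Tm = 80.1, Area = 0.9), quality = "good"),
  Sample2 = structure(data.frame(Tm = NA_real_, Area = NA_real_),
                      quality = "bad")
)

# Read the flag of every list item, then keep the curves flagged "good".
quality <- vapply(res, function(x) attr(x, "quality"), character(1))
good    <- res[quality == "good"]
```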

Value

A list with as many items as melting curves, named as in data, each containing a data.frame with the temperature (Temp), fluorescence values (Fluo), first derivative (dF.dT) values, (optimized) parameters of span.smooth/span.peaks, residual sum-of-squares (if Tm.opt is not NULL), identified melting points (Tm), calculated peak areas (Area) and peak baseline values (baseline).

Note

The peaks function is derived from an R-Help mailing list entry in Nov 2005 by Martin Maechler.

Author(s)

Andrej-Nikolai Spiess

Examples

## Default columns.
data(dyemelt)
res1 <- meltcurve(dyemelt, window = c(75, 86))
res1

## Selected columns and normalized fluo values.
res2 <- meltcurve(dyemelt, temps = c(1, 3), fluos = c(2, 4), 
                  window = c(75, 86), norm = TRUE)  

## Removing peaks based on peak area
## => two peaks have smaller areas and are not included.
res3 <- meltcurve(dyemelt, temps = 1, fluos = 2, window = c(75, 86),  
                  cut.Area = 0.2) 
attr(res3[[1]], "quality")
                 
## If no peak area meets the cutoff value, the melting curve is
## flagged as 'bad'.
res4 <- meltcurve(dyemelt, temps = 1, fluos = 2, window = c(75, 86),  
                  cut.Area = 0.5) 
attr(res4[[1]], "quality")

## Optimizing span and peaks values.
## Not run: 
res5 <- meltcurve(dyemelt[, 1:6], window = c(74, 88), 
                  Tm.opt = c(77.2, 80.1, 82.4, 84.8))

## End(Not run)

[Package qpcR version 1.4-1 Index]