R: Model And Plot Drop-out Events

modelDropout_gui {strvalidator}

R Documentation

Model And Plot Drop-out Events

Description

Model the probability of drop-out and plot graphs.

Usage

modelDropout_gui(
  env = parent.frame(),
  savegui = NULL,
  debug = FALSE,
  parent = NULL
)

Arguments

`env`	environment in which to search for data frames and save result.
`savegui`	logical indicating if GUI settings should be saved in the environment.
`debug`	logical indicating printing debug information.
`parent`	widget to get focus when finished.

Details

calculateDropout score drop-out events relative to a user defined LDT in four different ways: (1) by reference to the low molecular weight allele (Method1), (2) by reference to the high molecular weight allele (Method2), (3) by reference to a random allele (MethodX), and (4) by reference to the locus (MethodL). Options 1-3 are recommended by the DNA commission (see reference), while option 4 is included for experimental purposes. Options 1-3 may discard many dropout events while option 4 catches all drop-out events. On the other hand options 1-3 can score events below the LDT, while option 4 cannot, making accurate predictions possible below the LDT. This is also why the number of observed drop-out events may differ between model plots and heatmap, scatterplot, and ecdf.

Method X/1/2 records the peak height of the partner allele to be used as the explanatory variable in the logistic regression. The locus method L also do this when there has been a drop-out, if not the the mean peak height for the locus is used. Peak heights for the locus method are stored in a separate column.

Using the scored drop-out events and the peak heights of the surviving alleles the probability of drop-out can be modeled by logistic regression as described in Appendix B in reference [1]. P(dropout|H) = B0 + B1*H, where 'H' is the peak height or log(peak height). This produces a plot with the predicted probabilities for a range of peak heights. There are options to print the model parameters, mark the stochastic threshold at a specified probability of drop-out, include the underlying observations, and to calculate a specified prediction interval. A conservative estimate of the stochastic threshold can be calculated from the prediction interval: the risk of observing a drop-out probability greater than the specified threshold limit, at the conservative peak height, is less than a specified value (e.g. 1-0.95=0.05). By default the gender marker is excluded from the dataset used for modeling, and the peak height is used as explanatory variable. The logarithm of the average peak height 'H' can be used instead of the allele/locus peak height [3] (The implementation of 'H' has limitations when dropout is present. See calculateHeight). To evaluate the goodness of fit for the logistic regression the Hosmer-Lemeshow test is used [4]. A value below 0.05 indicates a poor fit. Alternatives to the logistic regression method are discussed in reference [5] and [6].

Explanation of the result: Dropout - all alleles are scored according to the limit of detection threshold (LDT). This is the observations and is not used for modeling. Rfu - peak height of the surviving allele. MethodX - a random reference allele is selected and drop-out is scored in relation to the the partner allele. Method1 - the low molecular weight allele is selected and drop-out is scored if the high molecular weight allele is missing. Method2 - the high molecular weight allele is selected and drop-out is scored if the low molecular weight allele is missing. MethodL - drop-out is scored per locus i.e. drop-out if any allele is missing. MethodL.Ph - peak height of the surviving allele if one allele has dropped out, or the average peak height if no drop-out.

Value

TRUE

References

[1] Peter Gill et.al., DNA commission of the International Society of Forensic Genetics: Recommendations on the evaluation of STR typing results that may include drop-out and/or drop-in using probabilistic methods, Forensic Science International: Genetics, Volume 6, Issue 6, December 2012, Pages 679-688, ISSN 1872-4973, 10.1016/j.fsigen.2012.06.002. doi:10.1016/j.fsigen.2012.06.002

[2] Peter Gill, Roberto Puch-Solis, James Curran, The low-template-DNA (stochastic) threshold-Its determination relative to risk analysis for national DNA databases, Forensic Science International: Genetics, Volume 3, Issue 2, March 2009, Pages 104-111, ISSN 1872-4973, 10.1016/j.fsigen.2008.11.009. doi:10.1016/j.fsigen.2008.11.009

[3] Torben Tvedebrink, Poul Svante Eriksen, Helle Smidt Mogensen, Niels Morling, Estimating the probability of allelic drop-out of STR alleles in forensic genetics, Forensic Science International: Genetics, Volume 3, Issue 4, September 2009, Pages 222-226, ISSN 1872-4973, 10.1016/j.fsigen.2009.02.002. doi:10.1016/j.fsigen.2009.02.002

[4] H. DW Jr., S. Lemeshow, Applied Logistic Regression, John Wiley & Sons, 2004.

[5] A.A. Westen, L.J.W. Grol, J. Harteveld, A.S. Matai, P. de Knijff, T. Sijen, Assessment of the stochastic threshold, back- and forward stutter filters and low template techniques for NGM, Forensic Science International: Genetetics, Volume 6, Issue 6 December 2012, Pages 708-715, ISSN 1872-4973, 10.1016/j.fsigen.2012.05.001. doi:10.1016/j.fsigen.2012.05.001

[6] R. Puch-Solis, A.J. Kirkham, P. Gill, J. Read, S. Watson, D. Drew, Practical determination of the low template DNA threshold, Forensic Science International: Genetetics, Volume 5, Issue 5, November 2011, Pages 422-427, ISSN 1872-4973, 10.1016/j.fsigen.2010.09.001. doi:10.1016/j.fsigen.2010.09.001