amPairwise {allelematch} | R Documentation |
Pairwise matching of multilocus genotypes
Description
Functions to perform a pairwise matching analysis of a multilocus genotype dataset, and review
the output in formatted text or HTML. For each genotype in the focal dataset, all genotypes in
the comparison dataset that match at or above a threshold matching score are returned. The
matching score is also known as the s-hat criterion (see the supplementary documentation). It
is determined using amMatrix.
Usage
amPairwise(
amDatasetFocal,
amDatasetComparison = amDatasetFocal,
alleleMismatch = NULL,
matchThreshold = NULL,
missingMethod = 2
)
amHTML.amPairwise(
x,
htmlFile = NULL,
htmlCSS = amCSSForHTML()
)
amCSV.amPairwise(
x,
csvFile
)
## S3 method for class 'amPairwise'
summary(
object,
html = NULL,
csv = NULL,
...
)
Arguments
amDatasetFocal |
An |
amDatasetComparison |
Optional. |
alleleMismatch |
Maximum number of mismatching alleles that will be tolerated when identifying individuals;
also known as the m-hat parameter. |
matchThreshold |
Return comparison genotypes that match the focal genotype at or above this similarity score; also known as the s-hat parameter. |
missingMethod |
Method used to determine the similarity of multilocus genotypes when data is missing. |
object , x |
An |
htmlFile |
HTML filepath to create. |
htmlCSS |
A string containing a valid cascading style sheet. |
html |
If |
csvFile , csv |
CSV filepath to create, containing a data frame representation of the pairwise matching results. |
... |
Additional arguments to |
Details
Pairwise matching of genotypes is a useful means to assess data quality and inspect for
genotyping errors.
matchThreshold represents the similarity between two multilocus genotypes and can be
thought of as a percentage similarity (or a Hamming distance between two vectors) that has
been corrected where missing data is present, such that missing data represents neither a match
nor a mismatch but a "partial" match. See amMatrix for more discussion of this metric.
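As a rough illustration of the "partial match" idea, the sketch below scores two genotype vectors by counting a matching allele as 1, a mismatch as 0, and any comparison involving missing data as 0.5, then averaging over the allele columns. The function name `toySimilarity` and the 0.5 weight are assumptions made for illustration. This is not the package's implementation, and the scoring actually applied depends on missingMethod (see amMatrix).

```r
# Toy similarity score between two multilocus genotypes (illustrative only).
# Assumption: each allele column scores 1 for a match, 0 for a mismatch,
# and 0.5 when either value is missing.
toySimilarity <- function(g1, g2, missingCode = "-99") {
  stopifnot(length(g1) == length(g2))
  missing <- g1 == missingCode | g2 == missingCode
  score <- ifelse(missing, 0.5, as.numeric(g1 == g2))
  mean(score)
}

# Hypothetical genotypes with six allele columns each
focal      <- c("101", "103", "200", "204", "-99", "150")
comparison <- c("101", "103", "200", "206", "152", "150")
toySimilarity(focal, comparison)
```

Here four columns match, one mismatches, and one involves missing data, so the toy score is (4 + 0.5) / 6 = 0.75. Under this sketch the pair would be returned with a matchThreshold of 0.7 but not with one of 0.8.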
Value
An amPairwise object, or side effects: an analysis summary written to an HTML file, to the
console, or to a CSV file.
Note
As matchThreshold is lowered, the size of the output increases rapidly. Typically,
analyses will not be very useful or manageable with thresholds below 0.7.
There is an additional side effect of html = TRUE (or of htmlFile = NULL). If
required, the operating system temporary directory where AlleleMatch stores its
temporary HTML files is cleaned up: files that match the pattern am*.html and are older than
24 hours are deleted from this temporary directory.
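The cleanup described above can be pictured with a few lines of base R. This is a minimal sketch, not the package's own code: the function name `cleanStaleHtml` is hypothetical, and it assumes that files are selected by the glob am*.html and aged by their modification time.

```r
# Illustrative cleanup of stale temporary HTML files (not the package's code).
# Assumption: files are matched by the glob am*.html and aged by modification time.
cleanStaleHtml <- function(dir = tempdir(), maxAgeHours = 24) {
  files <- list.files(dir, pattern = glob2rx("am*.html"), full.names = TRUE)
  if (length(files) == 0) return(invisible(character(0)))
  ageHours <- as.numeric(difftime(Sys.time(), file.mtime(files), units = "hours"))
  stale <- files[ageHours > maxAgeHours]
  file.remove(stale)   # removes only the files older than the cutoff
  invisible(stale)     # return the removed paths without printing them
}
```

Calling `cleanStaleHtml()` with its defaults would act on the current R session's temporary directory, so pass a scratch directory when experimenting.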
Author(s)
Paul Galpern (pgalpern@gmail.com)
References
For a complete vignette, see the Data S1 supplementary documentation and tutorials (PDF) located at <doi:10.1111/j.1755-0998.2012.03137.x>.
See Also
Examples
## Not run:
data("amExample5")
## Produce amDataset object
myDataset <-
amDataset(
amExample5,
missingCode = "-99",
indexColumn = 1,
metaDataColumn = 2,
ignoreColumn = "gender"
)
## Typical usage
myPairwise <-
amPairwise(
myDataset,
alleleMismatch = 2
)
## Display analysis as HTML in default browser
## (summary() dispatches to the amPairwise S3 method)
summary(
myPairwise,
html = TRUE
)
## Save analysis to HTML file
summary(
myPairwise,
html = "myPairwise.htm"
)
## Save analysis to CSV file
summary(
myPairwise,
csv = "myPairwise.csv"
)
## Display analysis as formatted text on the console
summary(myPairwise)
## Compare one dataset against a second
## Both must have same number of allele columns
## Here we create two datasets artificially from one for illustration purposes
myDatasetA <-
amDataset(
amExample5[sample(nrow(amExample5))[1:25], ],
missingCode = "-99",
indexColumn = 1,
ignoreColumn = 2
)
myDatasetB <-
amDataset(
amExample5[sample(nrow(amExample5))[1:100], ],
missingCode = "-99",
indexColumn = 1,
ignoreColumn = 2
)
myPairwise2 <-
amPairwise(
myDatasetA,
myDatasetB,
alleleMismatch = 3
)
summary(
myPairwise2,
html = TRUE
)
## End(Not run)