amUnique {allelematch} | R Documentation |
Identification of unique genotypes
Description
Identifies unique genotypes and generates analysis output in formatted text, HTML, or
CSV. Samples are clustered and matched based on their dissimilarity score (see
amMatrix). Also calculated is the match probability, Psib, which is the
probability that a sample is a sibling of a unique genotype (and therefore not a
replicate sample) given the allele frequencies in a population consisting of only the
unique genotypes (Wilberg & Dreher, 2004).
Usage
amUnique(
  amDatasetFocal,
  multilocusMap = NULL,
  alleleMismatch = NULL,
  matchThreshold = NULL,
  cutHeight = NULL,
  doPsib = "missing",
  consensusMethod = 1,
  verbose = TRUE
)
amHTML.amUnique(
  x,
  htmlFile = NULL,
  htmlCSS = amCSSForHTML()
)
amCSV.amUnique(
  x,
  csvFile,
  uniqueOnly = FALSE
)
## S3 method for class 'amUnique'
summary(
  object,
  html = NULL,
  csv = NULL,
  ...
)
Arguments
amDatasetFocal |
An |
multilocusMap |
Optional. |
alleleMismatch |
Optional. |
matchThreshold |
Optional. |
cutHeight |
Optional. |
doPsib |
String specifying how match probability should be calculated. |
consensusMethod |
The method (an integer) used to determine the consensus multilocus genotype from a
cluster of multilocus genotypes. |
verbose |
If |
object , x |
An |
htmlFile |
HTML filepath to create. |
htmlCSS |
A string containing a valid cascading style sheet. |
html |
If |
csvFile , csv |
CSV filepath to create containing a representation of the |
uniqueOnly |
If |
... |
Additional arguments to |
Details
Only one of alleleMismatch, cutHeight, matchThreshold can be specified, as the three
parameters are related.
alleleMismatch is the most intuitive way to understand how the identification of unique
genotypes proceeds. For example, a setting of alleleMismatch = 4 implies that up to four
alleles may differ between samples that are still treated as representatives of the same
individual. In practice, however, this value is only an approximation of the amount of
mismatch that may be tolerated. This is because the clustering process used to identify
unique genotypes, and the subsequent matching that identifies samples matching these
unique genotypes, are based on a dissimilarity metric or score (see amMatrix) that
incorporates both allele mismatches and missing data. alleleMismatch is not used directly
in analyses; it is converted to this dissimilarity metric in the following manner.
cutHeight, which is a parameter for amCluster (called from this function), is
cutHeight = alleleMismatch/(number of allele columns). matchThreshold, which is a
parameter for amPairwise (also called from this function), is
matchThreshold = 1 - cutHeight.
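The relationships described above can be sketched in base R. This is only an illustration of the stated arithmetic, not the package's internal code: resolveMismatch is a hypothetical helper that is not part of allelematch, and its requirement that exactly one of the three parameters be supplied is an assumption based on the rule given in this section.

```r
# Hypothetical helper (not part of allelematch): given exactly one of the
# three related parameters, derive the other two from the stated relationships
#   cutHeight      = alleleMismatch / (number of allele columns)
#   matchThreshold = 1 - cutHeight
resolveMismatch <- function(nAlleleColumns,
                            alleleMismatch = NULL,
                            cutHeight = NULL,
                            matchThreshold = NULL) {
  supplied <- c(!is.null(alleleMismatch),
                !is.null(cutHeight),
                !is.null(matchThreshold))
  if (sum(supplied) != 1) {
    stop("Specify exactly one of alleleMismatch, cutHeight, matchThreshold")
  }
  if (!is.null(alleleMismatch)) {
    cutHeight <- alleleMismatch / nAlleleColumns
  } else if (!is.null(matchThreshold)) {
    cutHeight <- 1 - matchThreshold
  }
  list(
    alleleMismatch = cutHeight * nAlleleColumns,
    cutHeight = cutHeight,
    matchThreshold = 1 - cutHeight
  )
}

# Example: 10 diploid loci give 20 allele columns.
# alleleMismatch = 3 corresponds to cutHeight 0.15 and matchThreshold 0.85.
p <- resolveMismatch(nAlleleColumns = 20, alleleMismatch = 3)
print(p)
```

Because missing data also contributes to the dissimilarity score, these converted values remain an approximation of the number of mismatching alleles actually tolerated.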
Selecting the appropriate value for alleleMismatch, cutHeight, or matchThreshold is an
important task. Use amUniqueProfile to assist in this process. See the Data S1
Supplementary documentation and tutorials (PDF) located at
<doi:10.1111/j.1755-0998.2012.03137.x>.
doPsib = "missing" is the default and specifies that match probability Psib should be
calculated for samples that match unique genotypes and have no allele mismatches, but may
differ by having missing data. doPsib = "all" specifies that Psib should be calculated
for all samples that match unique genotypes. In this case, if allele mismatches occur,
alleles are assumed to be missing at the mismatching loci.
multilocusMap is often not required, as amDataset objects will typically consist of
paired columns of genotypes, where each pair is a separate locus. Where the columns are
not all paired (e.g., gender is in only one column), a map vector must be specified.
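A numeric map of this kind can also be built programmatically rather than typed by hand. The sketch below is only an illustration: buildMap is a hypothetical helper, not part of allelematch, and it assumes that all single-column fields come first, followed by loci in paired columns.

```r
# Hypothetical helper (not part of allelematch): build a numeric map for
# `nSingle` single-column fields followed by `nPaired` loci in paired columns.
buildMap <- function(nSingle, nPaired) {
  c(seq_len(nSingle), rep(nSingle + seq_len(nPaired), each = 2))
}

# One single-column field (e.g., gender) followed by 4 diploid loci
mapSmall <- buildMap(1, 4)
print(mapSmall)

# Sanity check: the map needs one entry per genotype column
# (9 columns is the assumed layout for this illustration)
stopifnot(length(mapSmall) == 9)

# The same layout with 10 diploid loci needs 21 entries
mapLarge <- buildMap(1, 10)
length(mapLarge)
```

Generating the map this way avoids miscounting entries when there are many loci.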
Example: amDataset consists of gender followed by 4 diploid loci in paired columns
multilocusMap = c(1, 2, 2, 3, 3, 4, 4, 5, 5)
or equally
multilocusMap = c("GENDER", "LOC1", "LOC1", "LOC2", "LOC2", "LOC3", "LOC3", "LOC4",
"LOC4")
For more information on selecting consensusMethod see amCluster. The default
consensusMethod = 1 is typically adequate.
Value
amUnique object or side effects: analysis summary written to an HTML file or to the
console, or written to a CSV file.
Note
There is an additional side effect of html = TRUE (or of htmlFile = NULL). If required,
the operating system temporary directory where AlleleMatch temporary HTML files are
stored is cleaned up: files that match the pattern am*.html and are older than 24 hours
are deleted from this temporary directory.
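The clean-up rule in this note can be sketched in base R. This is not the package's implementation: findStaleHtml is a hypothetical helper, and the demonstration works in a scratch subdirectory (an assumption made so that no other temporary files are touched).

```r
# Hypothetical helper (not part of allelematch): list files in `dir` that
# match am*.html and were last modified more than `maxAgeHours` hours ago.
findStaleHtml <- function(dir = tempdir(), maxAgeHours = 24, now = Sys.time()) {
  files <- list.files(dir, pattern = utils::glob2rx("am*.html"), full.names = TRUE)
  ageHours <- as.numeric(difftime(now, file.mtime(files), units = "hours"))
  files[ageHours > maxAgeHours]
}

# Demonstration in a scratch directory
scratch <- file.path(tempdir(), "amCleanupDemo")
dir.create(scratch, showWarnings = FALSE)
oldFile <- file.path(scratch, "amOld.html")
newFile <- file.path(scratch, "amNew.html")
file.create(oldFile, newFile)

# Pretend the first file was written two days ago
Sys.setFileTime(oldFile, Sys.time() - 48 * 3600)

stale <- findStaleHtml(scratch)
staleNames <- basename(stale)
print(staleNames)

# Delete only the stale files; the recent file is left in place
unlink(stale)
```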
Author(s)
Paul Galpern (pgalpern@gmail.com)
References
For a complete vignette, see the Data S1 Supplementary documentation and tutorials (PDF)
located at <doi:10.1111/j.1755-0998.2012.03137.x>.
Wilberg MJ, Dreher BP (2004) GENECAP: a program for analysis of multilocus genotype data for non-invasive sampling and capture-recapture population estimation. Molecular Ecology Notes, 4, 783-785.
See Also
amDataset, amMatrix, amPairwise, amCluster, amUniqueProfile
Examples
## Not run:
data("amExample2")
## Produce amDataset object
myDataset <- amDataset(
  amExample2,
  missingCode = "-99",
  indexColumn = 1,
  ignoreColumn = 2
)
## Usage
## Optimal alleleMismatch parameter previously found using amUniqueProfile()
myUnique <- amUnique(
  myDataset,
  alleleMismatch = 3
)
## Display analysis as HTML in default browser
summary.amUnique(
  myUnique,
  html = TRUE
)
## Save analysis to HTML file
summary.amUnique(
  myUnique,
  html = "myUnique.htm"
)
## Save analysis to a CSV file
summary.amUnique(
  myUnique,
  csv = "myUnique.csv"
)
## Save unique genotypes only to a CSV file
summary.amUnique(
  myUnique,
  csv = "myUnique.csv",
  uniqueOnly = TRUE
)
## Data set with gender information
data("amExample5")
## Produce amDataset object
myDataset2 <- amDataset(
  amExample5,
  missingCode = "-99",
  indexColumn = 1,
  metaDataColumn = 2
)
## Usage
## Optimal alleleMismatch parameter previously found using amUniqueProfile()
myUniqueProfile <- amUnique(
  myDataset2,
  multilocusMap = c(1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9, 10, 10,
                    11, 11),
  alleleMismatch = 3
)
## End(Not run)