SubsetByLocus {polyRAD} | R Documentation |
Create RADdata Objects with a Subset of Loci
Description
These functions take a RADdata object as input and generate smaller RADdata
objects containing only the specified loci. SubsetByLocus allows the user to
specify which loci are kept, whereas SplitByChromosome creates multiple
RADdata objects representing chromosomes or sets of chromosomes.
RemoveMonomorphicLoci eliminates any loci with fewer than two alleles.
RemoveHighDepthLoci eliminates loci that have especially high read depth in
order to eliminate false loci originating from repetitive sequence.
RemoveUngenotypedLoci is intended for datasets that have been run through
PipelineMapping2Parents and may have some genotypes that are missing or
non-variable due to how priors were determined.
Usage
SubsetByLocus(object, ...)
## S3 method for class 'RADdata'
SubsetByLocus(object, loci, ...)
SplitByChromosome(object, ...)
## S3 method for class 'RADdata'
SplitByChromosome(object, chromlist = NULL, chromlist.use.regex = FALSE,
                  fileprefix = "splitRADdata", ...)
RemoveMonomorphicLoci(object, ...)
## S3 method for class 'RADdata'
RemoveMonomorphicLoci(object, verbose = TRUE, ...)
RemoveHighDepthLoci(object, ...)
## S3 method for class 'RADdata'
RemoveHighDepthLoci(object, max.SD.above.mean = 2, verbose = TRUE, ...)
RemoveUngenotypedLoci(object, ...)
## S3 method for class 'RADdata'
RemoveUngenotypedLoci(object, removeNonvariant = TRUE, ...)
Arguments
object |
A RADdata object. |
loci |
A character or numeric vector indicating which loci to include in the output
|
chromlist |
An optional list indicating how chromosomes should be split into separate
|
chromlist.use.regex |
If |
fileprefix |
A character string indicating the prefix of .RData files to export. |
max.SD.above.mean |
The maximum number of standard deviations above the mean read depth that a locus can be in order to be retained. |
verbose |
If |
removeNonvariant |
If |
... |
Additional arguments (none implemented). |
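The max.SD.above.mean cutoff described above can be illustrated with a small base R sketch. It assumes that depth is summarized as one total per locus and that the threshold is the mean plus max.SD.above.mean standard deviations; the exact depth summary used inside RemoveHighDepthLoci is not stated here, so this is an approximation of the idea, not the package's code. The depth values and locus names are made up.

```r
# Mock total read depth per locus (made-up values, not polyRAD output)
depth <- c(100, 120, 90, 110, 105, 95, 115, 85, 100, 1000)
names(depth) <- paste0("loc", seq_along(depth))

max.SD.above.mean <- 2  # same default as in the Usage section

# Assumed rule: retain loci at or below mean + k * SD of per-locus depth
cutoff <- mean(depth) + max.SD.above.mean * sd(depth)
keep <- names(depth)[depth <= cutoff]

cutoff
keep  # loc10, the high-depth outlier, is excluded
```

Because a single extreme locus inflates both the mean and the standard deviation, a smaller value of max.SD.above.mean gives a stricter filter.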
Details
SubsetByLocus may be useful if the user has used their own filtering criteria
to determine a set of loci to retain, and wants to create a new dataset with
only those loci. It can be used at any point in the analysis process.
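As a rough illustration of applying custom filtering criteria, the sketch below builds a vector of locus names from a mock locus table. The data frame stands in for object$locTable; the column names Chr and Pos and the locus names are assumptions made for this example, and the filtering rule itself is arbitrary.

```r
# Mock stand-in for object$locTable (column names are assumed)
locTable <- data.frame(Chr = c(1, 4, 6, 9),
                       Pos = c(1000, 2500, 300, 7200),
                       row.names = paste0("loc", 1:4))

# Arbitrary user-defined criterion: chromosomes 1 and 4, position above 500
keep <- rownames(locTable)[locTable$Chr %in% c(1, 4) & locTable$Pos > 500]
keep

# With a real RADdata object, the resulting vector could then be passed on:
# subsetRAD <- SubsetByLocus(object, keep)
```

Since the loci argument accepts either a character or a numeric vector, the same selection could also be expressed as row indices with which().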
SplitByChromosome is intended to make large datasets more manageable by
breaking them into smaller datasets that can be processed independently,
either in parallel computing jobs on a cluster, or one after another on a
computer with limited RAM. Generally it should be used immediately after
data import. Rather than returning new RADdata objects, it saves them
individually to separate workspace image files, which can then be loaded one
at a time to run analysis pipelines such as IteratePopStruct.
GetWeightedMeanGenotypes or one of the export functions can be run on each
resulting RADdata object, and the resulting matrices concatenated with cbind.
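The load-one-at-a-time pattern can be sketched in base R as follows. The helper process_split_files and its process argument are hypothetical names introduced for illustration, and the saved objects are plain mock lists, not real RADdata objects; only the object name splitRADdata is taken from this page.

```r
# Hypothetical helper: load each file, apply `process` to the stored object,
# and column-bind the resulting matrices.
process_split_files <- function(files, process, objname = "splitRADdata") {
  results <- lapply(files, function(f) {
    env <- new.env()
    load(f, envir = env)                # load into a private environment
    process(get(objname, envir = env))  # only one object in memory at a time
  })
  do.call(cbind, results)               # same rows, so bind by column
}

# Mock files: each holds a list named splitRADdata with a small matrix
mockfiles <- character(2)
for (i in 1:2) {
  splitRADdata <- list(geno = matrix(i, nrow = 3, ncol = 2,
    dimnames = list(paste0("sample", 1:3),
                    paste0("chr", i, "_allele", 1:2))))
  mockfiles[i] <- tempfile(fileext = ".RData")
  save(splitRADdata, file = mockfiles[i])
}

combined <- process_split_files(mockfiles, function(x) x$geno)
dim(combined)
```

With real output from SplitByChromosome, the process argument might be something like function(x) GetWeightedMeanGenotypes(IterateHWE(x)), assuming that pipeline suits the dataset. Loading into a separate environment avoids overwriting any splitRADdata object already in the workspace.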
SplitByChromosome, RemoveMonomorphicLoci, and RemoveHighDepthLoci use
SubsetByLocus internally.
Value
SubsetByLocus, RemoveMonomorphicLoci, RemoveHighDepthLoci, and
RemoveUngenotypedLoci return a RADdata object with all the slots and
attributes of object, but only containing the loci listed in loci, only loci
with two or more alleles, only loci without abnormally high depth, or only
loci where posterior probabilities are non-missing and variable, respectively.
SplitByChromosome returns a character vector containing file names where
.RData files have been saved. Each .RData file contains one RADdata object
named splitRADdata.
Author(s)
Lindsay V. Clark
See Also
Examples
# load a dataset for this example
data(exampleRAD)
exampleRAD
# just keep the first and fourth locus
subsetRAD <- SubsetByLocus(exampleRAD, c(1, 4))
subsetRAD
# split by groups of chromosomes
exampleRAD$locTable
tf <- tempfile()
splitfiles <- SplitByChromosome(exampleRAD, list(c(1, 4), c(6, 9)),
                                fileprefix = tf)
load(splitfiles[1])
splitRADdata
# filter out monomorphic loci (none removed in example)
filterRAD <- RemoveMonomorphicLoci(exampleRAD)
# filter out high depth loci (none removed in this example)
filterRAD2 <- RemoveHighDepthLoci(filterRAD)
# filter out loci with missing or non-variable genotypes
# (none removed in this example)
filterRAD3 <- IterateHWE(filterRAD2)
filterRAD3 <- RemoveUngenotypedLoci(filterRAD3)