QC.mppData {mppR} | R Documentation |
Quality control for mppData objects
Description
Perform different quality control (QC) operations on the marker data of an mppData object.
Usage
QC.mppData(
  mppData,
  mk.miss = 0.1,
  gen.miss = 0.25,
  n.lim = 15,
  MAF.pop.lim = 0.05,
  MAF.cr.lim = NULL,
  MAF.cr.miss = TRUE,
  MAF.cr.lim2 = NULL,
  verbose = TRUE,
  n.cores = 1
)
Arguments
mppData |
An object of class mppData. |
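mk.miss |
Maximum marker missing rate. Markers with a higher missing rate are removed (see Details). Default = 0.1. |
gen.miss |
Maximum proportion of missing markers per genotype. Genotypes exceeding it are removed (see Details). Default = 0.25. |
n.lim |
Minimum number of genotypes per cross. Crosses with fewer genotypes are removed (see Details). Default = 15. |
MAF.pop.lim |
Critical MAF at the whole population level (see Details). Default = 0.05. |
MAF.cr.lim |
Vector of critical values for the within-cross MAF. Default = NULL, in which case the values are computed from the cross sizes (see Details). |
MAF.cr.miss |
Logical value. If TRUE, marker scores of a cross whose within-cross MAF is below the limit are set to missing. If FALSE, the whole marker is discarded (see Details). Default = TRUE. |
MAF.cr.lim2 |
MAF rate at which a marker must segregate in at least one cross to be kept (strategy B in Details). Default = NULL. |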
verbose |
|
n.cores |
|
Details
The quality control operations are the following:
1. Remove markers with more than two alleles.
2. Remove markers that are monomorphic or fully missing in the parents.
3. Remove markers with a missing rate higher than mk.miss.
4. Remove genotypes with more missing markers than gen.miss.
5. Remove crosses with fewer than n.lim genotypes.
6. Keep only the most polymorphic marker when multiple markers map at the same position.
7. Check marker minor allele frequency (MAF). Different strategies can be used to control marker MAF:
A) A first possibility is to filter markers based on MAF at the whole population level using MAF.pop.lim, and/or on MAF within crosses using MAF.cr.lim. The user can supply their own vector of critical values for the within-cross MAF using MAF.cr.lim. By default, the within-cross MAF values are defined by the following function of the cross size n.ci: MAF(n.ci) = 0.5 if n.ci is in [0, 10] and MAF(n.ci) = (4.5/n.ci) + 0.05 if n.ci > 10. This means that for up to 10 genotypes the critical within-cross MAF is set to 50%. It then decreases as the number of genotypes increases, tending towards 5%. If the within-cross MAF is below the limit in at least one cross, then the marker scores of the problematic cross are either set to missing (MAF.cr.miss = TRUE) or the whole marker is discarded (MAF.cr.miss = FALSE). By default, MAF.cr.miss = TRUE, which allows a larger number of markers to be included and a wider genetic diversity to be covered.
B) An alternative is to select only markers that segregate in at least one cross at the MAF.cr.lim2 rate.
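The default within-cross MAF limit described above can be sketched in a few lines of base R. This is an illustration of the formula only, not the package's internal code: the helper names default_maf_cr_lim and toy_maf are hypothetical, and the 0/1/2 allele-dosage coding of the toy marker is an assumption made for the example, not necessarily how mppR stores marker scores.

```r
# Default critical within-cross MAF as a function of cross size n.ci,
# following the formula given in Details.
# NOTE: hypothetical helper, not a function exported by mppR.
default_maf_cr_lim <- function(n.ci) {
  ifelse(n.ci <= 10, 0.5, 4.5 / n.ci + 0.05)
}

# Limits for a few cross sizes: 0.5 up to 10 genotypes, then decreasing
cr.size <- c(5, 10, 15, 50, 100)
maf.lim <- default_maf_cr_lim(cr.size)
print(round(maf.lim, 3))

# Toy MAF for one marker within one cross.
# ASSUMPTION: scores are allele dosages coded 0/1/2.
toy_maf <- function(x) {
  p <- mean(x, na.rm = TRUE) / 2
  min(p, 1 - p)
}

# One cross of 20 genotypes in which the marker barely segregates
scores <- c(rep(0, 18), rep(2, 2))
cr.lim <- default_maf_cr_lim(length(scores))
mk.maf <- toy_maf(scores)
below <- mk.maf < cr.lim

# Analogue of MAF.cr.miss = TRUE: mask the scores of the problematic cross
# instead of discarding the marker in every cross
if (below) scores[] <- NA
```

With MAF.cr.miss = FALSE the marker would instead be dropped from all crosses, which is why the default keeps more markers in the data set.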
Value
A filtered mppData object containing the same elements as create.mppData after filtering. It also contains the following new elements:
geno.id |
|
ped.mat |
Four columns |
geno.par.clu |
Parent marker matrix without monomorphic or completely missing markers. |
haplo.map |
Genetic map corresponding to the list of marker of the
|
parents |
List of parents. |
n.cr |
Number of crosses. |
n.par |
Number of parents. |
rem.mk |
Vector of markers that have been removed. |
rem.geno |
Vector of genotypes that have been removed. |
Author(s)
Vincent Garin
See Also
Examples
library(mppR)
data(mppData_init)

# Run the quality control; the thresholds shown match the defaults in Usage
mppData <- QC.mppData(mppData = mppData_init, n.lim = 15, MAF.pop.lim = 0.05,
                      MAF.cr.miss = TRUE, mk.miss = 0.1,
                      gen.miss = 0.25, verbose = TRUE)