fixed_loci {bnpsd}    R Documentation

Identify fixed loci

Description

A locus is "fixed" if its non-missing genotypes are all 0 or all 2 (every individual with data is homozygous for the same allele). This function tests each locus, returning a vector that is TRUE for each fixed locus and FALSE otherwise. Loci with only missing elements (NA) are treated as fixed. The parameter maf_min extends the "fixed" definition to loci whose minor allele frequency is less than or equal to this value. Below, m is the number of loci and n is the number of individuals.

Usage

fixed_loci(X, maf_min = 0)

Arguments

X

The m-by-n genotype matrix (loci along rows, individuals along columns).

maf_min

The minimum minor allele frequency (default zero), to extend the working definition of "fixed" to include rare variants. Loci with minor allele frequencies less than or equal to this value are marked as fixed. Must be a scalar between 0 and 0.5.

Value

A length-m logical vector whose i-th element is TRUE if locus i is fixed or completely missing, and FALSE otherwise. If X has row names, they are copied to the names of this output vector.
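
The definition above can be sketched in base R as follows. This is only an illustration of the documented behavior, not the package's actual implementation; the name fixed_loci_sketch is hypothetical, and the sketch assumes genotypes are coded as 0, 1, 2 or NA, so that the allele frequency of a locus is the mean of its non-missing genotypes divided by 2.

```r
# Hypothetical helper, NOT the bnpsd implementation: a base R sketch of the
# documented definition of a "fixed" locus (assumes genotypes in 0, 1, 2, NA).
fixed_loci_sketch <- function(X, maf_min = 0) {
  stopifnot(
    is.matrix(X),
    length(maf_min) == 1,
    maf_min >= 0,
    maf_min <= 0.5
  )
  # allele frequency per locus over non-missing individuals
  # (NaN for loci whose entries are all NA)
  p <- rowMeans(X, na.rm = TRUE) / 2
  # minor allele frequency
  maf <- pmin(p, 1 - p)
  # fixed if the MAF is at or below the threshold;
  # completely missing loci are also treated as fixed
  fixed <- is.na(maf) | maf <= maf_min
  # copy row names (if any) to the output
  names(fixed) <- rownames(X)
  fixed
}

# small matrix with row names, to show that names are carried over
X <- matrix(
  c(
    2, 2, NA,   # fixed, with one missing element
    0, 1, 2,    # variable
    0, 0, 1,    # rare variant (MAF = 1/6)
    NA, NA, NA  # completely missing
  ),
  ncol = 3, byrow = TRUE
)
rownames(X) <- c("a", "b", "c", "d")

fixed_default <- fixed_loci_sketch(X)
fixed_rare <- fixed_loci_sketch(X, maf_min = 0.2)

# a common use of the output: keep only the loci that are not fixed
X_var <- X[!fixed_default, , drop = FALSE]
```

With the default threshold only loci "a" and "d" are marked fixed; raising maf_min to 0.2 also marks the rare variant "c".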

Examples

# here's a toy genotype matrix
X <- matrix(
       data = c(
              2, 2, NA,  # fixed locus (with one missing element)
              0, NA, 0,  # another fixed locus, for opposite allele
              1, 1, 1,   # NOT fixed (heterozygotes are not considered fixed)
              0, 1, 2,   # a completely variable locus
              0, 0, 1,   # a somewhat "rare" variant
              NA, NA, NA # completely missing locus (will be treated as fixed)
             ),
       ncol = 3, byrow = TRUE)

# test that we get the desired values
stopifnot(
  fixed_loci(X) == c(TRUE, TRUE, FALSE, FALSE, FALSE, TRUE)
)

# the "rare" variant gets marked as "fixed" if we set `maf_min` to its frequency
stopifnot(
  fixed_loci(X, maf_min = 1/6) == c(TRUE, TRUE, FALSE, FALSE, TRUE, TRUE)
)


[Package bnpsd version 1.3.13 Index]