mat_scramble {gscramble} | R Documentation |
Scramble a matrix of genotype data
Description
This function assumes that M is a matrix with L rows (number of markers) and
2 * N (N = number of individuals) columns.
There are two ways that the data might be permuted. In the first,
obtained with preserve_haplotypes = FALSE,
the position of missing data within the matrix is held constant, but all
non-missing sites within a row (i.e. all gene copies at a locus) get
scrambled amongst the samples. In the second way, obtained with
preserve_haplotypes = TRUE, just the columns are permuted. This preserves
any haplotypes present in the data. The second approach should only be
used if haplotypes have been inferred (i.e., the individuals are phased).
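The two ways of permuting described above can be sketched in base R. This is a minimal illustration only, not the gscramble implementation: the helper names scramble_within_rows and scramble_columns and the small matrix M are hypothetical and do not come from the package.

```r
# Hypothetical helpers for illustration; not the gscramble implementation.

# Way 1: shuffle the non-missing entries within each row, leaving NAs in place.
scramble_within_rows <- function(M) {
  out <- M
  for (i in seq_len(nrow(M))) {
    ok <- which(!is.na(M[i, ]))
    # index through sample.int() so a row with one non-missing value is safe
    out[i, ok] <- M[i, ok[sample.int(length(ok))]]
  }
  out
}

# Way 2: permute whole columns, so each column (haplotype) stays intact.
scramble_columns <- function(M) {
  M[, sample.int(ncol(M)), drop = FALSE]
}

set.seed(1)
# 3 markers (rows) by 4 gene copies (columns), with some missing data
M <- matrix(c("A", "C", NA,  "A",
              "G", NA,  "T", "G",
              "T", "T", "C", NA),
            nrow = 3, byrow = TRUE)

W1 <- scramble_within_rows(M)  # NA positions match those of M
W2 <- scramble_columns(M)      # every column of W2 is a column of M
```

In the first sketch the missing-data pattern of the output equals that of the input; in the second the NAs move with their columns.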
Usage
mat_scramble(
  M,
  preserve_haplotypes = FALSE,
  row_groups = NULL,
  preserve_individuals = FALSE
)
Arguments
M |
a matrix with L rows (number of markers) and 2 * N columns where N is the number of individuals. Missing data must be coded as NA |
preserve_haplotypes |
logical indicating whether haplotypes should be preserved. Set to TRUE only if the individuals are phased. |
row_groups |
if not NULL, must be a list of indexes of adjacent rows
that are all in the same group. For example, list(1:3, 4:7) places markers 1-3 in one group and markers 4-7 in another. |
preserve_individuals |
logical indicating whether the genes within each individual should stay together. |
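As a rough illustration of the shape expected for row_groups, the following sketch checks that each list element is a run of adjacent row indexes. The helper is_adjacent_row_groups is hypothetical and not part of gscramble; whether the package performs any such check itself is not stated here.

```r
# Hypothetical helper, for illustration only: TRUE when every element of
# row_groups is a run of consecutive row indexes that fits in an L-row matrix.
is_adjacent_row_groups <- function(row_groups, L) {
  is.list(row_groups) &&
    all(vapply(row_groups, function(g) {
      length(g) > 0 && all(g >= 1 & g <= L) && all(diff(g) == 1)
    }, logical(1)))
}

ok_groups  <- is_adjacent_row_groups(list(1:3, 4:7), L = 7)        # TRUE
bad_groups <- is_adjacent_row_groups(list(c(1, 3, 5), 4:7), L = 7) # FALSE: rows 1, 3, 5 are not adjacent
```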
Details
There is now an additional way of permuting: if
preserve_individuals = TRUE, then entire individuals are permuted.
If preserve_haplotypes = FALSE, then the gene copies at each locus
are randomly ordered within each individual before permuting them.
If preserve_haplotypes = TRUE, then that initial permutation is not
done. This should only be done if the individuals are phased and that
phasing is represented in how the genotypes are stored in the matrix.
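The behaviour described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: the helper scramble_individuals is hypothetical, and it assumes that columns 2i - 1 and 2i hold the two gene copies of individual i, which is how the matrix in the Examples is laid out.

```r
# Hypothetical helper for illustration; not the gscramble implementation.
# Assumes columns 2i - 1 and 2i hold the two gene copies of individual i.
scramble_individuals <- function(M, preserve_haplotypes = FALSE) {
  N <- ncol(M) / 2
  stopifnot(N == round(N))
  out <- M
  if (!preserve_haplotypes) {
    # at each locus, randomly order the two gene copies within each individual
    for (i in seq_len(N)) {
      cols <- c(2 * i - 1, 2 * i)
      flip <- sample(c(TRUE, FALSE), nrow(M), replace = TRUE)
      out[flip, cols] <- out[flip, rev(cols)]
    }
  }
  # permute entire individuals: each pair of columns moves as a unit
  new_order <- sample.int(N)
  out[, as.vector(rbind(2 * new_order - 1, 2 * new_order)), drop = FALSE]
}

set.seed(42)
# same layout as the matrix in the Examples: 4 individuals, 7 markers
Mat <- matrix(
  paste(rep(1:4, each = 7 * 2), rep(1:7, 4 * 2),
        rep(c("a", "b"), each = 7), sep = "."),
  nrow = 7
)
P <- scramble_individuals(Mat, preserve_haplotypes = TRUE)  # columns kept intact
U <- scramble_individuals(Mat)                              # copies shuffled per locus
```

In both results the two columns of a pair always carry the same individual number; only in P does each column still hold a single gene copy label ("a" or "b") across all markers.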
Value
This function returns a matrix of the same dimensions and storage.mode
as the input, M; however, the elements have been permuted according to the
options specified by the user.
Examples
# make a matrix with alleles named as I.M.g, where I is individual
# number, M is marker number, and g is either "a" or "b" depending
# on which gene copy in the diploid it is. 4 indivs and 7 markers...
Mat <- matrix(
  paste(
    rep(1:4, each = 7 * 2),
    rep(1:7, 4 * 2),
    rep(c("a", "b"), each = 7),
    sep = "."
  ),
  nrow = 7
)
# without preserving haplotypes
S1 <- mat_scramble(Mat)
# preserving haplotypes with markers 1-7 all on one chromosome
S2 <- mat_scramble(Mat, preserve_haplotypes = TRUE)
# preserving haplotypes with markers 1-3 on one chromosome and 4-7 on another
S3 <- mat_scramble(Mat, preserve_haplotypes = TRUE, row_groups = list(1:3, 4:7))
# preserving individuals, but not haplotypes, with two chromosomes
S4 <- mat_scramble(Mat, row_groups = list(1:3, 4:7), preserve_individuals = TRUE)
# preserving individuals by chromosome, but not haplotypes, with two chromosomes
S5 <- mat_scramble(Mat, row_groups = list(1:3, 4:7), preserve_individuals = "BY_CHROM")
# preserving individuals by chromosome, and preserving haplotypes, with two chromosomes
S6 <- mat_scramble(Mat, row_groups = list(1:3, 4:7),
                   preserve_individuals = "BY_CHROM", preserve_haplotypes = TRUE)