snp.recode {ASRgenomics} | R Documentation
Recodes the molecular matrix M for downstream analyses
Description
Reads molecular data in the form of bi-allelic nucleotide bases (AA,
AG, GG, CC, etc.) and recodes them as 0, 1, 2 and NA for use in
downstream analyses.
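To make the idea of recoding concrete, the sketch below counts copies of one chosen allele in each two-letter genotype call using base R only. The helper `count_allele()` is hypothetical and is not part of ASRgenomics; which allele `snp.recode()` actually counts (and therefore which homozygote receives 2) is not shown by this sketch and should be checked against the argument descriptions.

```r
# Hypothetical helper (not part of ASRgenomics): count how many copies of
# `allele` appear in each two-letter genotype call.
count_allele <- function(calls, allele) {
  first  <- substr(calls, 1, 1)   # first base of each call (NA stays NA)
  second <- substr(calls, 2, 2)   # second base of each call (NA stays NA)
  # Each comparison is TRUE/FALSE/NA; their sum is 0, 1, 2 or NA.
  as.integer((first == allele) + (second == allele))
}

calls <- c("AA", "AG", "GG", NA)
count_allele(calls, allele = "A")
# 2 1 0 NA
```

The sketch assumes every non-missing call has exactly two characters; it does no validation of malformed calls.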
Usage
snp.recode(
  M = NULL,
  map = NULL,
  marker = NULL,
  ref = NULL,
  alt = NULL,
  recoding = c("ATGCto012"),
  na.string = NA,
  rename.markers = TRUE,
  message = TRUE
)
Arguments
M |
A character matrix with SNP data of full form ( |
map |
(Optional) A data frame with the map information with |
marker |
A character indicating the name of the column in data frame |
ref |
A character indicating the name of the column in the map containing the reference allele for
recoding. If absent, then conversion will be based on the major allele (most frequent).
The marker information of a given individual with two of the specified major alleles
in |
alt |
A character indicating the name of the column in the map containing the alternative allele for
recoding. If absent, then it will be inferred from the data. The marker information of a given individual
with two of the specified alleles in |
recoding |
A character indicating the recoding option to be performed.
Currently, only the nucleotide bases (AA, AG, ...) to allele count is available ( |
na.string |
A character that is interpreted as missing values (default = |
rename.markers |
If |
message |
If |
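The ref argument falls back to the major (most frequent) allele when no reference column is supplied. One way to picture that fallback is the base-R sketch below; `major_allele()` is a hypothetical helper written for illustration, not the package's internal code, and its tie-breaking rule (first base in alphabetical order) is an assumption.

```r
# Hypothetical helper (not part of ASRgenomics): most frequent base among
# the non-missing genotype calls of a single marker.
major_allele <- function(calls) {
  calls <- calls[!is.na(calls)]                  # drop missing calls
  bases <- unlist(strsplit(calls, split = ""))   # one element per base
  if (length(bases) == 0) return(NA_character_)  # marker is entirely missing
  counts <- table(bases)                         # frequency of each base
  # which.max() returns the first maximum, so ties go to the base that
  # sorts first alphabetically (an assumption of this sketch).
  names(counts)[which.max(counts)]
}

marker <- c("AA", "AT", "TT", "AA", NA)
major_allele(marker)
# "A"  (A appears 5 times, T appears 3 times)
```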
Value
A list with the following two elements:
Mrecode: the molecular matrix M recoded to 0, 1, 2 and NA.
mapr: the data frame with the map information including reference allele and alternative allele.
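The examples in this page access the map with `$map` although the element is named mapr. Assuming the returned object is a plain list with the names given above, this works because `$` on a list matches names partially, whereas `[[` matches exactly by default. The toy list below stands in for a real result; its contents are made up for illustration.

```r
# Toy stand-in for the returned list (contents are illustrative only).
res <- list(
  Mrecode = matrix(c(2L, 1L, 0L, NA), nrow = 2),
  mapr    = data.frame(marker = c("m1", "m2"))
)

identical(res$map, res$mapr)  # TRUE: `$` partially matches "map" to "mapr"
is.null(res[["map"]])         # TRUE: `[[` requires the exact name
```

Using the full name (`res$mapr`) avoids relying on partial matching.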
Examples
# Create bi-allelic base data set.
Mnb <- matrix(c(
  "A-",  NA, "GG", "CC", "AT", "CC", "AA", "AA",
  "AAA", NA, "GG", "AC", "AT", "CG", "AA", "AT",
  "AA",  NA, "GG", "CC", "AA", "CG", "AA", "AA",
  "AA",  NA, "GG", "AA", "AA", NA,   "AA", "AA",
  "AT",  NA, "GG", "AA", "TT", "CC", "AT", "TT",
  "AA",  NA, NA,   "CC", NA,   "GG", "AA", "AA",
  "AA",  NA, NA,   "CC", "TT", "CC", "AA", "AT",
  "TT",  NA, "GG", "AA", "AA", "CC", "AA", "AA"),
  ncol = 8, byrow = TRUE,
  dimnames = list(paste0("ind", 1:8), paste0("m", 1:8)))
Mnb
# Recode without map (but map is created).
Mr <- snp.recode(M = Mnb, na.string = NA)
Mr$Mrecode
Mr$map
# Create map.
mapnb <- data.frame(
  marker = paste0("m", 1:8),
  reference = c("A", "T", "G", "C", "T", "C", "A", "T"),
  alternative = c("T", "G", "T", "A", "A", "G", "T", "A")
)
mapnb
# Recode with map without alternative allele.
Mr <- snp.recode(M = Mnb, map = mapnb, marker = "marker", ref = "reference",
                 na.string = NA, rename.markers = TRUE)
Mr$Mrecode
Mr$map
# Notice that the alternative allele is in the map as a regular variable,
# but in the names it is inferred from data (which might be 0 (missing)).
# Recode with map with alternative allele.
Mr <- snp.recode(M = Mnb, map = mapnb, marker = "marker",
                 ref = "reference", alt = "alternative",
                 na.string = NA, rename.markers = TRUE)
Mr$Mrecode
Mr$map # Now the alternative is also on the names.
# We can also recode without renaming the markers.
Mr <- snp.recode(M = Mnb, map = mapnb, marker = "marker", ref = "reference",
                 na.string = NA, rename.markers = FALSE)
Mr$Mrecode
Mr$map # The markers are not renamed here.