genotypes {corehunter} | R Documentation
Create Core Hunter genotype data from data frame, matrix or file.
Description
Specify either a data frame or matrix, or a file from which to read the genotypes. See https://www.corehunter.org for documentation and examples of the genotype data file format used by Core Hunter.
Usage
genotypes(data, alleles, file, format)
Arguments
data
Data frame or matrix containing the genotypes (individuals x markers) depending on the chosen format:
In case a data frame is provided, an optional first column
alleles
Allele names per marker (
file
File containing the genotype data.
format
Genotype data format, one of "default", "biparental" or "frequency".
Value
Genotype data of class chgeno with elements
data
Genotypes. Data frame for default format, numeric matrix for other formats.
size
Number of individuals in the dataset.
ids
Unique item identifiers (character).
names
Item names (character). Names of individuals to which no explicit name has been assigned are equal to the unique ids.
markers
Marker names (character). May contain NA values in case only some or no marker names were specified. Marker names are always included for the default and frequency formats but are optional for the biparental format.
alleles
List of character vectors with allele names per marker. Vectors may contain NA values in case only some or no allele names were specified. For biparental data the two alleles are named "0" and "1", respectively, for all markers. For the default format allele names are inferred from the provided data. Finally, for frequency data allele names are optional and may be specified either in the file or through the alleles argument when creating this type of data from a matrix or data frame.
java
Java version of the data object.
format
Genotype data format used.
file
Normalized path of file from which data was read (if applicable).
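The elements listed above can be inspected directly on the returned object. The following is a minimal sketch, not part of the original documentation: it assumes that a chgeno object behaves like an ordinary list (so that names() and $ work), and it only runs when the corehunter package and its Java backend can be loaded. The variable name documented is introduced here purely for illustration.

```r
# Element names as listed in the Value section.
documented <- c("data", "size", "ids", "names", "markers",
                "alleles", "java", "format", "file")

# Only run when corehunter (and its Java backend) can be loaded.
if (requireNamespace("corehunter", quietly = TRUE)) {
  # Small biparental dataset: 4 individuals x 3 markers, scores 0/1/2.
  geno.data <- matrix(
    c(0, 1, 2,
      2, 1, 0,
      1, 1, 1,
      0, 0, 2),
    byrow = TRUE, nrow = 4, ncol = 3,
    dimnames = list(paste("g", 1:4, sep = "-"), paste("m", 1:3, sep = "-"))
  )
  geno <- corehunter::genotypes(geno.data, format = "biparental")

  # Assumption: the object is list-like, so names() and `$` are available.
  print(class(geno))
  print(intersect(documented, names(geno)))  # documented elements present
  print(geno$size)     # number of individuals
  print(geno$ids)      # unique item identifiers
  print(geno$markers)  # marker names
  print(geno$format)   # genotype data format used
}
```

The file element is described as applying only when data was read from a file, so it may be absent for objects created from a matrix or data frame.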
Examples
## Not run:
# create from data frame or matrix
# default format
geno.data <- data.frame(
  NAME = c("Alice", "Bob", "Carol", "Dave", "Eve"),
  M1.1 = c(1,2,1,2,1),
  M1.2 = c(3,2,2,3,1),
  M2.1 = c("B","C","D","B",NA),
  M2.2 = c("B","A","D","B",NA),
  M3.1 = c("a1","a1","a2","a2","a1"),
  M3.2 = c("a1","a2","a2","a1","a1"),
  M4.1 = c(NA,"+","+","+","-"),
  M4.2 = c(NA,"-","+","-","-"),
  row.names = paste("g", 1:5, sep = "-")
)
geno <- genotypes(geno.data, format = "default")
# biparental (e.g. SNP)
geno.data <- matrix(
  sample(c(0,1,2), replace = TRUE, size = 1000),
  nrow = 10, ncol = 100
)
rownames(geno.data) <- paste("g", 1:10, sep = "-")
colnames(geno.data) <- paste("m", 1:100, sep = "-")
geno <- genotypes(geno.data, format = "biparental")
# frequencies
geno.data <- matrix(
  c(0.0, 0.3, 0.7, 0.5, 0.5, 0.0, 1.0,
    0.4, 0.0, 0.6, 0.1, 0.9, 0.0, 1.0,
    0.3, 0.3, 0.4, 1.0, 0.0, 0.6, 0.4),
  byrow = TRUE, nrow = 3, ncol = 7
)
rownames(geno.data) <- paste("g", 1:3, sep = "-")
colnames(geno.data) <- c("M1", "M1", "M1", "M2", "M2", "M3", "M3")
alleles <- c("M1-a", "M1-b", "M1-c", "M2-a", "M2-b", "M3-a", "M3-b")
geno <- genotypes(geno.data, alleles, format = "frequency")
# read from file
# default format
geno.file <- system.file("extdata", "genotypes.csv", package = "corehunter")
geno <- genotypes(file = geno.file, format = "default")
# biparental (e.g. SNP)
geno.file <- system.file("extdata", "genotypes-biparental.csv", package = "corehunter")
geno <- genotypes(file = geno.file, format = "biparental")
# frequencies
geno.file <- system.file("extdata", "genotypes-frequency.csv", package = "corehunter")
geno <- genotypes(file = geno.file, format = "frequency")
## End(Not run)
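In the frequency example above, consecutive columns that share a marker name hold the allele frequencies of that marker, and in the example data these add up to one for every individual. The base R sketch below checks that property before the matrix is passed to genotypes(). Treating "sums to one per marker" as a requirement is an assumption drawn from the example data, and the names marker.sums, freq.ok and alleles.per.marker are hypothetical helpers, not part of the package.

```r
# Frequency data from the examples: 3 individuals, 7 allele columns.
geno.data <- matrix(
  c(0.0, 0.3, 0.7, 0.5, 0.5, 0.0, 1.0,
    0.4, 0.0, 0.6, 0.1, 0.9, 0.0, 1.0,
    0.3, 0.3, 0.4, 1.0, 0.0, 0.6, 0.4),
  byrow = TRUE, nrow = 3, ncol = 7
)
rownames(geno.data) <- paste("g", 1:3, sep = "-")
colnames(geno.data) <- c("M1", "M1", "M1", "M2", "M2", "M3", "M3")

# Sum the frequencies of each individual within each marker.
# rowsum() groups rows, so transpose first to group the allele columns,
# then transpose back to get an individuals x markers matrix.
marker.sums <- t(rowsum(t(geno.data), group = colnames(geno.data)))

# Assumed check: every individual-by-marker total equals 1
# (compared with a small tolerance for floating point error).
freq.ok <- all(abs(marker.sums - 1) < 1e-8)

# Number of allele columns per marker.
alleles.per.marker <- table(colnames(geno.data))

print(marker.sums)
print(freq.ok)
print(alleles.per.marker)
```

A check of this kind can catch mislabelled columns early, since a column assigned to the wrong marker changes the totals of two markers at once.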