getgenotypesdos {Mega2R}    R Documentation
fetch dosage integer matrix for specified markers
Description
This function is a thin wrapper around a C++ function that does the actual work. It assembles the arguments the C++ function needs: some come from the caller's arguments and some from data frames stored in the "global" environment, envir. From its markers_arg argument, it gets the locus_index and the index in the unified_genotype_table. From envir, it gets a bit vector of compressed genotype information and some related bookkeeping data. Note: This function also dispatches on the type of compression used in the genotype vector; a different C++ function is called depending on whether or not the genotype data is compressed.
Usage
getgenotypesdos(markers_arg, envir = ENV)
Arguments
markers_arg
a data.frame with the following 5 observations:
envir
an environment that contains all the data frames created from the SQLite database.
Details
The unified_genotype_table contains one raw vector for each person. In the vector, there are two bits for each genotype. This function builds the output matrix one marker at a time: it fixes a marker, collects that marker's genotype information from each person's vector, and then repeats for all the remaining specified markers.
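The two-bit packing and the marker-by-marker collection can be illustrated with a minimal R sketch. This is not the package's C++ code: the helper get_code, the bit order (four genotypes per byte, lowest bits first), and the sample bytes are all assumptions made for illustration, and the mapping from a two-bit code to a dosage is not shown because this page does not specify it.

```r
# Hypothetical helper (not part of Mega2R): return the 2-bit code stored for
# marker index `i` (1-based) in one person's raw vector.
# ASSUMPTION: four genotypes per byte, lowest-order bits first.
get_code <- function(rawvec, i) {
  byte  <- as.integer(rawvec[(i - 1L) %/% 4L + 1L])  # byte holding marker i
  shift <- 2L * ((i - 1L) %% 4L)                     # bit offset inside that byte
  bitwAnd(bitwShiftR(byte, shift), 3L)               # keep only the two bits
}

# Made-up data: two people, eight markers each (two bytes per person)
people <- list(
  as.raw(c(0xE4, 0x1B)),
  as.raw(c(0x00, 0xFF))
)
markers <- c(1L, 3L, 6L)   # marker indices to fetch

# Fix a marker, collect every person's code, then repeat for the next marker.
# The result has one row per person and one column per requested marker.
codes <- sapply(markers, function(m) {
  vapply(people, get_code, integer(1), i = m)
})
print(codes)
```

Extracting by shift and mask means no person's vector has to be unpacked in full when only a few markers are requested.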
Value
a list of 3 values, named "ncol", "zero", "geno".
- geno
is a matrix of dosages as integers. The value 0 is given to the Major allele value, 1 is given to the heterozygote value, and 2 is given to the Minor allele. In the matrix, there is usually one column for each marker in the markers_arg argument. But if a column would contain only a single value, 0 or 2, for every person, that column is omitted from the matrix. There is one row for each person in the family (fam) table.
- ncol
is the number of columns actually present in the geno matrix.
- zero
is a vector with one entry per marker. The value is 0 if the marker is not in the geno matrix. Otherwise, the value is the number of the column in the geno matrix that holds the marker's data.
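As a rough illustration of how the three components fit together, the sketch below labels the columns of geno using zero. It runs on a mock list with the documented shape rather than on real Mega2R output; the data values and the marker names (rs1 to rs4) are made up, and subsetting to the first ncol columns is only a defensive assumption that is harmless if geno already has exactly ncol columns.

```r
# Mock result with the same shape as the documented return value
# (made-up data: 3 people, 4 requested markers, the third marker dropped)
res <- list(
  ncol = 3L,
  zero = c(1L, 2L, 0L, 3L),
  geno = matrix(c(0L, 1L, 2L,
                  1L, 1L, 0L,
                  2L, 0L, 1L), nrow = 3)
)
marker_names <- c("rs1", "rs2", "rs3", "rs4")  # hypothetical marker names

kept    <- res$zero > 0          # TRUE for markers that have a column in geno
dropped <- marker_names[!kept]   # markers with no column

# Keep the first ncol columns and label each one with its marker name
geno <- res$geno[, seq_len(res$ncol), drop = FALSE]
cols <- character(res$ncol)
cols[res$zero[kept]] <- marker_names[kept]
colnames(geno) <- cols

geno[, "rs4"]   # dosages for one marker, one value per person
dropped         # markers that were left out of the matrix
```

With real data, the marker names would come from the data frame passed as markers_arg, in the same order as the entries of zero.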
Examples
db = system.file("exdata", "seqsimm.db", package="Mega2R")
ENV = read.Mega2DB(db)
getgenotypesdos(ENV$markers[ENV$markers$chromosome == 1,])