snpPos {Haplin} | R Documentation |
Find the column numbers of SNP identifiers/SNP numbers in a ped file
Description
Gives the column numbers of SNP identifiers or SNP numbers in a standard ped file, calculated from the SNPs' positions in the corresponding map file. The column numbers are sorted in the same order as snp.select.
These positions may be useful when extracting a selection of SNPs from a ped file.
Usage
snpPos(snp.select, map.file, blank.lines.skip = TRUE)
Arguments
snp.select |
A character vector of the SNP identifiers (RS codes) or a numeric vector of the SNP numbers. |
map.file |
A character string giving the name and path of the standard map file to be used. See Details for a description of the standard map format. |
blank.lines.skip |
Logical. If "TRUE" (default), snpPos ignores blank lines in map.file. |
Details
To extract certain SNPs from a standard ped file, one has to know their positions in the ped file.
These positions can be obtained from the corresponding map file.
The map file should look something like this:
Chromosome SNP-identifier Base-pair-position
1 RS9629043 554636
1 RS12565286 711153
1 RS12138618 740098
Alternatively, the map file could contain four columns. The column values should then be:
Chromosome, SNP-identifier, Genetic-distance, Base-pair-position.
If the map file does not already have a header line, one must be added.
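The lookup from SNP identifier to SNP number can be sketched in base R. This is an illustration only, not Haplin's implementation; the temporary file and the variable names map and snp.numbers are assumptions made for the example.

```r
# Write a small three-column map file with a header line.
map.file <- tempfile(fileext = ".map")
writeLines(c(
  "Chromosome SNP-identifier Base-pair-position",
  "1 RS9629043 554636",
  "1 RS12565286 711153",
  "1 RS12138618 740098"
), map.file)

# Read the map file; the header line supplies the column names.
map <- read.table(map.file, header = TRUE, stringsAsFactors = FALSE,
                  blank.lines.skip = TRUE)

# The row number of a SNP in the map file is taken as its SNP number.
# The identifiers sit in the second column in both map layouts.
snp.select <- c("RS12138618", "RS9629043")
snp.numbers <- match(snp.select, map[[2]])
snp.numbers  # 3 1, in the same order as snp.select
```

Using match() keeps the result in the order of snp.select and yields NA for identifiers that are absent from the map file.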
The format of the corresponding ped file should be something like this:
1104 1104-1 1104-2 1104-3 1 2 4 1 3 2
1104 1104-2 0 0 1 1 4 1 2 2
1104 1104-3 0 0 2 1 0 0 0 0
1105 1105-1 1105-2 1105-3 2 2 1 1 2 2
1105 1105-2 0 0 1 1 1 1 2 2
1105 1105-3 0 0 2 1 1 1 3 2
The column values are: Family id, Individual id, Father's id, Mother's id, Sex (1 = male, 2 = female, alternatively: 1 = male, 0 = female), and Case-control status (1 = controls, 2 = cases, alternatively: 0 = controls, 1 = cases).
Column 7 and onwards contain the genotype data, with alleles in separate columns. A “0” is used to denote missing data.
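Given this layout, the genotype columns of a SNP follow from its SNP number: with six leading columns and two allele columns per SNP, SNP number k occupies columns 2k + 5 and 2k + 6. The helper alleleColumns below is a hypothetical name used for illustration. It is not part of Haplin, and it is not necessarily how snpPos computes or orders its result.

```r
# Hypothetical helper: ped-file columns holding the two alleles of each SNP,
# assuming six leading columns followed by two allele columns per SNP.
alleleColumns <- function(snp.numbers) {
  first <- 2 * snp.numbers + 5
  # Interleave first and second allele columns, keeping the input order.
  as.vector(rbind(first, first + 1))
}

alleleColumns(c(1, 2))  # 7 8 9 10
alleleColumns(c(3, 1))  # 11 12 7 8
```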
Value
A vector of the column numbers of the SNP identifiers/SNP numbers in the ped file, sorted in the same order as given in snp.select.
Note
The function does not check if the map file is formatted correctly or if the map and ped file have the same number of SNPs. The corresponding positions of the SNPs in the ped file may not be correct if the ped file has a different format from the given example.
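Because these checks are left to the user, it may be worth comparing the two files before relying on the returned column numbers. The base R sketch below is one possible check, not a Haplin feature. It assumes a map file with a header line and a ped file with six leading columns; the temporary files and variable names are hypothetical.

```r
# Small example files modelled on the Details section.
map.file <- tempfile(fileext = ".map")
ped.file <- tempfile(fileext = ".ped")
writeLines(c(
  "Chromosome SNP-identifier Base-pair-position",
  "1 RS9629043 554636",
  "1 RS12565286 711153"
), map.file)
writeLines(c(
  "1104 1104-1 1104-2 1104-3 1 2 4 1 3 2",
  "1104 1104-2 0 0 1 1 4 1 2 2",
  "1104 1104-3 0 0 2 1 0 0 0 0"
), ped.file)

# SNPs according to the map file: one non-blank line per SNP,
# not counting the header line.
map.lines <- readLines(map.file)
n.snps.map <- sum(nzchar(trimws(map.lines))) - 1

# SNPs according to the ped file: two allele columns per SNP after the
# six leading columns. count.fields() returns one field count per line.
n.fields <- count.fields(ped.file)
n.snps.ped <- (n.fields - 6) / 2

# Every ped line should imply the same SNP count as the map file.
files.agree <- all(n.snps.ped == n.snps.map)
files.agree
```

A result of FALSE suggests that the map and ped files do not describe the same set of SNPs, or that the ped file deviates from the layout shown in Details.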
Author(s)
Miriam Gjerdevik,
with Hakon K. Gjessing
Professor of Biostatistics
Division of Epidemiology
Norwegian Institute of Public Health
hakon.gjessing@uib.no
References
Web Site: https://haplin.bitbucket.io
See Also
Examples
## Not run:
# Find the column numbers of the SNP identifiers "RS9629043" and "RS12565286" in
# a standard ped file
snpPos(snp.select = c("RS9629043", "RS12565286"), map.file = "mygwas.map")
## End(Not run)