ghap.loadplink {GHap} | R Documentation |
Load binary PLINK data
Description
This function loads binary PLINK files (bed/bim/fam) and converts them into a native GHap.plink object.
Usage
ghap.loadplink(input.file = NULL, bed.file = NULL,
bim.file = NULL, fam.file = NULL,
ncores = 1, verbose = TRUE)
Arguments
If all input files share the same prefix, the user can use the following shortcut option:
input.file |
Prefix for input files. |
For backward compatibility, the user can still point to input files separately:
bed.file |
The binary genotype matrix (in SNP-major format). |
bim.file |
Variant map file. |
fam.file |
Pedigree (family) file. |
To turn loading progress-tracking on or off, or engage multiple cores, please use:
ncores |
A numerical value specifying the number of cores to use while loading the input files (default = 1). |
verbose |
A logical value specifying whether log messages should be printed (default = TRUE). |
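As a rough illustration of the prefix shortcut, the sketch below shows how a single prefix could map onto the three file names. The helper `expand_prefix` is hypothetical and not part of GHap; it only assumes that the standard PLINK extensions `.bed`, `.bim` and `.fam` are appended to the prefix.

```r
# Hypothetical helper (not part of GHap): expand a PLINK prefix into
# the three file names of a binary fileset.
expand_prefix <- function(prefix) {
  exts <- c(bed = ".bed", bim = ".bim", fam = ".fam")
  files <- paste0(prefix, exts)   # paste0() drops names, so restore them
  names(files) <- names(exts)
  files
}

files <- expand_prefix("example")
print(files)
```

Under that assumption, `input.file = "example"` would point to the same files as passing `bed.file`, `bim.file` and `fam.file` separately.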
Value
The returned GHap.plink object is a list with components:
nsamples |
An integer value for the sample size. |
nmarkers |
An integer value for the number of markers. |
nsamples.in |
An integer value for the number of active samples. |
nmarkers.in |
An integer value for the number of active markers. |
pop |
A character vector relating genotypes to populations. This information is obtained from the FID (1st) column in the fam file. |
id |
A character vector mapping genotypes to samples. This information is obtained from the IID (2nd) column in the fam file. |
id.in |
A logical vector indicating active samples. By default, all samples are set to TRUE. |
sire |
A character vector indicating sire names, as provided in the SID (3rd) column of the fam file. |
dam |
A character vector indicating dam names, as provided in the DID (4th) column of the fam file. |
sex |
A character vector indicating individual sex, as provided in the SEX (5th) column of the fam file. Codes are converted as follows: 0 = NA, 1 = Male and 2 = Female. |
chr |
A character vector indicating chromosome identity for each marker. |
marker |
A character vector containing marker names. |
marker.in |
A logical vector indicating active markers. By default, all markers are set to TRUE. |
cm |
A numeric vector with genetic positions for markers. This information is obtained from the third column of the bim file. If genetic positions are absent (coded as "0"), they are approximated from physical positions assuming 1 Mb ~ 1 cM. |
bp |
A numeric vector with physical positions for markers. |
A0 |
A character vector with reference alleles. For convenience, this information is obtained from the 6th column of the bim file. If "--keep-allele-order" is not used while generating the PLINK binary file, A0 will correspond to the major allele. |
A1 |
A character vector with alternative alleles. As for A0, if "--keep-allele-order" is not used, A1 will correspond to the minor allele. |
plink |
A character value giving the path to the binary genotype matrix. |
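The mapping from fam and bim columns to the components listed above can be sketched in base R. This is an illustration on toy data, not GHap's internal code: the toy tables, the column labels, and the list `obj` are made up for the example. It also assumes that A1 comes from the 5th bim column and that the 1 Mb ~ 1 cM approximation is applied when every genetic position is zero.

```r
# Toy fam table: FID, IID, SID, DID, SEX, phenotype (all read as character).
fam <- read.table(text = "
POP1 IND1 0     0    1 -9
POP1 IND2 SIRE1 DAM1 2 -9
POP2 IND3 0     0    0 -9
", colClasses = "character",
   col.names = c("FID", "IID", "SID", "DID", "SEX", "PHENO"))

# Toy bim table: chromosome, marker, cM, bp, 5th and 6th allele columns.
# Column labels A1/A0 follow this help page, not PLINK's own naming.
bim <- read.table(text = "
1 snp1 0 1500000 A G
1 snp2 0 2750000 C T
2 snp3 0 800000  G A
", colClasses = c("character", "character", "numeric",
                  "numeric", "character", "character"),
   col.names = c("CHR", "SNP", "CM", "BP", "A1", "A0"))

# Recode sex: 0 = NA, 1 = Male, 2 = Female.
sex_codes <- c("0" = NA, "1" = "Male", "2" = "Female")
sex <- unname(sex_codes[fam$SEX])

# Assumption: if all genetic positions are zero, approximate them from
# physical positions using 1 Mb ~ 1 cM.
cm <- if (all(bim$CM == 0)) bim$BP / 1e6 else bim$CM

# Plain list mirroring some of the documented components.
obj <- list(nsamples = nrow(fam), nmarkers = nrow(bim),
            pop = fam$FID, id = fam$IID,
            sire = fam$SID, dam = fam$DID, sex = sex,
            chr = bim$CHR, marker = bim$SNP,
            cm = cm, bp = bim$BP,
            A0 = bim$A0, A1 = bim$A1)
str(obj)
```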
Author(s)
Yuri Tani Utsunomiya <ytutsunomiya@gmail.com>
Examples
# #### DO NOT RUN IF NOT NECESSARY ###
#
# # Copy the example PLINK data into the current working directory
# exfiles <- ghap.makefile(dataset = "example",
# format = "plink",
# verbose = TRUE)
# file.copy(from = exfiles, to = "./")
#
# ### RUN ###
#
# # Load data using prefix
# plink <- ghap.loadplink("example")
#
# # Load data using file names
# plink <- ghap.loadplink(bed.file = "example.bed",
# bim.file = "example.bim",
# fam.file = "example.fam")