ghap.loadphase {GHap} | R Documentation |
Load binary phased genotype data
Description
This function loads binary phased genotype data and converts them into a native GHap.phase object.
Usage
ghap.loadphase(input.file = NULL,
               samples.file = NULL,
               markers.file = NULL,
               phaseb.file = NULL,
               ncores = 1, verbose = TRUE)
Arguments
If all input files share the same prefix, the user can use the following shortcut option:
input.file |
Prefix for input files. |
For backward compatibility, the user can still point to input files separately:
samples.file |
Individual information. |
markers.file |
Variant map information. |
phaseb.file |
Binary phased genotype matrix, such as supplied by the ghap.compress function. |
To turn loading progress-tracking on or off, or use multiple cores, please use:
ncores |
A numerical value specifying the number of cores to use while loading the input files (default = 1). |
verbose |
A logical value specifying whether log messages should be printed (default = TRUE). |
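The prefix shortcut can be pictured with a small base R sketch. It assumes the prefix is expanded with the extensions seen in the Examples section of this page (.samples, .markers and .phaseb); resolve_phase_files is a hypothetical helper written for illustration and is not part of GHap.

```r
# Hypothetical helper, not part of GHap: expand a prefix into the three
# file names used in the Examples section of this page.
# Assumption: the extensions are .samples, .markers and .phaseb.
resolve_phase_files <- function(prefix) {
  c(samples = paste0(prefix, ".samples"),
    markers = paste0(prefix, ".markers"),
    phaseb  = paste0(prefix, ".phaseb"))
}

files <- resolve_phase_files("example")

# Report which of the expected files are absent before attempting to load.
absent <- files[!file.exists(files)]
if (length(absent) > 0) {
  message("Not found: ", paste(absent, collapse = ", "))
}
```

Checking for the files up front is only a convenience; the three names can equally be passed one by one through samples.file, markers.file and phaseb.file.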
Value
The returned GHap.phase object is a list with components:
nsamples |
An integer value for the sample size. |
nmarkers |
An integer value for the number of markers. |
nsamples.in |
An integer value for the number of active samples. |
nmarkers.in |
An integer value for the number of active markers. |
pop |
A character vector relating chromosome alleles to populations. This information is obtained from the first column of the sample file. |
id |
A character vector mapping chromosome alleles to samples. This information is obtained from the second column of the sample file. |
id.in |
A logical vector indicating active chromosome alleles. By default, all chromosomes are set to TRUE. |
sire |
A character vector indicating sire names, as provided in the third column of the sample file (optional). |
dam |
A character vector indicating dam names, as provided in the fourth column of the sample file (optional). |
sex |
A character vector indicating individual sex, as provided in the fifth column of the sample file (optional). Codes are converted as follows: 0 = NA, 1 = Male and 2 = Female. |
chr |
A character vector indicating chromosome identity for each marker. |
marker |
A character vector containing marker names. This information is obtained from the second column of the marker map file. |
marker.in |
A logical vector indicating active markers. By default, all markers are set to TRUE. |
cm |
A numeric vector with genetic positions for markers. This information is obtained from the third column of the marker map file if it contains 6 columns. If the map file contains only 5 columns, genetic positions are considered absent and are approximated from physical positions (then assumed to be in the third column) as 1 Mb ~ 1 cM. |
bp |
A numeric vector with physical marker positions. This information is obtained from the third column of the marker map file if it contains 5 columns, or from the fourth column if it contains 6 columns. |
A0 |
A character vector with reference alleles. This information is obtained from the fourth column of the marker map file in case it contains 5 columns, or from the fifth column if it contains 6 columns. |
A1 |
A character vector with alternative alleles. This information is obtained from the last column of the marker map file. |
phase |
A character value giving the path to the binary phased genotype matrix. |
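The column rules given above for marker, cm, bp, A0 and A1 can be summarised in a short base R sketch. This is an illustration under stated assumptions, not the code GHap runs: parse_marker_map is a hypothetical helper, the map is assumed to be already read into a data frame, and taking chr from the first column is an assumption, since this page does not say which column holds it.

```r
# Hypothetical helper, not GHap's own code: derive map-related components
# from a marker map with 5 or 6 columns, following the rules on this page.
parse_marker_map <- function(map) {
  nc <- ncol(map)
  if (!nc %in% c(5, 6)) stop("expected a marker map with 5 or 6 columns")
  if (nc == 6) {
    cm <- as.numeric(map[[3]])    # genetic position supplied
    bp <- as.numeric(map[[4]])
    A0 <- as.character(map[[5]])
  } else {
    bp <- as.numeric(map[[3]])
    cm <- bp / 1e6                # approximation: 1 Mb ~ 1 cM
    A0 <- as.character(map[[4]])
  }
  list(chr    = as.character(map[[1]]),  # assumption: first column
       marker = as.character(map[[2]]),
       cm = cm, bp = bp, A0 = A0,
       A1 = as.character(map[[nc]]))     # last column
}

# Toy 5-column map (made-up values for illustration only).
map5 <- data.frame(chr = c("1", "1"), marker = c("m1", "m2"),
                   bp = c(1500000, 3000000),
                   A0 = c("A", "C"), A1 = c("G", "T"),
                   stringsAsFactors = FALSE)
parsed <- parse_marker_map(map5)
```

With the toy 5-column map, parsed$cm is derived from parsed$bp, whereas a 6-column map would supply both positions directly.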
Author(s)
Yuri Tani Utsunomiya <ytutsunomiya@gmail.com>
Marco Milanesi <marco.milanesi.mm@gmail.com>
Examples
# #### DO NOT RUN IF NOT NECESSARY ###
#
# # Copy the example data in the current working directory
# exfiles <- ghap.makefile(dataset = "example",
#                          format = "phase",
#                          verbose = TRUE)
# file.copy(from = exfiles, to = "./")
#
# ### RUN ###
#
# # Load data using prefix
# phase <- ghap.loadphase(input.file = "example")
#
# # Load data using file names
# phase <- ghap.loadphase(samples.file = "example.samples",
#                         markers.file = "example.markers",
#                         phaseb.file = "example.phaseb")