readPed {pedtools}

R Documentation

Read a pedigree from file

Description

Reads a text file in pedigree format, or something fairly close to it.

Usage

readPed(
  pedfile,
  colSep = "",
  header = NA,
  famid_col = NA,
  id_col = NA,
  fid_col = NA,
  mid_col = NA,
  sex_col = NA,
  marker_col = NA,
  locusAttributes = NULL,
  missing = 0,
  sep = NULL,
  colSkip = NULL,
  sexCodes = NULL,
  addMissingFounders = FALSE,
  validate = TRUE,
  ...
)

Arguments

`pedfile`	A file name
`colSep`	A column separator character, passed on as the `sep` argument of `read.table()`. The default is to separate on white space, that is, one or more spaces, tabs, newlines or carriage returns. (Note: the parameter `sep` is used to indicate allele separation in genotypes.)
`header`	A logical. If NA, the program interprets the first line as a header line if it contains both "id" and "sex" as part of some entries (ignoring case).
`famid_col`	Index of family ID column. If NA, the program looks for a column named "famid" (ignoring case).
`id_col`	Index of individual ID column. If NA, the program looks for a column named "id" (ignoring case).
`fid_col`	Index of father ID column. If NA, the program looks for a column named "fid" (ignoring case).
`mid_col`	Index of mother ID column. If NA, the program looks for a column named "mid" (ignoring case).
`sex_col`	Index of column with gender codes (0 = unknown; 1 = male; 2 = female). If NA, the program looks for a column named "sex" (ignoring case). If this is not found, the genders of parents are deduced from the data, and all remaining individuals are left as unknown.
`marker_col`	Index vector indicating columns with marker alleles. If NA, all columns to the right of all pedigree columns are used. If `sep` (see below) is non-NULL, each column is interpreted as a genotype column and split into separate alleles with `strsplit(..., split = sep, fixed = TRUE)`.
`locusAttributes`	Passed on to `setMarkers()` (see explanation there).
`missing`	Passed on to `setMarkers()` (see explanation there).
`sep`	Passed on to `setMarkers()` (see explanation there).
`colSkip`	Columns to skip, given as a vector of indices or column names. If given, these columns are removed directly after `read.table()`, before any other processing.
`sexCodes`	A list with optional entries "male", "female" and "unknown", indicating how non-default entries in the `sex` column should be interpreted. Default values: male = 1, female = 2, unknown = 0.
`addMissingFounders`	A logical. If TRUE, any parent not included in the `id` column is added as a founder of corresponding sex. By default, missing founders result in an error.
`validate`	A logical indicating if the pedigree structure should be validated.
`...`	Further parameters passed on to `read.table()`, e.g. `comment.char` and `quote`.
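
As a rough sketch of how some of the arguments above combine, the following writes a comma-separated file with an extra column and reads it back using `colSep` and `colSkip`, then reads a file whose parents are missing from the `id` column using `addMissingFounders`. The file contents and the column name `note` are made up for illustration.

```r
library(pedtools)

tf = tempfile()

# Comma-separated trio with an extra, non-pedigree column (hypothetical "note")
df = data.frame(id = 1:3, fid = c(0,0,1), mid = c(0,0,2), sex = c(1,2,1),
                note = c("a", "b", "c"))
write.table(df, file = tf, sep = ",", quote = FALSE, row.names = FALSE)

# Split columns on commas and drop the "note" column before further processing
x = readPed(tf, colSep = ",", colSkip = "note")

# Two siblings whose parents (1 and 2) do not appear in the `id` column
sibs = data.frame(id = 3:4, fid = c(1,1), mid = c(2,2), sex = c(1,2))
write.table(sibs, file = tf, row.names = FALSE)

# Per the argument description, omitting addMissingFounders = TRUE gives an error here
y = readPed(tf, addMissingFounders = TRUE)

unlink(tf)
```

If `colSkip` were left out in the first call, the `note` column would lie to the right of the pedigree columns and so be treated as marker data.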

Details

If there are no headers, and no column information is provided by the user, the program assumes the following column order:

1. family ID (optional; guessed from the data)
2. individual ID
3. father's ID
4. mother's ID
5. sex
6. marker data (remaining columns)
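
The column conventions above can be illustrated with a small sketch. It first reads a headerless file that follows the default order (without a family ID column), and then a headerless file whose columns are in a different order, with the positions given explicitly. The data are invented for this example.

```r
library(pedtools)

tf = tempfile()

# Headerless file in the default order: id, fid, mid, sex
trio = data.frame(id = 1:3, fid = c(0,0,1), mid = c(0,0,2), sex = c(1,2,1))
write.table(trio, file = tf, col.names = FALSE, row.names = FALSE)
x1 = readPed(tf, header = FALSE)

# Headerless file in a non-default order: id, sex, fid, mid
write.table(trio[c("id", "sex", "fid", "mid")], file = tf,
            col.names = FALSE, row.names = FALSE)

# Tell readPed() where each pedigree column is located
x2 = readPed(tf, header = FALSE, id_col = 1, sex_col = 2, fid_col = 3, mid_col = 4)

unlink(tf)
```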

Reading SNP data

Adding the argument locusAttributes = "snp-AB" sets all markers to be equifrequent SNPs with alleles A and B. The letters A and B may be replaced by other single-character letters or numbers; for example, "snp-12" gives alleles 1 and 2.
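
A minimal sketch of this shortcut, assuming a trio with a single genotype column named `M` (the data are invented for illustration):

```r
library(pedtools)

tf = tempfile()

# Trio with one SNP genotype per individual, alleles separated by "/"
trio.snp = data.frame(id = 1:3, fid = c(0,0,1), mid = c(0,0,2), sex = c(1,2,1),
                      M = c("A/A", "B/B", "A/B"))
write.table(trio.snp, file = tf, row.names = FALSE)

# Declare every marker to be a SNP with alleles A and B
x = readPed(tf, locusAttributes = "snp-AB", sep = "/")

# Inspect the allele frequencies attached to the first marker
afreq(x, marker = 1)

unlink(tf)
```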

Value

A ped object or a list of such.

Examples


tf = tempfile()

### Write and read a trio
trio = data.frame(id = 1:3, fid = c(0,0,1), mid = c(0,0,2), sex = c(1,2,1))
write.table(trio, file = tf, row.names = FALSE)
readPed(tf)

# With marker data in one column
trio.marker = cbind(trio, M = c("1/1", "2/2", "1/2"))
write.table(trio.marker, file = tf, row.names = FALSE)
readPed(tf)

# With marker data in two allele columns
trio.marker2 = cbind(trio, M.1 = c(1,2,1), M.2 = c(1,2,2))
write.table(trio.marker2, file = tf, row.names = FALSE)
readPed(tf)

### Two singletons in the same file
singles = data.frame(id = c("S1", "S2"),
                     fid = c(0,0), mid = c(0,0), sex = c(2,1),
                     M = c("9/14.2", "9/9"))
write.table(singles, file = tf, row.names = FALSE)
readPed(tf)

### Two trios in the same file
trio2 = cbind(famid = rep(c("trio1", "trio2"), each = 3), rbind(trio, trio))

# With column names
write.table(trio2, file = tf, col.names = TRUE, row.names = FALSE)
readPed(tf)

# Without column names
write.table(trio2, file = tf, col.names = FALSE, row.names = FALSE)
readPed(tf)

### With non-standard `sex` codes
trio3 = data.frame(id = 1:3, fid = c(0,0,1), mid = c(0,0,2),
                   sex = c("male","female","?"))
write.table(trio3, file = tf, row.names = FALSE)
readPed(tf, sexCodes = list(male = "male", female = "female", unknown = "?"))

# Cleanup
unlink(tf)

[Package pedtools version 2.7.0 Index]