CheckDiemFormat {diemr}    R Documentation

diem input file checker


Checks format of files with genotype data.


CheckDiemFormat(files, ChosenInds, ploidy)



files: character vector with paths to the files with genotypes.


ChosenInds: numeric vector of indices of the individuals to be included in the analysis.


ploidy: list with one element per file in files. Each element is a numeric vector giving the ploidy of every individual specified in ChosenInds.


The input file must have the genotypes of one marker for all individuals on one line. Each line must start with the letter "S" and contain only the characters "_" or "U" for unknown genotypes or a third/fourth allele, "0" for homozygotes for allele 1, "1" for heterozygotes, and "2" for homozygotes for allele 2. See the vignette with browseVignettes(package = "diemr") for an example of the input format.
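
The line format described above can be approximated with a regular expression. The following is a minimal base R sketch, not the check that CheckDiemFormat itself performs. The helper name is_valid_marker_line is hypothetical, and the sketch assumes that lines carry no whitespace and hold at least one genotype.

```r
# Hypothetical helper, not part of diemr.
# Returns TRUE for each line that starts with "S" and is followed only by
# the characters "_", "U", "0", "1" or "2" (assumes no whitespace in lines).
is_valid_marker_line <- function(lines) {
  grepl("^S[_U012]+$", lines)
}

# Made-up lines: the first follows the format, the second contains an
# unsupported character, the third lacks the leading "S".
example_lines <- c("S0120_U", "S2X10", "0120")
is_valid_marker_line(example_lines)
# [1]  TRUE FALSE FALSE
```

Use CheckDiemFormat for the real check, as it also validates ploidies and reports the offending file names and lines.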

Ploidies must be given as a list in which each element corresponds to one genomic compartment (that is, one file). For each compartment, the element is a numeric vector specifying the ploidies of all individuals chosen for the analysis.
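
As an illustration of this structure, the sketch below builds a ploidy list for two compartments. The file names and the ploidy values are hypothetical, chosen only to show the expected shape: one list element per file, each as long as the vector of chosen individuals.

```r
# Hypothetical file names for two genomic compartments (illustration only)
files <- c("autosomes.txt", "chrX.txt")
inds <- 1:6

ploidies <- list(
  rep(2, length(inds)),   # first compartment: all chosen individuals diploid
  c(2, 1, 2, 2, 2, 1)     # second compartment: assumed mix of ploidies
)

# Shape checks implied by the description above
stopifnot(
  length(ploidies) == length(files),
  all(lengths(ploidies) == length(inds))
)
```

With real files in place, these objects would be passed as CheckDiemFormat(files = files, ChosenInds = inds, ploidy = ploidies).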


Returns invisible TRUE if all files can be processed by diem. Otherwise, exits with informative error messages that specify the file names and lines with potential problems. When many lines contain problems, only the first six are reported.


# set up input genotypes file names, ploidies and selection of individual samples
inputFile <- system.file("extdata", "data6x3.txt", package = "diemr")
ploidies <- list(c(2, 1, 2, 2, 2, 1))
inds <- 1:6

# check input data
CheckDiemFormat(files = inputFile, ploidy = ploidies, ChosenInds = inds)
#  File check passed: TRUE
#  Ploidy check passed: TRUE

