QTLscan {polyqtlR}

R Documentation

General QTL function that allows for co-factors, completely randomised block designs and the possibility to derive LOD thresholds using a permutation test

Description

Function to run a QTL analysis using IBD probabilities, given (possibly replicated) phenotypes and assuming a randomised experimental design.

Usage

QTLscan(
  IBD_list,
  Phenotype.df,
  genotype.ID,
  trait.ID,
  block = NULL,
  cofactor_df = NULL,
  allelic_interaction = FALSE,
  folder = NULL,
  filename.short,
  prop_Pheno_rep = 0.5,
  perm_test = FALSE,
  N_perm.max = 1000,
  alpha = 0.05,
  gamma = 0.05,
  ncores = 1,
  log = NULL,
  verbose = TRUE,
  ...
)

Arguments

`IBD_list`	List of IBD probabilities
`Phenotype.df`	A data.frame containing phenotypic values
`genotype.ID`	The colname of `Phenotype.df` that contains the offspring identifiers (F1 names)
`trait.ID`	The colname of `Phenotype.df` that contains the response variable to use in the model
`block`	The blocking factor to be used, if any (must be a colname of `Phenotype.df`). By default `NULL`, in which case no blocking structure is assumed (i.e. for unreplicated experiments).
`cofactor_df`	A 3-column data frame of co-factor(s); column 1 gives the numeric linkage group identifier(s), column 2 specifies the cM position of the co-factor(s), column 3 specifies whether the QTL was fitted using "a" = additive effects or "f" = full allelic interactions (note that any other symbol for the full model will also be accepted, as long as it is not "a"). For backward compatibility with package versions <= 0.0.9, it is possible to just supply the first two columns, in which case an additive-effects model is assumed for each cofactor (so, a third column will be automatically filled with "a"). By default `cofactor_df = NULL`, in which case no co-factors are included in the analysis.
`allelic_interaction`	The QTL detection model can be for additive main effects only (by default `allelic_interaction = FALSE`). If `TRUE`, then the full model is used (i.e. all possible genotype combinations are included as predictors in the model). This runs the risk of overfitting, especially if double reduction was also allowed. Both types of analyses can ideally be performed and compared. Note that if IBD probabilities were estimated using the "heuristic" method rather than the HMM method (see `estimate_IBD`), then IBDs are actually haplotype probabilities rather than genotype probabilities, meaning that allelic interaction effects cannot be included in the model.
`folder`	If markers are to be used as co-factors, the path to the folder in which the imported IBD probabilities are contained can be provided here. By default this is `NULL`, which is appropriate if the files are in the working directory.
`filename.short`	If TetraOrigin was used and co-factors are being included, the shortened stem of the filename of the `.csv` files containing the output of TetraOrigin, i.e. without the tail "_LinkageGroupX_Summary.csv" which is added by default to all output of TetraOrigin.
`prop_Pheno_rep`	The minimum proportion of phenotypes represented across blocks. If an individual falls below this proportion, it is removed from the analysis. Where the data are incomplete, missing phenotypes are imputed using the mean of the recorded observations.
`perm_test`	Logical, by default `FALSE`. If `TRUE`, a permutation test will be performed to determine a genome-wide significance threshold.
`N_perm.max`	The maximum number of permutations to run if `perm_test` is `TRUE`; by default this is 1000.
`alpha`	The P-value to be used in the selection of a threshold if `perm_test` is `TRUE`, by default 0.05 (i.e. the 0.95 quantile).
`gamma`	The width of the confidence intervals used around the permutation test threshold using the approach of Nettleton & Doerge (2000), by default 0.05.
`ncores`	Number of cores to use if parallel computing is required. Works both for Windows and UNIX (using `doParallel`). Use `parallel::detectCores()` to find out how many cores you have available.
`log`	Character string specifying the log filename to which standard output should be written. If `NULL`, the log is sent to stdout.
`verbose`	Logical, by default `TRUE`. Should messages be printed during running?
`...`	Arguments passed to `plot`
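
The layout of `cofactor_df` described above can be sketched in base R as follows. This is a minimal illustration under stated assumptions: the linkage groups, cM positions and column names (`LG`, `cM`, `model`) are hypothetical, since the argument description specifies the columns by position only.

```r
# Hypothetical co-factor positions; column names are illustrative only,
# because the argument description defines the columns by position.
cofactors <- data.frame(
  LG    = c(1, 5),       # column 1: numeric linkage group identifier
  cM    = c(12.5, 40),   # column 2: co-factor position in cM
  model = c("a", "f"),   # column 3: "a" = additive, anything else = full model
  stringsAsFactors = FALSE
)

# Two-column form: per the description, an additive-effects model is then
# assumed for every co-factor.
cofactors_additive <- cofactors[, 1:2]
```

A data frame of this shape would then be supplied as `cofactor_df = cofactors` in the call to `QTLscan`.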

Value

A nested list; each list element (per linkage group) contains the following items:

QTL.res: Single matrix of QTL results with columns chromosome, position, LOD, adj.r.squared and PVE (percentage variance explained).
Perm.res: If perm_test = FALSE, this will be NULL. Otherwise, Perm.res contains a list of the results of the permutation test, with list items "quantile", "threshold" and "scores". The quantile item records which quantile of the scores was used to determine the threshold. Note that each score is the maximal LOD score across the entire genome scan for one permutation, so the threshold is genome-wide rather than chromosome-specific. If a chromosome-specific threshold is preferred, restrict the IBD_list to a single chromosome and re-run the permutation test.
Residuals: If a blocking factor or co-factors are used, this is the (named) vector of residuals used as input for the QTL scan. Otherwise, this is the set of (raw) phenotypes used in the QTL scan.
Map: Original map of genetic marker positions upon which the IBDs were based, most often used for adding a rug of marker positions to QTL plots.
LG_names: Names of the linkage groups
allelic_interaction: Whether argument allelic_interaction was TRUE or FALSE in the QTL scan
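
To illustrate how a genome-wide threshold relates to the permutation scores described under Perm.res, the sketch below takes the (1 - alpha) quantile of a vector of maximal LOD scores. The scores here are simulated stand-ins, not package output, and the plain quantile is an assumption made for illustration: it is not claimed to reproduce the package's exact computation, which also involves the confidence-interval approach of Nettleton & Doerge (2000) controlled by `gamma`.

```r
# Simulated stand-in for the "scores" item: one maximal LOD score per
# permutation (hypothetical values, not real QTLscan output).
set.seed(1)
scores <- rexp(1000, rate = 1)

alpha <- 0.05

# With alpha = 0.05 this is the 0.95 quantile of the permutation scores.
threshold <- unname(quantile(scores, probs = 1 - alpha))

# Proportion of permutations whose maximal score exceeds the threshold.
exceed <- mean(scores > threshold)
```

Because each score is a genome-wide maximum, a LOD peak above such a threshold is significant at the genome-wide level rather than for a single chromosome.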

Examples

data("IBD_4x","Phenotypes_4x")
qtl_LODs.4x <- QTLscan(IBD_list = IBD_4x,
                       Phenotype.df = Phenotypes_4x,
                       genotype.ID = "geno",
                       trait.ID = "pheno",
                       block = "year")
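
A permutation test can be requested in the same call through the `perm_test` and `N_perm.max` arguments listed under Usage. The sketch below is hedged accordingly: it only runs if polyqtlR is installed, the object name `qtl_perm.4x` and the reduced permutation count are arbitrary choices, and `str()` is used to inspect the result rather than assuming a particular nesting of the returned list.

```r
# Run only when polyqtlR is available, so the snippet is safe to source anywhere.
has_polyqtlR <- requireNamespace("polyqtlR", quietly = TRUE)

if (has_polyqtlR) {
  library(polyqtlR)
  data("IBD_4x", "Phenotypes_4x")

  qtl_perm.4x <- QTLscan(IBD_list = IBD_4x,
                         Phenotype.df = Phenotypes_4x,
                         genotype.ID = "geno",
                         trait.ID = "pheno",
                         block = "year",
                         perm_test = TRUE,
                         N_perm.max = 50)  # below the default of 1000, for speed

  # Inspect the top-level structure of the returned list.
  str(qtl_perm.4x, max.level = 1)
}
```

A small `N_perm.max` keeps the run short but gives a coarse threshold; the default of 1000 is more appropriate for a final analysis.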

[Package polyqtlR version 0.1.1 Index]