| Genetic regression {noia} | R Documentation | 
Linear and Multilinear Genetic Regressions
Description
The regression aims at estimating genetic effects from a population in which the genotypes and phenotypes are known.
Usage
linearRegression(phen, gen=NULL, genZ=NULL, 
    reference="noia", max.level=NULL, max.dom=NULL, fast=FALSE)
multilinearRegression(phen, gen=NULL, genZ=NULL, 
    reference="noia", max.level=NULL, max.dom=NULL, fast=FALSE, 
    e.unique=FALSE, start.algo = "linear", start.values=NULL, 
    robust=FALSE, bilinear.steps=1, ...)
Arguments
| phen | The vector of individual phenotypes measured in the population. | 
| gen |  The matrix of individual genotypes in the population, one column per locus. See  | 
| genZ |  The matrix of individual genotypic probabilities in the population, 3 columns per locus, corresponding of the probability of each of the 3 genotypes (the sum must be 1). Not necessary if  | 
| reference |  The reference point from which the regression is performed. By default, the  | 
| max.level | Maximum level of interactions. | 
| max.dom |  Maximum level for dominance effects. Does not have any effect if >=  | 
| fast |  This "fast" algorithm should be used when (i) the number of loci is high (> 8) and (ii) there are uncertainties in the dataset (missing values or Haley-Knott regression). This algorithm computes the regression matrix directly function, i.e. without computing  | 
| e.unique | Whether the multilinear term is the same for all pairs. | 
| start.algo |  Algorithm used to compute the starting values. Can be  | 
| start.values | Vector of starting values. | 
| robust | Tries sequentially all starting values algorithms. | 
| bilinear.steps |  Number of steps. Ignored if  | 
| ... |  Extra parameters to the non-linear regression function  | 
Details
If a gen data set is provided, it will be turned into a genZ. Missing data (unknown genotypes) are considered as loci for which genotypic probabilities are identical to the genotypic frequencies in the population. 
The algebraic framework is described extensively in Alvarez-Castro & Carlborg 2007. The default reference point ("noia") provides an orthogonal decomposition of genetic effects in the 1-locus case, whatever the genotypic frequencies. It remains a good approximation of orthogonality in the multi-locus case if linkage disequilibrium is small. Other optional reference points are those of the "G2A" model (Zeng et al. 2005), and the unweighted regression model "UWR" (Cheverud & Routman, 1995). Several key populations can be taken as reference as well: "F2", "F1", "Finf" (F infinity), and the two "parental" homozygous populations "P1" and "P2". 
The multilinear model for genetic interactions is an alternative way to model epistatic interactions between at least two loci (see Hansen & Wagner 2001). The computation of multilinear estimates requires a non-linear regression step that relies on the nls function. Providing good starting values for the non-linear regression is a key to ensure convergence, and different algorithms are provided, that can be specified by the "start.algo" option. "linear" performs a linear regression and approximates the genetic effects from it, while "multilinear" performs a simpler multilinear regression (without dominance) to initialize the genetic effects. "subset" estimate all genetic effects from a random subset (50%) of the population, and "bilinear" estimate alternatively marginal and epistatic effects. 
Value
linearRegression and multilinearRegression return an object of class "noia.linear" or "noia.multilinear", both having their own print methods: print.noia.linear and print.noia.multilinear. 
Author(s)
Arnaud Le Rouzic
References
Alvarez-Castro JM, Carlborg O. (2007). A unified model for functional and statistical epistasis and its application in quantitative trait loci analysis. Genetics 176(2):1151-1167.
Alvarez-Castro JM, Le Rouzic A, Carlborg O. (2008). How to perform meaningful estimates of genetic effects. PLoS Genetics 4(5):e1000062.
Cheverud JM, Routman, EJ. (1995). Epistasis and its contribution to genetic variance components. Genetics 139:1455-1461.
Hansen TF, Wagner G. (2001) Modeling genetic architecture: A multilinear theory of gene interactions. Theoretical Population Biology 59:61-86.
Le Rouzic A, Alvarez-Castro JM. (2008). Estimation of genetic effects and genotype-phenotype maps. Evolutionary Bioinformatics 4.
Zeng ZB, Wang T, Zou W. (2005). Modelling quantitative trait loci and interpretation of models. Genetics 169: 1711-1725.
See Also
geneticEffects, GPmap, varianceDecomposition. 
Examples
set.seed(123456789)
map <- c(0.25, -0.75, -0.75, -0.75, 2.25, 2.25, -0.75, 2.25, 2.25)
pop <- simulatePop(map, N=500, sigmaE=0.2, type="F2")
# Regressions
linear <- linearRegression(phen=pop$phen, gen=cbind(pop$Loc1, pop$Loc2))
multilinear <- multilinearRegression(phen=pop$phen, 
    gen=cbind(pop$Loc1, pop$Loc2))
# Linear effects, associated variances and stderr
linear
# Multilinear effects
multilinear