calculateEvianMLE {evian}R Documentation

Profile likelihood calculation using regression models

Description

This is the function that calculates profileLikelihood for a single SNP. The main function evian calls this function repeatedly to obtain results for multiple SNPs.

Usage

  calculateEvianMLE(snp, formula_tofit, model, data, bim, lolim, hilim,
                    m, bse, k, robust, family, plinkCC)

Arguments

snp

a string specifying the SNP of interests to be calculated.

formula_tofit

a formula object of the genetic model. The model should be formatted as y~nuisance parameters. The parameter of interest should not be included here.

model

a string specifying the mode of inheritance parameterization:
additive, dominant, recessive, or overdominance. See details.

data

data frame; read from the argument data in the main function evian. It should contain the SNP ID specified in the snp argument as a column name.

bim

data frame; read from from the argument bim in the main function evian. Provides allele information (base pair, effect/reference alleles) for the SNP of interest.

lolim

numeric; the lower limit for the grid or the minimum value of the regression parameter \beta used to calculate the likelihood function.

hilim

numeric; the upper limit for the grid or the maximum value of the regression parameter \beta used to calculate the likelihood funciton.

m

numeric; the density of the grid at which to compute the standardized likelihood function. A beta grid is defined as the grid of values for the SNP parameter used to evaluate the likelihood function.

bse

numeric; the number of beta standard errors to utilize in constraining the beta grid limits. Beta grid is evaluated at \beta +/- bse*s.e.

k

numeric or numeric vector; The strength of evidence criterion k. Reads from the input of kcutoff from the main evian function

robust

logical; if TRUE, then a robust adjustment is applied to the likelihood function to account for the cluster nature in the data. See robust_forCluster.

family

the link function for glm.

plinkCC

A boolean type that specifies how case/control are coded. case/control were coded 1/0 if it is FALSE, and were coded 2/1 if TRUE.

Details

calculateEvianMLE calculates the profile likelihood for a single SNP. A proper grid range is first established for \beta then the standardized profile likelihood is evaluated at each of the m cuts uniformly spread across the grid. Based on the standardized profile likelihood, the MLE for \beta is computed as well as the likelihood intervals for each value of k provided.
For different genetic models, their coding schemes are shown as below:

   Additive
 AA  0
 AB  1
 BB  2

   Dominant
 AA  0
 AB  1
 BB  1

   Recessive
 AA  0
 AB  0
 BB  1

 Overdominance model
        A  D
    AA  0  0
    AB  1  1
    BB  2  0

Specifically for the overdominance model, the column of interest is the D column.

Value

This function outputs a list containg 4 elements that can be directly accessed using '$' operator.

theta

numeric vector; It stores all m \beta values that used to estimate the standardized profile likelihood.

profile.lik.norm

numeric vector; the corresponding m standardized profile likelihood value at each of the \beta values in theta. If robust=TRUE, then the values will be adjusted by the robust factor.

k_cutoff

numeric vector; It specifies which k-cutoff had been used in the calculation, ordered from the smallest k to the largest k.

SummaryStats

data frame; contains the summary statistics of the profile likelihood calculation. It contains the following columns:

  • mle: the estimates for SNP effect with respect to the effective allele

  • maxlr: maximum likelihood ratio in the beta grid defined by lolim and hilim

  • AF: allele frequency for the effective allele

  • SNP: SNP ID

  • bp: base pair position from the bim input

  • effect, ref: the effective allele and the other allele from the bim input

  • robustFactor: robust factor calculated, set to 1 if robust=FALSE.

  • lo_1, hi_1, lo_2, hi_2...: the lower and upper bound of the likelihood intervals for the kth cut-off in k_cutoff.

Note

When lolim or hilim are NOT defined, then the boundaries of the beta grid will be determined by the default bse=5, or a bse defined by the user. Otherwise, the user can define the exact beta grid boundaries using lolim and hilim.

In some cases the beta grid (using bse or lolim,hilim) may need to be increased substantially (bse as large as 15) if covariates are present in the formula. This is automatically dealt by the current function, but contributes to longer computation time to find the appropriate ranges. Estimation may become inaccurate with large number of correlated covariates, which is a known limitation of profile likelihoods.

References

Strug, L. J., Hodge, S. E., Chiang, T., Pal, D. K., Corey, P. N., & Rohde, C. (2010). A pure likelihood approach to the analysis of genetic association data: an alternative to Bayesian and frequentist analysis. Eur J Hum Genet, 18(8), 933-941. doi:10.1038/ejhg.2010.47

Strug, L. J., & Hodge, S. E. (2006). An alternative foundation for the planning and evaluation of linkage analysis. I. Decoupling "error probabilities" from "measures of evidence". Hum Hered, 61(3), 166-188. doi:10.1159/000094709

Royall, R. (1997). Statistical Evidence: A Likelihood Paradigm. London, Chapman and Hall.


[Package evian version 2.1.0 Index]