R: Maximum-likelihood method

mr_maxlik {MendelianRandomization}

R Documentation

Maximum-likelihood method

Description

The mr_maxlik function implements the maximum-likelihood method introduced by Burgess et al (2013).

Usage

mr_maxlik(
  object,
  model = "default",
  correl = FALSE,
  psi = 0,
  distribution = "normal",
  alpha = 0.05,
  ...
)

## S4 method for signature 'MRInput'
mr_maxlik(
  object,
  model = "default",
  correl = FALSE,
  psi = 0,
  distribution = "normal",
  alpha = 0.05,
  ...
)

Arguments

`object`	An `MRInput` object.
`model`	What type of model should be used: `"default"`, `"random"` or `"fixed"`. The method naturally estimates a fixed-effect model, assuming that the same causal effect is estimated by each of the genetic variants. However, if there is heterogeneity in the causal estimates of the different variants, then confidence intervals under a fixed-effect model will be overly narrow. The random-effects model adds additional uncertainty by multiplying the standard error by the square-root of the likelihood ratio heterogeneity statistic divided by the number of genetic variants less one (unless this quantity is less than 1, in which case no modification to the standard error is made). This parallels the residual standard error in a regression model (the Cochran Q heterogeneity test statistic is equal to the square of the RSE multiplied by the number of genetic variants less one). The default setting (`"default"`) is to use a fixed-effect model with 3 genetic variants or fewer, and otherwise to use a random-effects model.
`correl`	If the genetic variants are correlated, then this correlation can be accounted for. The matrix of correlations between must be provided in the `MRInput` object: the elements of this matrix are the correlations between the individual variants (diagonal elements are 1).
`psi`	The correlation between the association with the exposure and the association with the outcome for each variant resulting from sample overlap.
`distribution`	The type of distribution used to calculate the confidence intervals, can be `"normal"` (the default option) or `"t-dist"`.
`alpha`	The significance level used to calculate the confidence interval. The default value is 0.05.
`...`	Additional arguments to be passed to the optimization method.

Details

A likelihood function is defined by assuming that the summarized data for each genetic variant are normally distributed. A bivariate normal distribution is assumed for the associations of each genetic variant with the exposure and with the outcome. The mean of the association with the outcome is taken as the mean association with the exposure multiplied by the causal effect parameter.

Thus, if there are K genetic variants, then K+1 parameters are estimated by the method: one for each gene–exposure association, plus the causal parameter. If the number of genetic variants is large, then maximization of this function may be an issue. If the maximum likelihood estimate substantially differs from the inverse-variance weighted estimate, this may indicate that convergence has not occurred in the optimization algorithm.

The variance-covariance matrices for the bivariate normal distributions are obtained from the standard error estimates provided. The correlation psi between genetic associations with the exposure and with the outcome due to sample overlap can be specified; its default value is zero.

Two features why this method may be preferred over the inverse-variance weighted method are the incorporation in the model of uncertainty in the genetic associations with the exposure, and of correlation between the genetic association estimates with exposure and outcome for each variant. The method is implemented both for uncorrelated and correlated genetic variants. It can also be used for a single genetic variant.

The original version of the maximum-likelihood method assumed that all genetic variants identify the same causal estimate; a fixed-effect model. The causal estimate may be overly precise if the fixed-effect model is incorrect and there is substantial heterogeneity in the causal estimates from the different variants. The random-effects analysis implemented here is an ad hoc solution to the problem of heterogeneity, but one that should result in reasonable confidence intervals that incorporate this heterogeneity.

Value

The output from the function is an MaxLik object containing:

`Model`	A character string giving the type of model used (`"fixed"`, `"random"`, or `"default"`).
`Exposure`	A character string giving the name given to the exposure.
`Outcome`	A character string giving the name given to the outcome.
`Correlation`	The matrix of genetic correlations.
`Psi`	The correlation between genetic associations with the exposure and with the outcome.
`Estimate`	The value of the causal estimate.
`StdError`	Standard error of the causal estimate.
`CILower`	The lower bound of the causal estimate based on the estimated standard error and the significance level provided.
`CIUpper`	The upper bound of the causal estimate based on the estimated standard error and the significance level provided.
`Alpha`	The significance level used when calculating the confidence intervals.
`Pvalue`	The p-value associated with the estimate (calculated as Estimate/StdError as per Wald test) using a normal or t-distribution (as specified in `distribution`).
`SNPs`	The number of genetic variants (SNPs) included in the analysis.
`RSE`	The estimated residual standard error from the regression model (always equal to 1, as a fixed-effect model is required.
`Heter.Stat`	Heterogeneity statistic (likelihood ratio statistic) and associated p-value: the null hypothesis is that all genetic variants estimate the same causal parameter; rejection of the null is an indication that one or more variants may be pleiotropic.

References

Stephen Burgess, Adam S Butterworth, Simon G Thompson. Mendelian randomization analysis with multiple genetic variants using summarized data. Genetic Epidemiology 2013; 37:658-665. doi: 10.1002/gepi.21758.

Examples

mr_maxlik(mr_input(bx = ldlc[1:10], bxse = ldlcse[1:10],
  by = chdlodds[1:10], byse = chdloddsse[1:10]))
mr_maxlik(mr_input(bx = ldlc[1:10], bxse = ldlcse[1:10],
  by = chdlodds[1:10], byse = chdloddsse[1:10]), psi=0.2)
mr_maxlik(mr_input(calcium, calciumse, fastgluc, fastglucse, corr=calc.rho))
  ## correlated variants

[Package MendelianRandomization version 0.10.0 Index]