R: Heterogeneity-penalized method

mr_hetpen {MendelianRandomization}

R Documentation

Heterogeneity-penalized method

Description

Heterogeneity-penalized model-averaging method for efficient modal-based estimation.

Usage

mr_hetpen(
  object,
  prior = 0.5,
  CIMin = -1,
  CIMax = 1,
  CIStep = 0.001,
  alpha = 0.05
)

## S4 method for signature 'MRInput'
mr_hetpen(
  object,
  prior = 0.5,
  CIMin = -1,
  CIMax = 1,
  CIStep = 0.001,
  alpha = 0.05
)

Arguments

`object`	An `MRInput` object.
`prior`	The prior probability of a genetic variant being a valid instrument (default is 0.5).
`CIMin`	The smallest value to use in the search to find the confidence interval (default is -1).
`CIMax`	The largest value to use in the search to find the confidence interval (default is +1).
`CIStep`	The step size to use in the search to find the confidence interval (default is 0.001). The confidence interval is determined by a grid search algorithm. Using the default settings, the likelihood is calculated at all values from -1 to +1 in steps of 0.001. If the range is too wide or the step size is too small, the grid search will take a long time to run.
`alpha`	The significance level used to calculate the confidence interval. The default value is 0.05.

Details

This method was developed as a more efficient version of the mode-based estimation method of Hartwig et al. It proceeds by evaluating weights for all subsets of genetic variants (excluding the null set and singletons). Subsets receive greater weight if they include more variants, but are severely downweighted if the variants in the subset have heterogeneous causal estimates. As such, the method will identify the subset with the largest number (by weight) of variants having similar causal estimates.
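The subset-weighting idea can be sketched in a few lines of R. This is a minimal illustration under stated assumptions, not the code used by `mr_hetpen`: the data are made up, the helper `score_subset` is hypothetical, and the weight formula (total inverse-variance weight of the subset multiplied by a penalty that decays with Cochran's Q) is an illustrative stand-in that ignores the `prior` argument.

```r
# Made-up per-variant associations: variants 1-3 have ratio estimates near 0.5,
# variants 4-5 have ratio estimates near 1.
bx   <- c(0.10, 0.12, 0.08, 0.09, 0.07)
by   <- c(0.050, 0.061, 0.039, 0.090, 0.069)
byse <- rep(0.01, 5)

ratio <- by / bx          # variant-specific causal estimates
ivw   <- bx^2 / byse^2    # first-order inverse-variance weights
J     <- length(bx)

# Enumerate all subsets as rows of a logical matrix, then drop the
# null set and the singletons.
subsets <- as.matrix(expand.grid(rep(list(c(FALSE, TRUE)), J)))
subsets <- subsets[rowSums(subsets) >= 2, , drop = FALSE]

# Hypothetical helper: estimate, heterogeneity and weight for one subset.
score_subset <- function(inc) {
  est <- sum(ratio[inc] * ivw[inc]) / sum(ivw[inc])  # inverse-variance weighted estimate
  Q   <- sum(ivw[inc] * (ratio[inc] - est)^2)        # Cochran's Q within the subset
  # ASSUMED illustrative weight: more (and more precise) variants increase it,
  # heterogeneity sharply decreases it. Not the package's formula.
  w   <- sum(ivw[inc]) * exp(-Q / 2)
  c(estimate = est, Q = Q, weight = w)
}

scores <- t(apply(subsets, 1, score_subset))
scores[, "weight"] <- scores[, "weight"] / sum(scores[, "weight"])  # normalise to sum to 1

best <- which.max(scores[, "weight"])
subsets[best, ]   # membership of the highest-weighted subset
scores[best, ]    # its estimate, heterogeneity and normalised weight
```

Under this stand-in weighting, subsets that mix the two groups of variants receive almost no weight because their Q statistic is large, while a homogeneous subset gains weight with each variant it contains.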

Confidence intervals are evaluated by calculating a log-likelihood function, and finding all points within a given vertical distance of the maximum of the log-likelihood function (which is the causal estimate). As such, if the log-likelihood function is multimodal, then the confidence interval may include multiple disjoint ranges. This may indicate the presence of multiple causal mechanisms by which the exposure may influence the outcome with different magnitudes of causal effect. As the confidence interval is determined by a grid search, care must be taken when choosing the minimum (`CIMin`) and maximum (`CIMax`) values in the search, as well as the step size (`CIStep`). The default values will not be suitable for all applications.
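The grid search can be sketched as follows. This is an illustration only: the bimodal `loglik` function is made up, and the vertical distance is assumed to be the usual likelihood-ratio cutoff `qchisq(1 - alpha, df = 1) / 2`, which may differ from what `mr_hetpen` uses internally.

```r
# Hypothetical bimodal log-likelihood, standing in for the one the method builds.
loglik <- function(theta) {
  log(0.6 * dnorm(theta, mean = 0.2, sd = 0.05) +
      0.4 * dnorm(theta, mean = 0.8, sd = 0.05))
}

CIMin <- -1; CIMax <- 1; CIStep <- 0.001; alpha <- 0.05

grid <- seq(CIMin, CIMax, by = CIStep)
ll   <- loglik(grid)

# The point estimate is the grid value that maximises the log-likelihood.
estimate <- grid[which.max(ll)]

# ASSUMED cutoff: keep every grid point whose log-likelihood lies within
# qchisq(1 - alpha, 1) / 2 of the maximum.
cutoff <- max(ll) - qchisq(1 - alpha, df = 1) / 2
inside <- ll >= cutoff

# Split the retained points into runs of consecutive grid values;
# each run of TRUE is one range of the confidence interval.
runs   <- rle(inside)
ends   <- cumsum(runs$lengths)
starts <- ends - runs$lengths + 1
CILower <- grid[starts[runs$values]]
CIUpper <- grid[ends[runs$values]]

data.frame(lower = CILower, upper = CIUpper)
```

With this made-up likelihood the search returns two disjoint ranges, one around each mode. If a range ends at `CIMin` or `CIMax`, the search limits were too narrow and should be widened.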

The method should give consistent estimates as the sample size increases if a weighted plurality of the genetic variants are valid instruments. That is, among all groups of variants that share the same causal estimate in the asymptotic limit, the largest group (by weight) must be the group of valid instruments.

The current implementation of the method evaluates a weight and an estimate for each of the subsets of genetic variants. This means that the computational cost roughly doubles for each additional genetic variant included in the analysis. Currently, the method gives a warning message when used with 25 or more variants, and fails to run with 30 or more.
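A quick calculation shows how fast the number of subsets grows. The helper `n_subsets` is hypothetical and assumes, following the description above, that every subset except the null set and the singletons is evaluated.

```r
# Number of subsets of J variants, excluding the empty set and the J singletons.
n_subsets <- function(J) 2^J - J - 1

J <- c(10, 20, 25, 30)
data.frame(variants = J, subsets = n_subsets(J))

# Ratio of successive counts: close to 2 once J is moderately large.
n_subsets(26) / n_subsets(25)
```

Ten variants give 1013 subsets, whereas 25 variants already give more than 33 million and 30 variants more than a billion, which is consistent with the warning and failure thresholds described above.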

Value

The output from the function is an MRHetPen object containing:

`Exposure`	A character string giving the name given to the exposure.
`Outcome`	A character string giving the name given to the outcome.
`Prior`	The prior probability of a genetic variant being a valid instrument.
`Estimate`	The value of the causal estimate.
`CIRange`	The range of values in the confidence interval based on a grid search between the minimum and maximum values for the causal effect provided.
`CILower`	The lower limit of the confidence interval. If the confidence interval contains multiple ranges, then lower limits of all ranges will be reported.
`CIUpper`	The upper limit of the confidence interval. If the confidence interval contains multiple ranges, then upper limits of all ranges will be reported.
`CIMin`	The smallest value used in the search to find the confidence interval.
`CIMax`	The largest value used in the search to find the confidence interval.
`CIStep`	The step size used in the search to find the confidence interval.
`Alpha`	The significance level used when calculating the confidence intervals.
`SNPs`	The number of genetic variants (SNPs) included in the analysis.
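As a sketch of how the returned components might be inspected, the snippet below assumes that they are exposed as S4 slots with the names listed above and are read with `@`. It reuses the data from the Examples section and only runs if the package is installed.

```r
if (requireNamespace("MendelianRandomization", quietly = TRUE)) {
  library(MendelianRandomization)

  res <- mr_hetpen(mr_input(bx = ldlc[1:10], bxse = ldlcse[1:10],
                            by = chdlodds[1:10], byse = chdloddsse[1:10]),
                   CIMin = -1, CIMax = 5, CIStep = 0.01)

  # Point estimate (assumes a slot named Estimate, as listed above).
  print(res@Estimate)

  # One row per range; more than one row means a disjoint confidence interval.
  print(data.frame(lower = res@CILower, upper = res@CIUpper))

  # TRUE suggests the search limits were too narrow and should be widened.
  print(min(res@CILower) <= res@CIMin || max(res@CIUpper) >= res@CIMax)
}
```

Checking whether the interval touches `CIMin` or `CIMax` is a simple way to spot search limits that truncate the confidence interval.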

References

Stephen Burgess, Verena Zuber, Apostolos Gkatzionis, Christopher N Foley. Improving on a modal-based estimation method: model averaging for consistent and efficient estimation in Mendelian randomization when a plurality of candidate instruments are valid. bioRxiv 2017. doi: 10.1101/175372.

Examples

mr_hetpen(mr_input(bx = ldlc[1:10], bxse = ldlcse[1:10], by = chdlodds[1:10],
   byse = chdloddsse[1:10]), CIMin = -1, CIMax = 5, CIStep = 0.01)

[Package MendelianRandomization version 0.10.0 Index]