R: Fit Mixed-MDMR models

mixed.mdmr {MDMR}

R Documentation

Fit Mixed-MDMR models

Description

mixed.mdmr allows users to conduct multivariate distance matrix regression (MDMR) in the context of a (hierarchically) clustered sample without inflating Type-I error rates as a result of the violation of the independence assumption. This is done by invoking a mixed-effects modeling framework, in which clustering/grouping variables are specified as random effects and the covariate effects of interest are fixed effects. The input to mixed.mdmr largely mirrors that of the lmer function from the package lme4 insofar as the specification of random and fixed effects is concerned (see Arguments for details). Note that this function simply controls for the random effects in order to test the fixed effects; it does not facilitate point estimation or inference on the random effects.

Usage

mixed.mdmr(fmla, data, D = NULL, G = NULL, use.ssd = 1,
  start.acc = 1e-20, ncores = 1)

Arguments

`fmla`	A one-sided linear formula object describing both the fixed-effects and random-effects parts of the model, beginning with an ~ operator, which is followed by the terms to include in the model, separated by + operators. Random-effects terms are distinguished by vertical bars (|) separating expressions for design matrices from grouping factors. Two vertical bars (||) can be used to specify multiple uncorrelated random effects for the same grouping variable.
`data`	A mandatory data frame containing the variables named in `fmla`.
`D`	Distance matrix computed on the outcome data. Can be either a matrix or an R `dist` object. Either `D` or `G` must be passed to `mixed.mdmr()`.
`G`	Gower's centered similarity matrix computed from `D`. Either `D` or `G` must be passed to `mixed.mdmr()`.
`use.ssd`	The proportion of the total sum of squared distances (SSD) that will be targeted in the modeling process. In the case of non-Euclidean distances, specifying `use.ssd` to be slightly smaller than 1.00 (e.g., 0.99) can substantially lower the computational burden of `mixed.mdmr` while maintaining well-controlled Type-I error rates and only sacrificing a trivial amount of power. In the case of Euclidean distances the computational burden of `mixed.mdmr` is small, so `use.ssd` should be set to 1.00.
`start.acc`	Starting accuracy of the Davies (1980) algorithm implemented in the `davies` function in the `CompQuadForm` package (Duchesne & De Micheaux, 2010) that `mixed.mdmr()` uses to compute MDMR p-values.
`ncores`	Integer; if `ncores` > 1, the `parallel` package is used to speed computation. Note: Windows users must set `ncores = 1` because the `parallel` package relies on forking. See `mc.cores` in the `mclapply` function in the `parallel` package for more details.
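
The relationship between `D` and `G` can be illustrated with a short base-R sketch. It assumes the standard definition of Gower's centered matrix, G = -1/2 C D^2 C with C = I - 11'/n, where D^2 holds the squared distances; the toy data `Y` and the helper matrix `C` are made up for illustration, and this is not the package's own code.

```r
# Toy outcome data: 6 observations on 3 variables (illustrative only)
set.seed(1)
Y <- matrix(rnorm(18), nrow = 6)

# Euclidean distance matrix, converted from a dist object to a plain matrix
D <- as.matrix(dist(Y))
n <- nrow(D)

# Centering matrix C = I - 11'/n
C <- diag(n) - matrix(1 / n, n, n)

# Gower's centered matrix: double-center the squared distances
G <- -0.5 * C %*% (D^2) %*% C

# The trace of G is the total sum of squared distances divided by n
ssd.total <- sum(D[upper.tri(D)]^2) / n
```

For Euclidean distances, `G` computed this way equals the cross-product of the column-centered outcome matrix, which is one way to sanity-check a hand-computed `G` before passing it in place of `D`.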

Value

An object with six elements and a summary function. Calling summary(res), where res is the object returned by mixed.mdmr, produces a data frame with the following columns:

`Statistic`	Value of the corresponding MDMR test statistic
`Numer DF`	Numerator degrees of freedom for the corresponding effect
`p-value`	The p-value for the corresponding effect

In addition to the information in the three columns comprising summary(res), the res object also contains:

p.prec

A data.frame reporting the precision of each p-value. If analytic p-values were computed, these are the maximum error bound of the p-values reported by the davies function in CompQuadForm. If permutation p-values were computed, it is the standard error of each permutation p-value.

Note that the printed output of summary(res) will truncate p-values to the smallest trustworthy values, but the object returned by summary(res) will contain the p-values as computed. The reason for this truncation differs for analytic and permutation p-values. For an analytic p-value, if the error bound of the Davies algorithm is larger than the p-value, the only conclusion that can be drawn with certainty is that the p-value is smaller than (or equal to) the error bound.
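
The truncation rule for analytic p-values can be sketched in base R as follows. The numbers and the names `p.raw`, `p.bound`, `below.bound`, and `p.report` are hypothetical and chosen only for illustration; the sketch mirrors the rule described above rather than the package's printing code.

```r
# Hypothetical computed p-values and Davies error bounds for two effects
p.raw   <- c(x1 = 3e-04, x2 = 2e-12)
p.bound <- c(x1 = 1e-10, x2 = 1e-10)

# A p-value below its error bound cannot be trusted at face value
below.bound <- p.raw < p.bound

# Smallest trustworthy value: the p-value itself, or the bound if larger
p.report <- pmax(p.raw, p.bound)

# Printed form: flag truncated entries as "<= bound"
p.print <- ifelse(below.bound,
                  paste("<=", format(p.bound)),
                  format(p.raw))
```

Here the effect `x2` would be reported as no larger than its error bound, while `x1` is reported as computed.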

Author(s)

Daniel B. McArtor (dmcartor@gmail.com) [aut, cre]

References

Davies, R. B. (1980). The Distribution of a Linear Combination of chi-square Random Variables. Journal of the Royal Statistical Society. Series C (Applied Statistics), 29(3), 323-333.

Duchesne, P., & De Micheaux, P. L. (2010). Computing the distribution of quadratic forms: Further comparisons between the Liu-Tang-Zhang approximation and exact methods. Computational Statistics and Data Analysis, 54(4), 858-862.

McArtor, D. B. (2017). Extending a distance-based approach to multivariate multiple regression (Doctoral Dissertation).

Examples

# Load the package and the example data
library(MDMR)
data("clustmdmrdata")

# Get distance matrix
D <- dist(Y.clust)

# Regular MDMR without the grouping variable
mdmr.res <- mdmr(X = X.clust[,1:2], D = D, perm.p = FALSE)

# Results look significant
summary(mdmr.res)

# Account for grouping variable
mixed.res <- mixed.mdmr(~ x1 + x2 + (x1 + x2 | grp),
                        data = X.clust, D = D)

# Significance was due to the grouping variable
summary(mixed.res)

[Package MDMR version 0.5.1 Index]