mvlm {MVLM}

R Documentation

Conduct multivariate multiple regression and MANOVA with analytic p-values

Description

mvlm is used to fit linear models with a multivariate outcome. It uses the asymptotic null distribution of the multivariate linear model test statistic to compute p-values (McArtor et al., under review). It therefore alleviates the need to use approximate p-values based on Wilks' Lambda, Pillai's Trace, the Hotelling-Lawley Trace, and Roy's Greatest Root.

Usage

mvlm(formula, data, n.cores = 1, start.acc = 1e-20,
  contr.factor = "contr.sum", contr.ordered = "contr.poly")

Arguments

`formula`	An object of class `formula` where the outcome (e.g. the Y in the formula Y ~ x1 + x2) is an `n x q` matrix, where `q` is the number of outcome variables being regressed onto the set of predictors included in the formula.
`data`	Mandatory `data.frame` containing all of the predictors passed to `formula`.
`n.cores`	Number of cores to use in parallelization through the `parallel` package.
`start.acc`	Starting accuracy of the Davies (1980) algorithm implemented in the `davies` function in the `CompQuadForm` package (Duchesne & De Micheaux, 2010) that `mvlm` uses to compute multivariate linear model p-values.
`contr.factor`	The type of contrasts used to test unordered categorical variables that have type `factor`. Must be a string taking one of the following values: `("contr.sum", "contr.treatment", "contr.helmert")`.
`contr.ordered`	The type of contrasts used to test ordered categorical variables that have type `ordered`. Must be a string taking one of the following values: `("contr.poly", "contr.sum", "contr.treatment", "contr.helmert")`.
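
The strings accepted by `contr.factor` and `contr.ordered` match the names of contrast functions in base R's `stats` package. As a rough illustration of what each option encodes, the sketch below builds the coding matrix for a factor with three levels. The variable names `k`, `contrast.names`, and `codings` are made up for this example, and nothing here calls `mvlm` itself.

```r
k <- 3  # number of factor levels in this illustration

# Each allowed string names a contrast function from the stats package.
contrast.names <- c("contr.sum", "contr.treatment", "contr.helmert", "contr.poly")

# Build the k x (k - 1) coding matrix for each option.
codings <- lapply(contrast.names, function(f) do.call(f, list(k)))
names(codings) <- contrast.names

# Inspect one coding and confirm that every option yields k - 1 columns.
print(codings[["contr.sum"]])
print(sapply(codings, dim))
```

Every option produces k - 1 columns, which is why a factor with k levels is tested with k - 1 degrees of freedom regardless of the contrast type chosen.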

Details

Importantly, the outcome of formula must be a matrix, and the object passed to data must be a data frame containing all of the variables that are named as predictors in formula.
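
A minimal sketch of preparing inputs in this shape, using simulated data. The column names `y1`, `y2`, `x1`, and `g` are hypothetical, and the fit is only attempted when the MVLM package is installed.

```r
# Hypothetical toy data: two outcome columns and two predictors.
set.seed(1)
n <- 30
dat <- data.frame(
  y1 = rnorm(n),
  y2 = rnorm(n),
  x1 = rnorm(n),
  g  = factor(rep(c("a", "b"), length.out = n))
)

# The outcome must be an n x q matrix (here q = 2), not a data frame.
Y <- as.matrix(dat[, c("y1", "y2")])

# The predictors stay in a data frame that is passed as `data`.
X <- dat[, c("x1", "g")]

# Only attempt the fit when the MVLM package is available.
if (requireNamespace("MVLM", quietly = TRUE)) {
  res <- MVLM::mvlm(Y ~ x1 + g, data = X)
  print(summary(res))
}
```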

The conditional effects of variables of type factor or ordered in data are computed based on the type of contrasts specified by contr.factor and contr.ordered. If data contains an (ordered or unordered) factor with k levels, a k-1 degree of freedom test will be conducted corresponding to that factor and the specified contrast structure. If, instead, the user wants to assess the k-1 separate single DF tests that comprise this omnibus effect (similar to the approach taken by lm), then the appropriate model matrix should be formed in advance and passed to mvlm directly in the data parameter. See the package vignette for an example by calling vignette('mvlm-vignette').
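
One way to form such a model matrix in advance is with base R's `model.matrix`. The sketch below is an illustration under assumed data, not the vignette's own code: the predictors `x1` and `g` and the object `X.expanded` are hypothetical names.

```r
# Hypothetical predictors: one continuous variable and a 3-level factor.
X <- data.frame(
  x1 = c(0.2, 1.5, -0.7, 0.9, 1.1, -1.3),
  g  = factor(c("a", "b", "c", "a", "b", "c"))
)

# Expand the factor into its k - 1 = 2 sum-to-zero contrast columns.
mm <- model.matrix(~ x1 + g, data = X, contrasts.arg = list(g = "contr.sum"))

# Drop the intercept column; the model formula supplies its own intercept.
X.expanded <- as.data.frame(mm[, -1, drop = FALSE])

# Each contrast is now an ordinary numeric column.
print(names(X.expanded))
```

Because every column of `X.expanded` is numeric, a formula such as `Y ~ x1 + g1 + g2` with `data = X.expanded` should yield one single DF test per contrast column. That expectation follows from the description above and was not checked against the package here.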

Value

An object with nine elements and a summary function. Calling summary(mvlm.res) produces a data frame with the following columns:

`Statistic`	Value of the corresponding test statistic.
`Numer DF`	Numerator degrees of freedom for each test statistic.
`Pseudo R2`	Size of the corresponding (omnibus or conditional) effect on the multivariate outcome. Note that the intercept term does not have an estimated effect size.
`p-value`	The p-value for each (omnibus or conditional) effect.

In addition to the information in the four columns comprising summary(mvlm.res), the mvlm.res object also contains:

`p.prec`	A data.frame reporting the precision of each p-value. These are the maximum error bound of the p-values reported by the `davies` function in `CompQuadForm`.
`y.rsq`	A matrix containing in its first row the overall variance explained by the model for each variable comprising Y (columns). The remaining rows list the variance of each outcome that is explained by the conditional effect of each predictor.
`beta.hat`	Estimated regression coefficients.
`adj.n`	Adjusted sample size used to determine whether or not the asymptotic properties of the model are likely to hold. See McArtor et al. (under review) for more detail.
`data`	Original input data and the `model.matrix` used to fit the model.
`formula`	The formula passed to `mvlm`.
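
To give a sense of what per-outcome variance explained looks like, the sketch below computes the usual R-squared for each column of a matrix outcome with base R's `lm`. It assumes that the first row of `y.rsq` corresponds to this ordinary definition, which is not confirmed here; the data and the names `ss.res`, `ss.tot`, and `r.squared` are hypothetical.

```r
# Simulated data: two predictors and a two-column outcome matrix.
set.seed(42)
n <- 50
X <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
Y <- cbind(y1 = 0.8 * X$x1 + rnorm(n), y2 = 0.5 * X$x2 + rnorm(n))

# Ordinary least squares with a matrix outcome fits every column at once.
fit <- lm(Y ~ x1 + x2, data = X)

# R-squared per outcome: 1 - residual SS / total SS about the column mean.
ss.res <- colSums(residuals(fit)^2)
ss.tot <- colSums(scale(Y, center = TRUE, scale = FALSE)^2)
r.squared <- 1 - ss.res / ss.tot
print(r.squared)
```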

Note that the printed output of summary(mvlm.res) truncates p-values to the smallest trustworthy values, but the object returned by summary(mvlm.res) contains the p-values as computed. If the error bound of the Davies algorithm is larger than the p-value, the only conclusion that can be drawn with certainty is that the p-value is smaller than (or equal to) the error bound.
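
The rule for reading a p-value against its error bound can be sketched as follows. This is an illustration of the logic only, not the package's printing code, and the vectors `p.value` and `p.prec` hold made-up numbers.

```r
# Hypothetical p-values and Davies error bounds for three effects.
p.value <- c(3.2e-02, 4.0e-15, 1.0e-25)
p.prec  <- c(1.0e-20, 1.0e-20, 1.0e-20)

# A p-value is only resolved when it exceeds its error bound.
resolved <- p.value > p.prec

# Otherwise, report the bound as an upper limit instead of the raw value.
reported <- ifelse(
  resolved,
  formatC(p.value, format = "e", digits = 1),
  paste0("<= ", formatC(p.prec, format = "e", digits = 1))
)
print(reported)
```

Here the third effect has a computed p-value below the error bound, so it is reported as an upper limit rather than as an exact value.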

Author(s)

Daniel B. McArtor (dmcartor@nd.edu) [aut, cre]

References

Davies, R. B. (1980). The Distribution of a Linear Combination of chi-square Random Variables. Journal of the Royal Statistical Society. Series C (Applied Statistics), 29(3), 323-333.

Duchesne, P., & De Micheaux, P.L. (2010). Computing the distribution of quadratic forms: Further comparisons between the Liu-Tang-Zhang approximation and exact methods. Computational Statistics and Data Analysis, 54(4), 858-862.

McArtor, D. B., Grasman, R. P. P. P., Lubke, G. H., & Bergeman, C. S. (under review). A new approach to conducting linear model hypothesis tests with a multivariate outcome.

Examples

data(mvlmdata)

Y <- as.matrix(Y.mvlm)

# Main effects model
mvlm.res <- mvlm(Y ~ Cont + Cat + Ord, data = X.mvlm)
summary(mvlm.res)

# Include two-way interactions
mvlm.res.int <- mvlm(Y ~ .^2, data = X.mvlm)
summary(mvlm.res.int)

[Package MVLM version 0.1.4 Index]