simulateLRT {DHARMa} | R Documentation |
Simulated likelihood ratio tests for (generalized) linear mixed models
Description
This function uses the DHARMa model wrappers to generate simulated likelihood ratio tests (LRTs) for (generalized) linear mixed models based on a parametric bootstrap. The motivation for using a simulated LRT rather than a standard ANOVA or AIC for model selection in mixed models is that the degrees of freedom (df) of mixed models are not clearly defined. A standard ANOVA based on chi-squared statistics, or AIC, is therefore unreliable, in particular for models in which random effects (REs) contribute strongly to the likelihood.
The results are interpreted as in a normal LRT: the null hypothesis is that m0 is correct, and the test checks whether the increase in likelihood of m1 is higher than expected, using data simulated from m0.
Usage
simulateLRT(m0, m1, n = 250, seed = 123, plot = T,
suppressWarnings = T, saveModels = F, ...)
Arguments
m0 |
Null Model |
m1 |
Alternative Model |
n |
number of simulations |
seed |
random seed |
plot |
whether null distribution should be plotted |
suppressWarnings |
whether to suppress warnings that occur during refitting the models to simulated data. See details for explanations |
saveModels |
Whether to save refitted models |
... |
additional parameters to pass on to the simulate function of the model object. |
Details
The function performs a simulated LRT, which works as follows:
H0: Model 0 (M0, the null model) is correct
Our test statistic is the log likelihood ratio (LR) of M1/M0. The empirical value will always be >= 0 (that is, the LR itself is >= 1) because in a nested setting, the more complex model cannot have a worse likelihood.
To generate an expected distribution of the test statistic under H0, we simulate new response data under M0, refit M0 and M1 on this data, and calculate the LRs.
Based on this, calculate p-values etc. in the usual way.
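The steps above can be sketched by hand. The following is a minimal illustration in base R, not the DHARMa implementation: it assumes two nested Poisson GLMs instead of mixed models so that it runs without extra packages, and the names dat, logLR, nSim, observed, simulated and pValue are hypothetical. The p-value is taken as the plain share of simulated statistics at least as large as the observed one, which is one common convention and may differ from what simulateLRT reports.

```r
# Sketch of a parametric-bootstrap LRT with base R only (illustrative, not DHARMa internals)
set.seed(1)
n <- 100
dat <- data.frame(x = runif(n))
dat$y <- rpois(n, lambda = exp(0.5))            # response generated without an effect of x

m0 <- glm(y ~ 1, data = dat, family = poisson)  # null model
m1 <- glm(y ~ x, data = dat, family = poisson)  # alternative model; m0 is nested in m1

# Log likelihood ratio of the larger model b against the smaller model a
logLR <- function(a, b) as.numeric(logLik(b)) - as.numeric(logLik(a))
observed <- logLR(m0, m1)

# Null distribution: simulate responses from m0, refit both models, recompute the statistic
nSim <- 200
sims <- simulate(m0, nsim = nSim)               # data frame with one column per simulation
simulated <- numeric(nSim)
for (i in seq_len(nSim)) {
  newDat <- dat
  newDat$y <- sims[[i]]
  r0 <- glm(y ~ 1, data = newDat, family = poisson)
  r1 <- glm(y ~ x, data = newDat, family = poisson)
  simulated[i] <- logLR(r0, r1)
}

# One-sided p-value: share of simulated statistics at least as large as the observed one
pValue <- mean(simulated >= observed)
```

With mixed models the same loop applies, but the refits are slower and more likely to throw convergence or singular-fit warnings, which is why the function exposes suppressWarnings and saveModels.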
About warnings: warnings such as "boundary (singular) fit: see ?isSingular" will likely occur in this function and are not necessarily a sign of a problem. lme4 warns if RE variances are estimated as zero. This is expected in this case, however, because we are simulating data with zero RE variances. Therefore, warnings are turned off by default. For diagnostic purposes, you can turn warnings on, and possibly also inspect the fitted models via the parameter saveModels to see if there are any other problems in the refitted models.
Note
The logic of an LRT assumes that m0 is nested in m1, which guarantees that L(M1) >= L(M0). The function does not explicitly check whether the models are nested and will work as long as data can be simulated from M0 that can be refit with M0 and M1; however, I would strongly advise against using this for non-nested models unless you have a good statistical reason for doing so.
Also, note that LRTs may be unreliable when models are fit with REML or some other kind of penalized / restricted ML. Therefore, you should fit the models with ML for use in this function.
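As a small illustration of the ML requirement, assuming linear mixed models fit with lme4's lmer (which uses REML by default), ML estimation can be requested with REML = FALSE. The sleepstudy data set ships with lme4; the two models here are only an example pair.

```r
library(lme4)

# Fit both models with ML rather than the lmer default of REML
m0 <- lmer(Reaction ~ 1 + (1 | Subject), data = sleepstudy, REML = FALSE)
m1 <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy, REML = FALSE)

# isREML() reports which criterion a fitted model used
isREML(m0)
isREML(m1)
```

glmer has no REML argument and fits by maximum likelihood, so this only matters for lmer fits.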
Author(s)
Florian Hartig
Examples
library(DHARMa)
library(lme4)
set.seed(123)
dat <- createData(sampleSize = 200, randomEffectVariance = 1)
m1 = glmer(observedResponse ~ Environment1 + (1|group), data = dat, family = "poisson")
m0 = glm(observedResponse ~ Environment1 , data = dat, family = "poisson")
## Not run:
out = simulateLRT(m0, m1, n = 10)
# If the refits produce warnings, show them and save the refitted models for inspection
out = simulateLRT(m0, m1, saveModels = T, suppressWarnings = F, n = 10)
summary(out$saveModels[[2]]$refittedM1) # RE SD = 0
# Could try changing the optimizer to reduce warnings
## End(Not run)