rpt {rptR} | R Documentation |
Repeatability Estimation for Gaussian and Non-Gaussian Data
Description
A wrapper function for (adjusted) repeatability estimation from generalized linear mixed-effects models fitted by restricted maximum likelihood (REML). Calls specialised functions depending of the choice of datatype and method.
Usage
rpt(formula, grname, data, datatype = c("Gaussian", "Binary",
"Proportion", "Poisson"), link = c("logit", "probit", "log", "sqrt"),
CI = 0.95, nboot = 1000, npermut = 0, parallel = FALSE,
ncores = NULL, ratio = TRUE, adjusted = TRUE, expect = "meanobs",
rptObj = NULL, update = FALSE)
Arguments
formula |
Formula as used e.g. by lmer. The grouping factor(s) of interest needs to be included as a random effect, e.g. '(1|groups)'. Covariates and additional random effects can be included to estimate adjusted repeatabilities. |
grname |
A character string or vector of character strings giving the
name(s) of the grouping factor(s), for which the repeatability should
be estimated. Spelling needs to match the random effect names as given in |
data |
A dataframe that contains the variables included in the |
datatype |
Character string specifying the data type ('Gaussian', 'Binary', 'Proportion', 'Poisson'). |
link |
Character string specifying the link function. Ignored for 'Gaussian' datatype. |
CI |
Width of the required confidence interval between 0 and 1 (defaults to 0.95). |
nboot |
Number of parametric bootstraps for interval estimation
(defaults to 1000). Larger numbers of bootstraps give a better
asymtotic CI, but may be time-consuming. Bootstrapping can be switch off by setting
|
npermut |
Number of permutations used when calculating asymptotic p-values
(defaults to 0). Larger numbers of permutations give a better
asymtotic p-values, but may be time-consuming (in particular when multiple grouping factors
are specified). Permutaton tests can be switch off by setting |
parallel |
Boolean to express if parallel computing should be applied (defaults to FALSE). If TRUE, bootstraps and permutations will be distributed across multiple cores. |
ncores |
Specifying the number of cores to use for parallelization. On default, all but one of the available cores are used. |
ratio |
Boolean to express if variances or ratios of variance should be estimated. If FALSE, the variance(s) are returned without forming ratios. If TRUE (the default) ratios of variances (i.e. repeatabilities) are estimated. |
adjusted |
Boolean to express if adjusted or unadjusted repeatabilities should be estimated. If TRUE (the default), the variances explained by fixed effects (if any) will not be part of the denominator, i.e. repeatabilities are calculated after controlling for variation due to covariates. If FALSE, the varianced explained by fixed effects (if any) will be added to the denominator. |
expect |
A character string specifying the method for estimating the expectation in Poisson models with log link and in Binomial models with logit link (in all other cases the agrument is ignored). The only valid terms are 'meanobs' and 'latent' (and 'liability for binary and proportion data). With the default 'meanobs', the expectation is estimated as the mean of the observations in the sample. With 'latent', the expectation is estimated from estiamtes of the intercept and variances on the link scale. While this is a preferred solution, it is susceptible to the distribution of fixed effect covariates and gives appropriate results typically only when all covariances are centered to zero. With 'liability' estimates follow formulae as presented in Nakagawa & Schielzeth (2010). Liability estimates tend to be slightly higher. |
rptObj |
The output of a rptR function. Can be specified in combination with update = TRUE to update bootstraps and permutations |
update |
If TRUE, the rpt object to be updated has to be inputted with the rptObj argument. The function just updates the permutations and bootstraps, so make sure to specify all other arguments excactly like for the rpt object specified in rptObj. |
Details
For datatype='Gaussian'
calls function rptGaussian,
for datatype='Poisson'
calls function rptPoisson,
for datatype='Binary'
calls function rptBinary,
for datatype='Proportion'
calls function rptProportion.
Confidence intervals and standard errors are estimated by parametric bootstrapping.
Under the assumption that the model is specified correctly, the fitted model can be used
to generate response values that could potentially be obversed. Differences between the original
data and the simulated response from the fitted model arise from sampling variation. The full model
is then fitted to each simuated response vector. The distribution of estimates across all
nboot
replicates represents the design- and model-specific sampling variance and hence
uncertainty of the estimates.
In addition to the likelihood-ratio test, the package uses permutation tests for null
hypothesis testing. The general idea is to randomize data under the null hypothesis of no effect
and then test in how many cases the estimates from the model reach or exceed those in the observed
data. In the simplest case, a permutation test randomizes the vector of group identities against
the response vector many times, followed by refitting the model and recalculating the repeatabilities.
This provides a null distribution for the case that group identities are unrelated to the response.
However, in more complex models involving multiple random effects and/or fixed effects, such a
procedure will also break the data structure between the grouping factor of interest and other
aspects of the experimental design. Therefore rptR
implements a more robust alternative
which works by fitting a model withouth the grouping factor of interest. It then adds the
randomized residuals to the fitted values of this model, followed by recalculating the repeatability
from the full model. This procedure maintains the general data structure and any effects other
than the grouping effect of interest. The number of permutations can be adjusted with the nperm
argument.
By the logic of a null hypothsis testing, the observed data is one possible (albeit maybe unlikely)
outcome under the null hypothesis. So the observed data is always included as one 'randomization' and
the P value can thus never be lower than 1/nperm
, because at least one randomization is as
exteme as the observed data.
Note also that the likelihood-ratio test, since testing variances at the boundary of the possible parameter range (i.e. against zero), uses a mixture distribution of Chi-square distrbutions with zero and one degree of freedom as a reference. This ist equivalent to deviding the P value derived from a Chi-square distribution with one degree of freedom by two.
Value
Returns an object of class rpt
. See specific functions for details.
Author(s)
Holger Schielzeth (holger.schielzeth@uni-jena.de), Shinichi Nakagawa (s.nakagawa@unsw.edu.au), Martin Stoffel (martin.adam.stoffel@gmail.com)
References
Nakagawa, S. & Schielzeth, H. (2010) Repeatability for Gaussian and non-Gaussian data: a practical guide for biologists. Biological Reviews 85: 935-956.
See Also
Examples
# load data
data(BeetlesBody)
data(BeetlesMale)
data(BeetlesFemale)
# prepare proportion data
BeetlesMale$Dark <- BeetlesMale$Colour
BeetlesMale$Reddish <- (BeetlesMale$Colour-1)*-1
BeetlesColour <- aggregate(cbind(Dark, Reddish) ~ Treatment + Population + Container,
data=BeetlesMale, FUN=sum)
# Note: nboot and npermut are set to 0 for speed reasons. Use larger numbers
# for the real analysis.
# gaussian data (example with a single random effect)
rpt(BodyL ~ (1|Population), grname="Population", data=BeetlesBody,
nboot=0, npermut=0, datatype = "Gaussian")
# poisson data (example with two grouping levels and adjusted for fixed effect)
rpt(Egg ~ Treatment + (1|Container) + (1|Population), grname=c("Population"),
data = BeetlesFemale, nboot=0, npermut=0, datatype = "Poisson")
## Not run:
# binary data (example with estimation of the fixed effect variance)
rpt(Colour ~ Treatment + (1|Container) + (1|Population),
grname=c("Population", "Container", "Fixed"),
data=BeetlesMale, nboot=0, npermut=0, datatype = "Binary", adjusted = FALSE)
# proportion data (example for the estimation of raw variances,
# including residual and fixed-effect variance)
rpt(cbind(Dark, Reddish) ~ Treatment + (1|Population),
grname=c("Population", "Residual", "Fixed"), data=BeetlesColour,
nboot=0, npermut=0, datatype = "Proportion", ratio=FALSE)
## End(Not run)