autoDI {DImodels} | R Documentation |
This function provides an automated way to fit a (limited) range of Diversity-Interactions (DI) models. Using one of several selection criteria, autoDI
will identify the best DI model from the range fitted via a three-step selection process (see Details for more information). While autoDI
can be a useful starting point for fitting DI models, its range of models is not exhaustive and additional model fitting or testing via DI
is likely to be required. For instance, autoDI
does not test for interactions of a treatment with other variables in the model.
autoDI(y, block, density, prop, treat, FG = NULL, data,
selection = c("Ftest","AIC","AICc","BIC","BICc"),
step0 = FALSE, step4 = TRUE)
y |
The column name of the response vector, which must be in quotes, for example, |
block |
The name of the block variable (if present), which must be in quotes, for example, |
density |
The name of the density variable (if present), which must be in quotes, for example, |
prop |
A vector of s column names identifying the species proportions in each row in the dataset. For example, if the species proportions columns are labelled p1 to p4, then |
treat |
The name of a column in the dataset containing the value of a treatment factor or covariate. The treatment name must be included in quotes, for example, |
FG |
If species are classified by g functional groups, this parameter gives a text list (of length s) of the functional group to which each species belongs. For example, for four grassland species with two grasses and two legumes: FG could be |
data |
Specify the dataset, for example, |
selection |
Selection method to be used in the automated model selection process. Options are |
step0 |
By default, |
step4 |
Step 4 performs a lack of fit test for the final model selected by steps 1 - 3. By default, |
What are Diversity-Interactions models?
Diversity-Interactions (DI) models (Kirwan et al 2009) are a set of tools for analysing and interpreting data from experiments that explore the effects of species diversity on community-level responses. We recommend that users of the DImodels
package read the short introduction to DI models (available at: DImodels
). Further information on DI models is available in Kirwan et al 2009 and Connolly et al 2013.
Checks on data prior to using autoDI
.
Before using autoDI
, check that the species proportions for each row in your dataset sum to one. See the 'Examples' section for code to do this. An error message will be generated if the proportions don't sum to one.
How does the autoDI
function work?
The autoDI
function identifies the 'best' Diversity-Interactions (DI) model from a specific range of proposed models using a three-step process of selection (Steps 1 to 3) and performs a lack of fit test on the selected model (Step 4). Only a limited subset of all possible models are examined under autoDI, for example, interactions involving experiment structural terms (block
, density
, treat
) are not explored. The autoDI
function can provide an excellent initial analysis, but often additional modelling and exploration will be required using the DI
function.
Steps 1-3 outlined below provide details on the automated model selection process followed by autoDI
. Step 4 is a lack of fit test for the selected model. Step 0 may also be included (step0 = TRUE
) as an initial step to sequentially test the inclusion of experiment structural variables (block
, density
, treat
), prior to fitting the DI models in Step 1.
All models in autoDI
Steps 1 and 2 are estimated using iteratively reweighted least squares, via the glm package. Profile likelihood estimation is used in Step 3 to estimate the parameter \theta
.
The default selection method used is selection = "Ftest"
, which will return the appropriate F test statistic value(s) and p-value(s) in each step. When any of the information theoretic approaches ("AIC"
, "AICc"
, "BIC"
or "BICc"
) are specified, the model with the lowest value is selected in each step, even if it is only the lowest by a tiny margin; therefore, it is recommended to examine the information theoretic values across all models.
Step 1
Five models are fitted, sequentially, each building on the previous model, and compared. If the experiment structural variables treat
, block
or density
are specified, they will be included as an additive factor or covariate in each of the five models, but interactions between them and the DI model terms will not be included or tested.
Assume that FG
(functional groups) have been specified. The five DI models are:
STR
: This model contains an intercept, and block, density and treatment if present; STR stand for 'structural' and represents experiment structural variables.
ID
: The species proportions are added to the STR model. The proportions sum to 1, and are included in the model as 0 + p1 + ... + ps
, where s is the number of species in the pool, as specified in the prop
option.
AV
: The terms in the ID model, plus a single 'average' pairwise interaction term. For the Switzerland
dataset, the single variable that is added to the model is computed as: AV = p1*p2 + p1*p3 + p1*p4 + p2*p3 + p2*p4 + p3*p4
.
FG
: The terms in the ID model, plus interaction terms related to functional groups. These terms describe the average effects of interaction between pairs of species within each functional group, and between pairs of species from different functional groups. For example, in the Switzerland
dataset there are four species with FG = c("Grass", "Grass", "Legume", "Legume")
, and there are six pairwise interactions, one between the two grasses, one between the two legumes, and four between grass and legume species. Grouping these interactions gives three terms, one for interactions between grasses, one for interactions between legumes and one for between a grass and a legume species. The model assumes that any grass will interaction with any legume in the same way and the 'between functional group grass-legume' variable is computed as: bfg_G_L = p1*p3 + p1*p4 + p2*p3 + p2*p4
. If there were more than two grasses in the dataset, this model would assume that any pair of grasses interact in the same way, similarly for legumes. If the FG
argument is not specified, this model is omitted from Step 1.
FULL
: The terms in the ID model, plus an interaction term for each pair of species. When there are s species, there are s*(s-1)/2 pairwise interaction terms, i.e., for the Switzerland
dataset with four species, there are six interactions that are each added to the model (in R formula syntax: p1:p2
, p1:p3
, p1:p4
, p2:p3
, p2:p4
and p3:p4
).
If the FG
argument is omitted, the FG
model will be replaced by:
ADD
: The terms in the ID model, plus a species specific 'additive' interaction term for each species. These terms measure the interactive contribution of each species with any other species and are denoted \lambda_i
for the ith species. The interaction between any pair of species i and j is computed as \lambda_i + \lambda_j
.
Step 2
If treat
is specified, the selected model in Step 1 includes the treat
variable (since all models in Step 1 include treat
if present). Here, the selected model from Step 1 is re-fitted without treat
and the models with and without the treatment are compared using the method specified by the selection
argument.
If treat
was not specified, this step is redundant.
Step 3
For the selected model in Step 2 (or Step 1 if no treat
was specified), a non-linear parameter theta
is included as a power on all pi*pj
components of each interaction variable in the model, and the models with and without theta
are compared using the method specified by the selection
argument. For example, for the Switzerland
data, with four species, for the AV model, the parameter theta
enters the model as:
AV_theta = (p1*p2)^theta + (p1*p3)^theta + (p1*p4)^theta + (p2*p3)^theta + (p2*p4)^theta + (p3*p4)^theta
In Steps 1 and 2, theta
is implicitly assumed to equal 1 (since, for example, (p1*p2)^1 = p1*p2), while in Step 3, theta is estimated using profile likelihood and then assessed for a difference from 1. In the profile likelihood estimation, theta
is tested across the range 0.01 to 1.5. Then, the two models (with theta
estimated and with theta
set to 1) are compared using the method specified by the selection
argument.
If there are no species interaction terms included in the best model selected in Step 1 or Step 2, this step is redundant. I.e., if there is no evidence of species interactions, then there is no need to test for theta
.
Details on DI models that include theta
are described in Connolly et al 2013.
Step 4
Step 4 provides a lack of fit test for the model selected by Steps 1 to 3. A factor called 'community' is created that has a level for each unique setting of the species proportions (as specified in the prop
argument). The 'reference' model includes all terms in the model that was selected by Steps 1 to 3, plus the factor community. The reference model is compared to the model selected by Steps 1 to 3 via an F-test (regardless of the selection
argument value), thus providing a lack of fit test.
Note, that the reference model is never intended to be a candidate model, it is only fitted for the purpose of testing lack of fit. If the test result is significant, it indicates that there is a lack of fit in the Diversity-Interactions model selected by autoDI
.
Note also, that the test will not work if all combinations of the species proportions are unique. In this instance, the option Step4 = FALSE
should be selected.
The function returns a list including
model_list |
a list with all fitted models without any treatment effects, with |
model_list_theta |
a list with all fitted models without any treatment effects, with |
model_list_treat |
a list with all fitted models including treatment effects, with |
model_list_theta_treat |
a list with all fitted models including treatment effects, with |
logLik |
a vector with the log-likelihood values for the fitted models |
AIC |
a vector with the AIC values for the fitted models |
AICc |
a vector with the AICc values for the fitted models |
BIC |
a vector with the BIC values for the fitted models |
BICc |
a vector with the BICc values for the fitted models |
df |
a vector with the residual d.f. values for the fitted models |
selected_model_code |
the model tag for the selected model |
selected_model |
the description of the selected model |
selected_model_obj |
the selected model object |
data |
the dataset including all manipulated variables |
family |
the |
Rafael A. Moral, John Connolly and Caroline Brophy
Connolly J, T Bell, T Bolger, C Brophy, T Carnus, JA Finn, L Kirwan, F Isbell, J Levine, A Lüscher, V Picasso, C Roscher, MT Sebastia, M Suter and A Weigelt (2013) An improved model to predict the effects of changing biodiversity levels on ecosystem function. Journal of Ecology, 101, 344-355.
Kirwan L, J Connolly, JA Finn, C Brophy, A Lüscher, D Nyfeler and MT Sebastia (2009) Diversity-interaction modelling - estimating contributions of species identities and interactions to ecosystem function. Ecology, 90, 2032-2038.
Other examples using autoDI
:
The sim1
dataset examples.
The sim2
dataset examples.
The sim3
dataset examples.
The sim4
dataset examples.
The sim5
dataset examples.
The Switzerland
dataset examples.
## Load the Switzerland data
data(Switzerland)
## Summarise the Switzerland data
summary(Switzerland)
## Check that the proportions sum to 1 (required for DI models)
## p1 to p4 are in the 4th to 7th columns in Switzerland
Switzerlandsums <- rowSums(Switzerland[4:7])
summary(Switzerlandsums)
## Perform automated model fitting on the Switzerland dataset
## Model selection by F-test
auto1 <- autoDI(y = "yield", density = "density", prop = c("p1","p2","p3","p4"),
treat = "nitrogen", FG = c("G", "G", "L", "L"), data = Switzerland,
selection = "Ftest")
summary(auto1)
## Using column numbers to indicate which columns contain the proportions and including Step 0
auto2 <- autoDI(y = "yield", density = "density", prop = 4:7, treat = "nitrogen",
FG = c("G", "G", "L", "L"), data = Switzerland, selection = "Ftest", step0 = TRUE)
summary(auto2)
## Exclude the FG (functional group) argument to include the additive species "ADD" model in Step 1
auto3 <- autoDI(y = "yield", density = "density", prop = 4:7, treat = "nitrogen",
data = Switzerland, selection = "Ftest")
summary(auto3)