autoDI {DImodels} R Documentation

## Automated Diversity-Interactions Model Fitting

### Description

This function provides an automated way to fit a (limited) range of Diversity-Interactions (DI) models. Using one of several selection criteria, autoDI will identify the best DI model from the range fitted via a three-step selection process (see Details for more information). While autoDI can be a useful starting point for fitting DI models, its range of models is not exhaustive and additional model fitting or testing via DI is likely to be required. For instance, autoDI does not test for interactions of a treatment with other variables in the model.

### Usage

autoDI(y, block, density, prop, treat, FG = NULL, data,
selection = c("Ftest","AIC","AICc","BIC","BICc"),
step0 = FALSE, step4 = TRUE)


### Arguments

 y The column name of the response vector, which must be in quotes, for example, y = "yield". block The name of the block variable (if present), which must be in quotes, for example, block = "block". If no blocking variable, omit this argument. density The name of the density variable (if present), which must be in quotes, for example, density = "density". If no density variable, omit this argument. prop A vector of s column names identifying the species proportions in each row in the dataset. For example, if the species proportions columns are labelled p1 to p4, then prop = c("p1","p2","p3","p4"). Alternatively, the column numbers can be specified, for example, prop = 4:7, where the species proportions are in the 4th to 7th columns. treat The name of a column in the dataset containing the value of a treatment factor or covariate. The treatment name must be included in quotes, for example, treat = "nitrogen". (Only one treatment or covariate is permitted in autoDI, but see DI for options involving more than one treatment.) If the treatment is a factor, the variable must already be specified as a factor prior to using autoDI. FG If species are classified by g functional groups, this parameter gives a text list (of length s) of the functional group to which each species belongs. For example, for four grassland species with two grasses and two legumes: FG could be FG = c("G","G","L","L"), where G stands for grass and L stands for legume. data Specify the dataset, for example, data = Switzerland. The dataset name should not appear in quotes. selection Selection method to be used in the automated model selection process. Options are "Ftest", "AIC", "AICc", "BIC" and "BICc". The default is selection = "Ftest". step0 By default, autoDI outputs steps 1 - 4, however, an initial step 0 can be included by specifying step0 = TRUE. This will fit a model with an intercept only, and will sequentially add in and test the inclusion of block, density and treat, if they are specified in the autoDI arguments. step4 Step 4 performs a lack of fit test for the final model selected by steps 1 - 3. By default, step4 = TRUE, but it can be omitted by specifying step4 = FALSE.

### Details

What are Diversity-Interactions models?

Diversity-Interactions (DI) models (Kirwan et al 2009) are a set of tools for analysing and interpreting data from experiments that explore the effects of species diversity on community-level responses. We recommend that users of the DImodels package read the short introduction to DI models (available at: DImodels). Further information on DI models is available in Kirwan et al 2009 and Connolly et al 2013.

Checks on data prior to using autoDI.

Before using autoDI, check that the species proportions for each row in your dataset sum to one. See the 'Examples' section for code to do this. An error message will be generated if the proportions don't sum to one.

How does the autoDI function work?

The autoDI function identifies the 'best' Diversity-Interactions (DI) model from a specific range of proposed models using a three-step process of selection (Steps 1 to 3) and performs a lack of fit test on the selected model (Step 4). Only a limited subset of all possible models are examined under autoDI, for example, interactions involving experiment structural terms (block, density, treat) are not explored. The autoDI function can provide an excellent initial analysis, but often additional modelling and exploration will be required using the DI function.

Steps 1-3 outlined below provide details on the automated model selection process followed by autoDI. Step 4 is a lack of fit test for the selected model. Step 0 may also be included (step0 = TRUE) as an initial step to sequentially test the inclusion of experiment structural variables (block, density, treat), prior to fitting the DI models in Step 1.

All models in autoDI Steps 1 and 2 are estimated using iteratively reweighted least squares, via the glm package. Profile likelihood estimation is used in Step 3 to estimate the parameter \theta.

The default selection method used is selection = "Ftest", which will return the appropriate F test statistic value(s) and p-value(s) in each step. When any of the information theoretic approaches ("AIC", "AICc", "BIC" or "BICc") are specified, the model with the lowest value is selected in each step, even if it is only the lowest by a tiny margin; therefore, it is recommended to examine the information theoretic values across all models.

Step 1

Five models are fitted, sequentially, each building on the previous model, and compared. If the experiment structural variables treat, block or density are specified, they will be included as an additive factor or covariate in each of the five models, but interactions between them and the DI model terms will not be included or tested.

Assume that FG (functional groups) have been specified. The five DI models are:

• STR: This model contains an intercept, and block, density and treatment if present; STR stand for 'structural' and represents experiment structural variables.

• ID: The species proportions are added to the STR model. The proportions sum to 1, and are included in the model as 0 + p1 + ... + ps, where s is the number of species in the pool, as specified in the prop option.

• AV: The terms in the ID model, plus a single 'average' pairwise interaction term. For the Switzerland dataset, the single variable that is added to the model is computed as: AV = p1*p2 + p1*p3 + p1*p4 + p2*p3 + p2*p4 + p3*p4.

• FG: The terms in the ID model, plus interaction terms related to functional groups. These terms describe the average effects of interaction between pairs of species within each functional group, and between pairs of species from different functional groups. For example, in the Switzerland dataset there are four species with FG = c("Grass", "Grass", "Legume", "Legume"), and there are six pairwise interactions, one between the two grasses, one between the two legumes, and four between grass and legume species. Grouping these interactions gives three terms, one for interactions between grasses, one for interactions between legumes and one for between a grass and a legume species. The model assumes that any grass will interaction with any legume in the same way and the 'between functional group grass-legume' variable is computed as: bfg_G_L = p1*p3 + p1*p4 + p2*p3 + p2*p4. If there were more than two grasses in the dataset, this model would assume that any pair of grasses interact in the same way, similarly for legumes. If the FG argument is not specified, this model is omitted from Step 1.

• FULL: The terms in the ID model, plus an interaction term for each pair of species. When there are s species, there are s*(s-1)/2 pairwise interaction terms, i.e., for the Switzerland dataset with four species, there are six interactions that are each added to the model (in R formula syntax: p1:p2, p1:p3, p1:p4, p2:p3, p2:p4 and p3:p4).

If the FG argument is omitted, the FG model will be replaced by:

• ADD: The terms in the ID model, plus a species specific 'additive' interaction term for each species. These terms measure the interactive contribution of each species with any other species and are denoted \lambda_i for the ith species. The interaction between any pair of species i and j is computed as \lambda_i + \lambda_j.

Step 2

If treat is specified, the selected model in Step 1 includes the treat variable (since all models in Step 1 include treat if present). Here, the selected model from Step 1 is re-fitted without treat and the models with and without the treatment are compared using the method specified by the selection argument.

If treat was not specified, this step is redundant.

Step 3

For the selected model in Step 2 (or Step 1 if no treat was specified), a non-linear parameter theta is included as a power on all pi*pj components of each interaction variable in the model, and the models with and without theta are compared using the method specified by the selection argument. For example, for the Switzerland data, with four species, for the AV model, the parameter theta enters the model as:

AV_theta = (p1*p2)^theta + (p1*p3)^theta + (p1*p4)^theta + (p2*p3)^theta + (p2*p4)^theta + (p3*p4)^theta

In Steps 1 and 2, theta is implicitly assumed to equal 1 (since, for example, (p1*p2)^1 = p1*p2), while in Step 3, theta is estimated using profile likelihood and then assessed for a difference from 1. In the profile likelihood estimation, theta is tested across the range 0.01 to 1.5. Then, the two models (with theta estimated and with theta set to 1) are compared using the method specified by the selection argument.

If there are no species interaction terms included in the best model selected in Step 1 or Step 2, this step is redundant. I.e., if there is no evidence of species interactions, then there is no need to test for theta.

Details on DI models that include theta are described in Connolly et al 2013.

Step 4

Step 4 provides a lack of fit test for the model selected by Steps 1 to 3. A factor called 'community' is created that has a level for each unique setting of the species proportions (as specified in the prop argument). The 'reference' model includes all terms in the model that was selected by Steps 1 to 3, plus the factor community. The reference model is compared to the model selected by Steps 1 to 3 via an F-test (regardless of the selection argument value), thus providing a lack of fit test.

Note, that the reference model is never intended to be a candidate model, it is only fitted for the purpose of testing lack of fit. If the test result is significant, it indicates that there is a lack of fit in the Diversity-Interactions model selected by autoDI.

Note also, that the test will not work if all combinations of the species proportions are unique. In this instance, the option Step4 = FALSE should be selected.

### Value

The function returns a list including

 model_list a list with all fitted models without any treatment effects, with estimate_theta = FALSE model_list_theta a list with all fitted models without any treatment effects, with estimate_theta = TRUE model_list_treat a list with all fitted models including treatment effects, with estimate_theta = FALSE model_list_theta_treat a list with all fitted models including treatment effects, with estimate_theta = TRUE logLik a vector with the log-likelihood values for the fitted models AIC a vector with the AIC values for the fitted models AICc a vector with the AICc values for the fitted models BIC a vector with the BIC values for the fitted models BICc a vector with the BICc values for the fitted models df a vector with the residual d.f. values for the fitted models selected_model_code the model tag for the selected model selected_model the description of the selected model selected_model_obj the selected model object data the dataset including all manipulated variables family the family passed to glm

### Author(s)

Rafael A. Moral, John Connolly and Caroline Brophy

### References

Connolly J, T Bell, T Bolger, C Brophy, T Carnus, JA Finn, L Kirwan, F Isbell, J Levine, A Lüscher, V Picasso, C Roscher, MT Sebastia, M Suter and A Weigelt (2013) An improved model to predict the effects of changing biodiversity levels on ecosystem function. Journal of Ecology, 101, 344-355.

Kirwan L, J Connolly, JA Finn, C Brophy, A Lüscher, D Nyfeler and MT Sebastia (2009) Diversity-interaction modelling - estimating contributions of species identities and interactions to ecosystem function. Ecology, 90, 2032-2038.

DI

Other examples using autoDI: The sim1 dataset examples. The sim2 dataset examples. The sim3 dataset examples. The sim4 dataset examples. The sim5 dataset examples. The Switzerland dataset examples.

### Examples


data(Switzerland)
## Summarise the Switzerland data
summary(Switzerland)

## Check that the proportions sum to 1 (required for DI models)
## p1 to p4 are in the 4th to 7th columns in Switzerland
Switzerlandsums <- rowSums(Switzerland[4:7])
summary(Switzerlandsums)

## Perform automated model fitting on the Switzerland dataset

## Model selection by F-test
auto1 <- autoDI(y = "yield", density = "density", prop = c("p1","p2","p3","p4"),
treat = "nitrogen", FG = c("G", "G", "L", "L"), data = Switzerland,
selection = "Ftest")
summary(auto1)

## Using column numbers to indicate which columns contain the proportions and including Step 0
auto2 <- autoDI(y = "yield", density = "density", prop = 4:7, treat = "nitrogen",
FG = c("G", "G", "L", "L"), data = Switzerland, selection = "Ftest", step0 = TRUE)
summary(auto2)

## Exclude the FG (functional group) argument to include the additive species "ADD" model in Step 1
auto3 <- autoDI(y = "yield", density = "density", prop = 4:7, treat = "nitrogen",
data = Switzerland, selection = "Ftest")
summary(auto3)



[Package DImodels version 1.1 Index]