| Zeta.msgdm {zetadiv} | R Documentation | 
Multi-site generalised dissimilarity modelling for a set of environmental variables and distances
Description
Computes a regression model of zeta diversity for a given order (number of assemblages or sites) against a set of environmental variables and distances between sites. The different regression models available are generalised linear models, generalised linear models with negative constraints, generalised additive models, shape constrained additive models, and I-splines.
Usage
Zeta.msgdm(
  data.spec,
  data.env,
  xy = NULL,
  data.spec.pred = NULL,
  order = 1,
  sam = 1000,
  reg.type = "glm",
  family = stats::gaussian(),
  method.glm = "glm.fit.cons",
  cons = -1,
  cons.inter = 1,
  confint.level = 0.95,
  bs = "mpd",
  kn = -1,
  order.ispline = 2,
  kn.ispline = 1,
  distance.type = "Euclidean",
  dist.custom = NULL,
  rescale = FALSE,
  rescale.pred = TRUE,
  method = "mean",
  normalize = FALSE,
  silent = FALSE,
  empty.row = 0,
  control = list(),
  glm.init = FALSE
)
Arguments
| data.spec | Site-by-species presence-absence data frame, with sites as rows and species as columns. | 
| data.env | Site-by-variable data frame, with sites as rows and environmental variables as columns. | 
| xy | Site coordinates, to account for distances between sites. | 
| data.spec.pred | Site-by-species presence-absence data frame or list of data frames, with sites as rows and species as columns, for which zeta diversity will be computed and used as a predictor of the zeta diversity of  | 
| order | Specific number of assemblages or sites at which zeta diversity is computed. | 
| sam | Number of samples for which the zeta diversity is computed. | 
| reg.type | Type of regression used in the multi-site generalised dissimilarity modelling. Options are " | 
| family | A description of the error distribution and link function to be used in the  | 
| method.glm | Method used in fitting the generalised linear model. The default method  | 
| cons | Type of constraint in the glm if  | 
| cons.inter | Type of constraint for the intercept. Default is 1 for a positive intercept, suitable for the Gaussian family. The other option is -1 for a negative intercept, suitable for the binomial family. | 
| confint.level | Confidence level, given as a proportion (e.g. 0.95), for the confidence intervals of the coefficients from the generalised linear models. | 
| bs | A two-letter character string indicating the (penalized) smoothing basis to use in the scam model. Default is " | 
| kn | Number of knots in the GAM and SCAM. Default is -1, in which case kn is determined automatically using generalised cross-validation. | 
| order.ispline | Order of the I-spline. | 
| kn.ispline | Number of knots in the I-spline. | 
| distance.type | Method to compute distance. Default is " | 
| dist.custom | Distance matrix provided by the user when  | 
| rescale | Boolean value (TRUE or FALSE) indicating if the zeta values should be divided by the total number of species in the dataset, to get a range of values between 0 and 1. Has no effect if  | 
| rescale.pred | Boolean value (TRUE or FALSE) indicating if the spatial distances and differences in environmental variables should be rescaled between 0 and 1. | 
| method | Name of a function (as a string) indicating how to combine the pairwise differences and distances for more than 3 sites. It can be a basic R-function such as " | 
| normalize | Indicates if the zeta values for each sample should be divided by the total number of species for this specific sample ( | 
| silent | Boolean value (TRUE or FALSE) indicating if warnings must be printed. | 
| empty.row | Determines how to handle empty rows, i.e. sites with no species. Such sites can cause underestimations of zeta diversity, and computation errors for the normalized version of zeta due to divisions by 0. Options are " | 
| control | As for  | 
| glm.init | Boolean value, indicating if the initial parameters for fitting the glm with constraint on the coefficients signs for  | 
Details
The environmental variables can be numeric or factorial.
If order = 1, the variables are used as such in the regression, and factorial variables must be dummy-coded for the output of the regression to be interpretable.
For numeric variables, if order > 1 the pairwise differences between sites are computed and combined according to method. For factorial variables, the distance corresponds to the number of unique values divided by the number of assemblages or sites specified by order.
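As an illustration of the two rules above, here is a minimal base-R sketch for a single sample of order = 3 sites. The helper names combine_numeric and combine_factor and the data values are hypothetical, and the use of absolute differences is an assumption; this is not the package's internal code.

```r
# Hypothetical helpers; a base-R sketch, not zetadiv's internal code.
combine_numeric <- function(x, method = "mean") {
  # All pairwise absolute differences among the sampled sites (assumption)
  pair.diff <- as.vector(stats::dist(x))
  # Combine them with the function named by `method`
  do.call(method, list(pair.diff))
}

combine_factor <- function(f, order = length(f)) {
  # Number of distinct values observed, divided by the number of sites
  length(unique(f)) / order
}

temperature <- c(10, 14, 19)  # made-up values for three sites
habitat <- factor(c("forest", "forest", "grassland"))

num.mean <- combine_numeric(temperature, "mean")  # mean of 4, 9 and 5
num.max <- combine_numeric(temperature, "max")    # largest of 4, 9 and 5
fac.dist <- combine_factor(habitat)               # 2 distinct values over 3 sites
```

Because method is passed as a function name, any function that reduces a numeric vector to a single value could be substituted in this sketch.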
If xy = NULL, Zeta.msgdm only uses environmental variables in the regression. Otherwise, it also computes and uses Euclidean distance (average or maximum distance between multiple sites, depending on the parameter method) as an explanatory variable.
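The distance predictor described above can be sketched in base R as follows. The helper site_distance and the coordinates are hypothetical and only illustrate how pairwise Euclidean distances among the sampled sites could be reduced to one value; the package may compute this differently.

```r
# Hypothetical helper; a base-R sketch, not zetadiv's internal code.
site_distance <- function(xy, method = "mean") {
  # Pairwise Euclidean distances among the sampled sites
  pair.dist <- as.vector(stats::dist(xy, method = "euclidean"))
  # Reduce them to one value with the function named by `method`
  do.call(method, list(pair.dist))
}

# Made-up coordinates for three sites; pairwise distances are 3, 4 and 5
xy.sample <- data.frame(x = c(0, 3, 0), y = c(0, 0, 4))

dist.mean <- site_distance(xy.sample, "mean")
dist.max <- site_distance(xy.sample, "max")
```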
If rescale.pred = TRUE, zeta is regressed against the differences of values of the environmental variables divided by the maximum difference for each variable, to be rescaled between 0 and 1. If !is.null(xy), distances between sites are also divided by the maximum distance. If order = 1, the variables are transformed by first subtracting their minimum value, and dividing by the difference of their maximum and minimum values.
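For order > 1, the rescaling described above amounts to dividing each predictor by its maximum. A minimal sketch, assuming a made-up predictor table with one row per sample of sites (the column names and values are hypothetical):

```r
# Made-up predictors: combined environmental difference and distance per sample
pred <- data.frame(temp.diff = c(2, 5, 10), distance = c(100, 250, 400))

# Maximum of each predictor, kept so the same scaling can be reapplied later
max.diff <- sapply(pred, max)

# Divide every column by its own maximum so that values lie between 0 and 1
pred.rescaled <- as.data.frame(lapply(pred, function(v) v / max(v)))
```

Keeping the divisors is useful when predictions on new data need to be put on the same scale as the fitted model.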
If reg.type = "ispline", the variables are rescaled between 0 and 1 prior to computing the I-splines by subtracting their minimum value, and dividing by the difference of their maximum and minimum values.
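The min-max transformation described above can be sketched in base R as follows. The helper rescale01 and the data are hypothetical; the vectors range.min and range.max mirror the components of the same name listed under Value, but this is an illustration rather than the package's own code.

```r
# Hypothetical helper: subtract the minimum, divide by (maximum - minimum)
rescale01 <- function(v, v.min = min(v), v.max = max(v)) {
  (v - v.min) / (v.max - v.min)
}

# Made-up environmental data for three sites
env <- data.frame(temp = c(5, 10, 25), rain = c(200, 800, 500))

# Store the ranges so new values can be rescaled consistently
range.min <- sapply(env, min)
range.max <- sapply(env, max)

# Rescale each variable between 0 and 1 using its stored range
env.01 <- as.data.frame(Map(rescale01, env, range.min, range.max))

# Reuse the stored range to rescale a new temperature value
new.temp.01 <- unname(rescale01(15, range.min["temp"], range.max["temp"]))
```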
Value
Zeta.msgdm returns a list whose components vary depending on the regression technique. The list can contain the following components:
| val | Vector of zeta values used in the MS-GDM. | 
| predictors | Data frame of the predictors used in the MS-GDM. | 
| range.min | Vector containing the minimum values of the numeric variables, used for rescaling the variables between 0 and 1 for I-splines (see Details). | 
| range.max | Vector containing the maximum values of the numeric variables, used for rescaling the variables between 0 and 1 for I-splines (see Details). | 
| rescale.factor | Factor by which the predictors were divided if  | 
| order.ispline | The value of the original parameter, to be used in  | 
| kn.ispline | The value of the original parameter, to be used in  | 
| model | An object whose class depends on the type of regression ( | 
| confint | The confidence intervals for the coefficients from generalised linear models with no constraint.  | 
| vif | The variance inflation factors for all the variables for the generalised linear regression.  | 
References
Hui C. & McGeoch M.A. (2014). Zeta diversity as a concept and metric that unifies incidence-based biodiversity patterns. The American Naturalist, 184, 684-694.
Ferrier, S., Manion, G., Elith, J., & Richardson, K. (2007). Using generalized dissimilarity modelling to analyse and predict patterns of beta diversity in regional biodiversity assessment. Diversity and Distributions, 13(3), 252-264.
See Also
Zeta.decline.mc, Zeta.order.mc, Zeta.decline.ex, Zeta.order.ex, Predict.msgdm,
Examples
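# Bird data: columns 1-2 hold the site coordinates, columns 3-193 the species
utils::data(bird.spec.coarse)
xy.bird <- bird.spec.coarse[1:2]
data.spec.bird <- bird.spec.coarse[3:193]
utils::data(bird.env.coarse)
data.env.bird <- bird.env.coarse[,3:9]
# Default regression (reg.type = "glm") on environmental variables only, order 3
zeta.glm <- Zeta.msgdm(data.spec.bird, data.env.bird, sam = 100, order = 3)
zeta.glm
dev.new()
graphics::plot(zeta.glm$model)
# reg.type = "ngls", adding distances between sites (xy.bird) and rescaling zeta
# by the total number of species
zeta.ngls <- Zeta.msgdm(data.spec.bird, data.env.bird, xy.bird, sam = 100, order = 3,
    reg.type = "ngls", rescale = TRUE)
zeta.ngls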
##########
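# Marion data: columns 1-2 hold the site coordinates, columns 3-33 the species
utils::data(Marion.species)
xy.marion <- Marion.species[1:2]
data.spec.marion <- Marion.species[3:33]
utils::data(Marion.env)
data.env.marion <- Marion.env[3]
# Generalised additive model on a single environmental variable, order 3
zeta.gam <- Zeta.msgdm(data.spec.marion, data.env.marion, sam = 100, order = 3,
    reg.type = "gam")
zeta.gam
dev.new()
graphics::plot(zeta.gam$model)
# I-spline regression including distances between sites, with normalize = "Jaccard"
zeta.ispline <- Zeta.msgdm(data.spec.marion, data.env.marion, xy.marion, sam = 100,
    order = 3, normalize = "Jaccard", reg.type = "ispline")
zeta.ispline
# Compute the I-splines from the fitted model, then plot them
zeta.ispline.r <- Return.ispline(zeta.ispline, data.env.marion, distance = TRUE)
zeta.ispline.r
dev.new()
Plot.ispline(isplines = zeta.ispline.r, distance = TRUE)
# Alternatively, plot from the fitted model and the environmental data
dev.new()
Plot.ispline(msgdm = zeta.ispline, data.env = data.env.marion, distance = TRUE)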