gdm {gdm} R Documentation

Fit a Generalized Dissimilarity Model to Tabular Site-Pair Data

Description

The gdm function fits a generalized dissimilarity model to tabular site-pair data. The table is produced by the formatsitepair function and has the following columns: distance, weights, s1.xCoord, s1.yCoord, s2.xCoord, s2.yCoord, s1.Pred1, s1.Pred2, ..., s1.PredN, s2.Pred1, s2.Pred2, ..., s2.PredN.

The distance column contains the response variable, which must be a ratio-based dissimilarity (distance) measure between Site 1 and Site 2. The weights column defines any weighting to be applied during fitting of the model. If equal weighting is required, all entries in this column should be set to 1.0 (default). The third and fourth columns, s1.xCoord and s1.yCoord, hold the spatial coordinates of the first site in the site pair (s1). The fifth and sixth columns, s2.xCoord and s2.yCoord, hold the coordinates of the second site (s2). The first six columns are REQUIRED even if you do not intend to use geographic distance as a predictor; in that case they can be filled with dummy data if the actual coordinates are unknown. The next N*2 columns contain values for N predictors for Site 1, followed by values for the same N predictors for Site 2.

The following is an example of a GDM input table header with three environmental predictors (Temp, Rain, Bedrock):

distance, weights, s1.xCoord, s1.yCoord, s2.xCoord, s2.yCoord, s1.Temp, s1.Rain, s1.Bedrock, s2.Temp, s2.Rain, s2.Bedrock
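As a minimal sketch of the layout described above, the following base R code builds a toy site-pair table by hand and checks its structure. All values are made up for illustration; in practice the table comes from formatsitepair. The last step computes a plain Euclidean distance from the coordinate columns, which is an assumption about what a geographic distance term could look like, not a statement of how gdm computes it internally.

```r
# Hypothetical toy data: two site pairs, three predictors (Temp, Rain, Bedrock).
sitePairs <- data.frame(
  distance   = c(0.35, 0.80),   # observed dissimilarity (response)
  weights    = c(1, 1),         # equal weighting
  s1.xCoord  = c(100, 100),
  s1.yCoord  = c(200, 200),
  s2.xCoord  = c(103, 160),
  s2.yCoord  = c(204, 280),
  s1.Temp    = c(18.2, 18.2),
  s1.Rain    = c(650, 650),
  s1.Bedrock = c(1, 1),
  s2.Temp    = c(17.9, 12.4),
  s2.Rain    = c(700, 1100),
  s2.Bedrock = c(1, 3)
)

# The first six columns are required and must appear in this order.
required <- c("distance", "weights",
              "s1.xCoord", "s1.yCoord", "s2.xCoord", "s2.yCoord")
stopifnot(identical(names(sitePairs)[1:6], required))

# The remaining columns hold N predictors for Site 1, then the same N for Site 2.
predCols <- names(sitePairs)[-(1:6)]
nPreds   <- length(predCols) / 2
s1Names  <- sub("^s1\\.", "", predCols[seq_len(nPreds)])
s2Names  <- sub("^s2\\.", "", predCols[nPreds + seq_len(nPreds)])
stopifnot(identical(s1Names, s2Names))

# Assumed illustration: Euclidean distance between the two sites of each pair.
geoDist <- sqrt((sitePairs$s1.xCoord - sitePairs$s2.xCoord)^2 +
                (sitePairs$s1.yCoord - sitePairs$s2.yCoord)^2)
```

A distance computed this way is only meaningful when the coordinates are in a projected coordinate system.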

Usage

gdm(data, geo=FALSE, splines=NULL, knots=NULL)

Arguments

data

A data frame containing the site pairs to be used to fit the GDM (obtained using the formatsitepair function). The observed response data must be located in the first column. The weights to be applied to each site pair must be located in the second column. If geo is TRUE, then the s1.xCoord, s1.yCoord and s2.xCoord, s2.yCoord columns will be used to calculate the geographic distance between site pairs for inclusion as the geographic predictor term in the model. Site coordinates ideally should be in a projected coordinate system (i.e., not longitude-latitude) to ensure proper calculation of geographic distances. If geo is FALSE (default), then the s1.xCoord, s1.yCoord, s2.xCoord and s2.yCoord data columns must still be included, but are ignored in fitting the model. Columns containing the predictor data for Site 1, and the predictor data for Site 2, follow.

geo

Set to TRUE if geographic distance between sites is to be included as a model term. Set to FALSE if geographic distance is to be omitted from the model. Default is FALSE.

splines

An optional vector of the number of I-spline basis functions to be used for each predictor in fitting the model. If supplied, it must have the same length as the number of predictors (including geographic distance if geo is TRUE). If this vector is not provided (splines=NULL), then a default of 3 basis functions is used for all predictors.

knots

An optional vector of knots, in units of the predictor variables, to be used in the fitting process. If knots are supplied and splines=NULL, then the knots vector must have length equal to the number of predictors * n, where n is the number of knots per predictor (default=3). If both knots and splines are supplied, then the length of the knots vector must equal the sum of the values in the splines vector. With the default three I-spline basis functions, the default knots are placed at the 0 (minimum), 50 (median), and 100 (maximum) percentiles of each predictor.
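The length rules for splines and knots can be sketched in base R. The predictor values and the helper names defaultKnots and checkKnots below are hypothetical and are not part of the gdm package. The sketch also assumes that knots are concatenated predictor by predictor, in the same order as the predictors.

```r
# Hypothetical predictor values (illustration only).
temp <- c(10, 12, 15, 18, 30)
rain <- c(400, 500, 650, 900, 1200)

# Three knots per predictor at the 0, 50 and 100 percentiles,
# i.e. the minimum, median and maximum.
defaultKnots <- function(x) unname(quantile(x, probs = c(0, 0.5, 1)))

# Assumed layout: knots for the first predictor, then the second, and so on.
knots <- c(defaultKnots(temp), defaultKnots(rain))

# Hypothetical helper: does the knots vector have the documented length?
checkKnots <- function(knots, nPreds, splines = NULL) {
  expected <- if (is.null(splines)) nPreds * 3 else sum(splines)
  length(knots) == expected
}

okDefault <- checkKnots(knots, nPreds = 2)                     # 6 knots expected
okCustom  <- checkKnots(knots, nPreds = 2, splines = c(3, 4))  # 7 knots expected
```

Here okDefault is TRUE because two predictors with three knots each need six values, while okCustom is FALSE because splines = c(3, 4) calls for seven.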

Value

gdm returns a gdm model object. The function summary.gdm can be used to obtain or print a synopsis of the results. A gdm model object is a list containing at least the following components:

dataname

The name of the table used as the data argument to the model.

geo

Whether geographic distance was used as a predictor in the model.

gdmdeviance

The deviance of the fitted GDM model.

nulldeviance

The deviance of the null model.

explained

The percentage of null deviance explained by the fitted GDM model.

intercept

The fitted value for the intercept term in the model.

predictors

A list of the names of the predictors that were used to fit the model, in order of the amount of turnover associated with each predictor (based on the sum of the I-spline coefficients).

coefficients

A list of the coefficients for each spline for each of the predictors considered in model fitting.

knots

A vector of the knots derived from the x data (or user defined), for each predictor.

splines

A vector of the number of I-spline basis functions used for each predictor.

creationdate

The date and time of model creation.

observed

The observed response for each site pair (from data column 1).

predicted

The predicted response for each site pair, from the fitted model (after applying the link function).

ecological

The linear predictor (ecological distance) for each site pair, from the fitted model (before applying the link function).
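The relationships among some of these components can be illustrated with made-up numbers. The sketch below assumes the link function described in Ferrier et al. (2007), in which predicted dissimilarity is 1 - exp(-ecological), and assumes that explained is computed as the percentage reduction from the null deviance. All values are hypothetical and are not output from a fitted model.

```r
# Hypothetical linear-predictor values for three site pairs.
ecological <- c(0, 0.5, 2)

# Assumed GDM link function: maps ecological distance onto [0, 1).
predicted <- 1 - exp(-ecological)

# Hypothetical deviances.
gdmdeviance  <- 40
nulldeviance <- 100

# Assumed definition: percentage of null deviance explained by the model.
explained <- 100 * (nulldeviance - gdmdeviance) / nulldeviance
```

Under these assumptions, an ecological distance of 0 gives a predicted dissimilarity of 0, and predicted values approach but never reach 1 as ecological distance grows.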

References

Ferrier S, Manion G, Elith J, Richardson K (2007) Using generalized dissimilarity modelling to analyse and predict patterns of beta diversity in regional biodiversity assessment. Diversity & Distributions 13, 252-264.

See Also

formatsitepair, summary.gdm, plot.gdm, predict.gdm, gdm.transform

Examples

 ## Fit a GDM using environmental data supplied as a table
 # build the site-pair table from the southwest data set
 head(southwest)
 sppData <- southwest[c(1,2,13,14)]
 envTab <- southwest[c(2:ncol(southwest))]

 sitePairTab <- formatsitepair(sppData, 2, XColumn="Long", YColumn="Lat", sppColumn="species",
                               siteColumn="site", predData=envTab)

 ## fit the GDM to the table-based site-pair data
 gdmTabMod <- gdm(sitePairTab, geo=TRUE)
 summary(gdmTabMod)

 ## Fit a GDM using environmental data supplied as rasters
 ## load the raster stack used to build the site-pair table
 rastFile <- system.file("./extdata/swBioclims.grd", package="gdm")
 envRast <- raster::stack(rastFile)

 ## build the site-pair table from the raster data
 sitePairRast <- formatsitepair(sppData, 2, XColumn="Long",
                                YColumn="Lat", sppColumn="species",
                                siteColumn="site", predData=envRast)
 ## raster extraction can produce NA values in the site-pair table;
 ## rows containing NA must be removed before fitting the GDM
 sitePairRast <- na.omit(sitePairRast)

 ## fit the GDM to the raster-based site-pair data
 gdmRastMod <- gdm(sitePairRast, geo=TRUE)
 summary(gdmRastMod)


[Package gdm version 1.5.0-9.1 Index]