brt_fit {dynamicSDM}    R Documentation

Fit boosted regression tree models to species distribution or abundance data.


Fit gradient boosted regression tree models to species distribution and abundance data and associated dynamic explanatory variables.


  n.trees = 5000,
  shrinkage = 0.001


a data frame, the data to fit boosted regression tree models to, containing columns for the model response and explanatory variables. If required, it should also contain the block.col and weights.col columns.


a character string, the name of the column in the input data frame containing the response variable.


a character vector, the names of the columns in the input data frame containing the model explanatory variables.


a character string, the model distribution family to use, such as gaussian, poisson or bernoulli.


optional; a character string, the name of the column in the input data frame containing spatio-temporal block numbers for splitting. See details for more information.


a character string, the name of the column in the input data frame containing spatio-temporal sampling effort weights to be used in the model fitting process.

optional; a data frame, the testing dataset for optimising interaction.depth when blocking is not used.


optional; an integer specifying the maximum depth of each tree (i.e. highest level of variable interactions allowed). Default optimises depth between 1 and 4.


optional; an integer, the number of trees in boosted regression tree models. Default is 5000.


optional; a numeric value, the shrinkage parameter applied to each tree in the boosted regression tree expansion. Also known as the learning rate. Default is 0.001.


This function fits a gradient boosting model (a gbm object) to the response and explanatory variable data provided, using the gbm R package (Greenwell et al., 2019).

Key functionality for dynamic SDMs within brt_fit() includes:

If interaction.depth is not given, then brt_fit() will vary the interaction depth parameter between 1 (an additive model) and 4 (four-way interaction model). For each interaction.depth value, model performance is measured by calculating the root-mean-square error of model predictions compared to actual values in the testing data. The interaction.depth value that results in the lowest root-mean-square error is used when fitting the returned model.

The model testing dataset can either be supplied directly as a separate data frame or derived from block.col (expanded on below).
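The depth search described above can be sketched directly with gbm. This is a minimal illustration under stated assumptions, not the internal code of brt_fit(): the data are simulated, the column names (temp, rain, ndvi, presence) are hypothetical, and n.trees and shrinkage are set to small, fast values rather than the brt_fit() defaults.

```r
library(gbm)

set.seed(42)

# Simulated presence/absence data; all column names are hypothetical.
n <- 600
dat <- data.frame(
  temp = rnorm(n),
  rain = rnorm(n),
  ndvi = rnorm(n)
)
dat$presence <- rbinom(n, 1, plogis(1.5 * dat$temp - dat$rain * dat$ndvi))

# Hold out roughly 25% of rows as the testing dataset.
is_train <- sample(c(TRUE, FALSE), size = n, replace = TRUE,
                   prob = c(0.75, 0.25))
training <- dat[is_train, ]
testing  <- dat[!is_train, ]

# Root-mean-square error of predictions against observed values.
rmse <- function(observed, predicted) sqrt(mean((observed - predicted)^2))

depths  <- 1:4
n_trees <- 300   # assumption: small value chosen for speed

# Fit one model per candidate depth and score it on the testing data.
test_rmse <- vapply(depths, function(d) {
  fit <- gbm(
    presence ~ temp + rain + ndvi,
    data = training,
    distribution = "bernoulli",
    n.trees = n_trees,
    interaction.depth = d,
    shrinkage = 0.05   # assumption: larger than 0.001 so few trees suffice
  )
  pred <- predict(fit, newdata = testing, n.trees = n_trees,
                  type = "response")
  rmse(testing$presence, pred)
}, numeric(1))

# The depth with the lowest testing RMSE would be used for the final model.
best_depth <- depths[which.min(test_rmse)]
```

Here test_rmse holds one RMSE value per candidate depth and best_depth is the value that would be carried forward to the final fit.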

If block.col is specified, then each unique block is excluded in turn in a jack-knife approach following Bagchi et al. (2013). This approach uses each block as the model testing dataset in numerical order, whilst all other blocks are used as training data for the boosted regression tree model.

In this case, the function returns a list of fitted boosted regression tree models, one for each unique block in block.col.
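The leave-one-block-out procedure can be sketched as a loop over block identifiers. This is an illustrative sketch using gbm directly, not the implementation inside brt_fit(); the simulated data, the column names (temp, rain, block, abundance) and the tuning values are assumptions made for the example.

```r
library(gbm)

set.seed(1)

# Simulated abundance data with hypothetical spatio-temporal block IDs.
n <- 500
dat <- data.frame(
  temp  = rnorm(n),
  rain  = rnorm(n),
  block = sample(1:5, n, replace = TRUE)
)
dat$abundance <- rpois(n, exp(0.3 + 0.5 * dat$temp))

# Process blocks in numerical order.
blocks  <- sort(unique(dat$block))
n_trees <- 200   # assumption: small value chosen for speed

# For each block, train on every other block.
fits <- lapply(blocks, function(b) {
  training <- dat[dat$block != b, ]
  gbm(
    abundance ~ temp + rain,
    data = training,
    distribution = "poisson",
    n.trees = n_trees,
    interaction.depth = 2,
    shrinkage = 0.05
  )
})
names(fits) <- blocks

# Score each model on the block it never saw during training.
block_rmse <- mapply(function(fit, b) {
  testing <- dat[dat$block == b, ]
  pred <- predict(fit, newdata = testing, n.trees = n_trees,
                  type = "response")
  sqrt(mean((testing$abundance - pred)^2))
}, fits, blocks)
```

The resulting list has one model per block, which mirrors the shape of the return value described above.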

If block.col is not given, models are fit to all of the input data and a single gbm model is returned.

If weights.col is specified, records are weighted by their associated value in this column during model fitting. For instance, the user may wish to down-weight records collected at oversampled sites and times, and up-weight those from undersampled ones, to account for spatio-temporal biases in occurrence records (Stolar and Nielsen, 2015).
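As a rough illustration of record weighting, the sketch below builds a weights column and passes it to gbm through its weights argument. The inverse-effort weighting scheme and the column names (n_visits, effort_weight) are assumptions chosen for the example; they are not taken from brt_fit() or from Stolar and Nielsen (2015).

```r
library(gbm)

set.seed(7)

# Simulated records; n_visits is a hypothetical measure of sampling effort.
n <- 400
dat <- data.frame(
  temp     = rnorm(n),
  rain     = rnorm(n),
  n_visits = sample(1:20, n, replace = TRUE)
)
dat$presence <- rbinom(n, 1, plogis(dat$temp))

# Assumed scheme: weight each record by the inverse of its sampling effort,
# rescaled so that the weights average to 1.
inv_effort <- 1 / dat$n_visits
dat$effort_weight <- inv_effort / mean(inv_effort)

# Records from heavily sampled sites now contribute less to the loss.
fit_weighted <- gbm(
  presence ~ temp + rain,
  data = dat,
  weights = effort_weight,
  distribution = "bernoulli",
  n.trees = 200,            # assumption: small value chosen for speed
  interaction.depth = 1,
  shrinkage = 0.05
)
```

Rescaling the weights to a mean of 1 keeps the effective sample size comparable to the unweighted fit, which is a convenience rather than a requirement.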


Returns a gbm model object or list of gbm model objects.
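Because the return value is either a single model or a list of models, downstream code may find it convenient to normalise both cases to a list. The helper below is a hypothetical convenience function, not part of dynamicSDM. It tests the class rather than using is.list(), since a gbm object is itself a list.

```r
# Hypothetical helper: always return a list of models.
as_model_list <- function(fit) {
  if (inherits(fit, "gbm")) list(fit) else fit
}

# Stand-in objects carrying the "gbm" class, used only to show the behaviour.
single_fit  <- structure(list(), class = "gbm")
blocked_fit <- list(single_fit, single_fit, single_fit)

n_single  <- length(as_model_list(single_fit))    # one model
n_blocked <- length(as_model_list(blocked_fit))   # one model per block
```

With the result in list form, the same lapply() call can be used to predict from the model or models regardless of whether blocking was applied.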


Bagchi, R., Crosby, M., Huntley, B., Hole, D. G., Butchart, S. H. M., Collingham, Y., Kalra, M., Rajkumar, J., Rahmani, A. & Pandey, M. 2013. Evaluating the effectiveness of conservation site networks under climate change: accounting for uncertainty. Global Change Biology, 19, 1236-1248.

Greenwell, B., Boehmke, B., Cunningham, J., & GBM Developers. 2019. Package ‘gbm’. R package version, 2.

Stolar, J. & Nielsen, S. E. 2015. Accounting for spatially biased sampling effort in presence-only species distribution modelling. Diversity and Distributions, 21, 595-608.



split <- sample(c(TRUE, FALSE),
                size = nrow(sample_explan_data),
                replace = TRUE,
                prob = c(0.75, 0.25))

training <- sample_explan_data[split, ]
testing <- sample_explan_data[!split, ]

brt_fit( = training, = testing,
 response.col = "presence.absence",
 distribution = "bernoulli",
 varnames = colnames(training)[14:16],
 interaction.depth = 2)

[Package dynamicSDM version 1.3.4 Index]