R: Model predictions (Kriging)

predict.spmodel {spmodel}

R Documentation

Model predictions (Kriging)

Description

Predicted values and intervals based on a fitted model object.

Usage

## S3 method for class 'splm'
predict(
  object,
  newdata,
  se.fit = FALSE,
  interval = c("none", "confidence", "prediction"),
  level = 0.95,
  local,
  ...
)

## S3 method for class 'spautor'
predict(
  object,
  newdata,
  se.fit = FALSE,
  interval = c("none", "confidence", "prediction"),
  level = 0.95,
  local,
  ...
)

## S3 method for class 'splm_list'
predict(
  object,
  newdata,
  se.fit = FALSE,
  interval = c("none", "confidence", "prediction"),
  level = 0.95,
  local,
  ...
)

## S3 method for class 'spautor_list'
predict(
  object,
  newdata,
  se.fit = FALSE,
  interval = c("none", "confidence", "prediction"),
  level = 0.95,
  local,
  ...
)

## S3 method for class 'splmRF'
predict(object, newdata, local, ...)

## S3 method for class 'spautorRF'
predict(object, newdata, local, ...)

## S3 method for class 'splmRF_list'
predict(object, newdata, local, ...)

## S3 method for class 'spautorRF_list'
predict(object, newdata, local, ...)

## S3 method for class 'spglm'
predict(
  object,
  newdata,
  type = c("link", "response"),
  se.fit = FALSE,
  interval = c("none", "confidence", "prediction"),
  newdata_size,
  level = 0.95,
  local,
  var_correct = TRUE,
  ...
)

## S3 method for class 'spgautor'
predict(
  object,
  newdata,
  type = c("link", "response"),
  se.fit = FALSE,
  interval = c("none", "confidence", "prediction"),
  newdata_size,
  level = 0.95,
  local,
  var_correct = TRUE,
  ...
)

## S3 method for class 'spglm_list'
predict(
  object,
  newdata,
  type = c("link", "response"),
  se.fit = FALSE,
  interval = c("none", "confidence", "prediction"),
  newdata_size,
  level = 0.95,
  local,
  var_correct = TRUE,
  ...
)

## S3 method for class 'spgautor_list'
predict(
  object,
  newdata,
  type = c("link", "response"),
  se.fit = FALSE,
  interval = c("none", "confidence", "prediction"),
  newdata_size,
  level = 0.95,
  local,
  var_correct = TRUE,
  ...
)

Arguments

`object`	A fitted model object.
`newdata`	A data frame or `sf` object in which to look for variables with which to predict. If a data frame, `newdata` must contain all variables used by `formula(object)` and all variables representing coordinates. If an `sf` object, `newdata` must contain all variables used by `formula(object)` and coordinates are obtained from the geometry of `newdata`. If omitted, missing data from the fitted model object are used.
`se.fit`	A logical indicating if standard errors are returned. The default is `FALSE`.
`interval`	Type of interval calculation. The default is `"none"`. Other options are `"confidence"` (for confidence intervals) and `"prediction"` (for prediction intervals).
`level`	Tolerance/confidence level. The default is `0.95`.
`local`	An optional logical or list controlling the big data approximation. If omitted, `local` is set based on the observed data sample size (i.e., the sample size of the fitted model object): `TRUE` if the sample size exceeds 10,000 and `FALSE` otherwise. This default is used because the main computational burden of the big data approximation depends almost exclusively on the observed data sample size, not on the number of predictions desired (which may not be intuitive at first glance). If `local` is `FALSE`, no big data approximation is implemented. If a list is provided, the following elements control the big data approximation:
`method`: The big data approximation method. If `method = "all"`, all observations are used and `size` is ignored. If `method = "distance"`, the `size` data observations closest (in terms of Euclidean distance) to the observation requiring prediction are used. If `method = "covariance"`, the `size` data observations with the highest covariance with the observation requiring prediction are used. If random effects and partition factors are not used in estimation and the spatial covariance function is monotone decreasing, `"distance"` and `"covariance"` are equivalent. The default is `"covariance"`. Only used with models fit using `splm()` or `spglm()`.
`size`: The number of data observations to use when `method` is `"distance"` or `"covariance"`. The default is 100. Only used with models fit using `splm()` or `spglm()`.
`parallel`: If `TRUE`, parallel processing via the parallel package is automatically used. This can significantly speed up computations even when `method = "all"` (i.e., no big data approximation is used), as predictions are spread out over multiple cores. The default is `FALSE`.
`ncores`: If `parallel = TRUE`, the number of cores to parallelize over. The default is the number of available cores on your machine.
When `local` is a list, at least one list element must be provided to initialize default arguments for the other list elements. If `local` is `TRUE`, defaults are chosen such that `local` is transformed into `list(size = 100, method = "covariance", parallel = FALSE)`.
`...`	Other arguments. Only used for models fit using `splmRF()` or `spautorRF()` where `...` indicates other arguments to `ranger::predict.ranger()`.
`type`	The scale (`"link"` or `"response"`) of predictions obtained from models fit using `spglm()` or `spgautor()`. The default is `"link"`.
`newdata_size`	The `size` value for each observation in `newdata` used when predicting for the binomial family.
`var_correct`	A logical indicating whether to return the corrected prediction variances when predicting via models fit using `spglm()` or `spgautor()`. The default is `TRUE`.
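
As an illustration of the `local` argument described above, the following sketch requests predictions with and without the big data approximation. It assumes the `sulfate` and `sulfate_preds` data sets used in the Examples section; the object names `pred_all`, `pred_near`, and `pred_default` are chosen here for illustration only.

```r
library(spmodel)

# sulfate is an sf object, so coordinates are taken from its geometry
spmod <- splm(sulfate ~ 1, data = sulfate, spcov_type = "exponential")

# No big data approximation: all observations inform every prediction
pred_all <- predict(spmod, sulfate_preds, local = FALSE)

# For each prediction site, use only the 50 observations closest
# in Euclidean distance
pred_near <- predict(
  spmod,
  sulfate_preds,
  local = list(method = "distance", size = 50)
)

# local = TRUE uses the documented defaults:
# list(size = 100, method = "covariance", parallel = FALSE)
pred_default <- predict(spmod, sulfate_preds, local = TRUE)
```

With a data set this small the approximation is not needed; the calls are shown only to make the argument structure concrete.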

Details

For splm and spautor objects, the (empirical) best linear unbiased predictions (i.e., Kriging predictions) at each site are returned when interval is "none" or "prediction", alongside standard errors if se.fit is TRUE. Prediction intervals are also returned if interval is "prediction". When interval is "confidence", the estimated mean is returned alongside confidence intervals for the mean, and standard errors of the mean if se.fit is TRUE. For splm_list and spautor_list objects, predictions and associated intervals and standard errors are returned for each list element.
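
To make the best linear unbiased prediction concrete, here is a minimal base R sketch of the standard universal Kriging formulas on simulated data. It is not spmodel's implementation: the covariance parameters (`de`, `ie`, `range_par`) are treated as known rather than estimated, the helper `exp_cov()` and all object names are hypothetical, and a normal quantile is assumed for the interval.

```r
set.seed(1)

# Simulated observed sites and two prediction sites
n <- 30
coords <- cbind(runif(n), runif(n))
new_coords <- cbind(c(0.25, 0.75), c(0.5, 0.5))
m <- nrow(new_coords)

# Exponential covariance with partial sill de, nugget ie, and range range_par
de <- 2
ie <- 0.5
range_par <- 0.3
exp_cov <- function(h) de * exp(-h / range_par)

# Distances among all sites; observed sites come first
D <- as.matrix(dist(rbind(coords, new_coords)))
obs <- seq_len(n)
new <- n + seq_len(m)

Sigma <- exp_cov(D[obs, obs]) + ie * diag(n)  # Cov(y)
c0 <- exp_cov(D[obs, new, drop = FALSE])      # Cov(y, y0), n x m
sigma0_sq <- de + ie                          # Var(y0)

# Intercept-only mean; simulate a response with covariance Sigma
X <- matrix(1, n, 1)
X0 <- matrix(1, m, 1)
y <- 10 + drop(t(chol(Sigma)) %*% rnorm(n))

# Generalized least squares estimate of the fixed effects
Sigma_inv_X <- solve(Sigma, X)
cov_beta <- solve(crossprod(X, Sigma_inv_X))
beta_hat <- cov_beta %*% crossprod(Sigma_inv_X, y)

# BLUP: estimated mean plus the Kriged residual
Sigma_inv_c0 <- solve(Sigma, c0)
fit <- drop(X0 %*% beta_hat + crossprod(Sigma_inv_c0, y - X %*% beta_hat))

# Prediction variance, including the cost of estimating beta
d <- X0 - crossprod(c0, Sigma_inv_X)
var_pred <- sigma0_sq - colSums(c0 * Sigma_inv_c0) +
  rowSums((d %*% cov_beta) * d)
se <- sqrt(var_pred)

# 95% prediction intervals (normal quantile assumed)
z <- qnorm(0.975)
pred <- cbind(fit = fit, lwr = fit - z * se, upr = fit + z * se)
pred
```

The "empirical" qualifier in the text refers to plugging estimated covariance parameters into these formulas in place of the known values used in this sketch.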

For splmRF or spautorRF objects, random forest spatial residual model predictions are computed by combining the random forest prediction with the (empirical) best linear unbiased prediction for the residual. Fox et al. (2020) call this approach random forest regression Kriging. For splmRF_list or spautorRF_list objects, predictions are returned for each list element.

Value

For splm or spautor objects, if se.fit is FALSE, predict() returns a vector of predictions if interval is "none", or a matrix of predictions with column names fit, lwr, and upr if interval is "confidence" or "prediction". If se.fit is TRUE, a list with the following components is returned:

fit: vector or matrix as above
se.fit: standard error of each fit
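
The return structure described above can be inspected directly. This sketch assumes the `sulfate` and `sulfate_preds` data sets from the Examples section; `pr` is an illustrative name.

```r
library(spmodel)

spmod <- splm(sulfate ~ 1, data = sulfate, spcov_type = "exponential")

# se.fit = TRUE returns a list rather than a bare vector or matrix
pr <- predict(spmod, sulfate_preds, interval = "prediction", se.fit = TRUE)

names(pr)        # components of the returned list
head(pr$fit)     # matrix with columns fit, lwr, and upr
head(pr$se.fit)  # one standard error per prediction site
```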

For splm_list or spautor_list objects, a list that contains relevant quantities for each list element.

For splmRF or spautorRF objects, a vector of predictions. For splmRF_list or spautorRF_list objects, a list that contains relevant quantities for each list element.

References

Fox, E. W., Ver Hoef, J. M., & Olsen, A. R. (2020). Comparing spatial regression to random forests for large environmental data sets. PLoS ONE, 15(3), e0229509.

Examples

spmod <- splm(sulfate ~ 1,
  data = sulfate,
  spcov_type = "exponential", xcoord = x, ycoord = y
)
predict(spmod, sulfate_preds)
predict(spmod, sulfate_preds, interval = "prediction")
augment(spmod, newdata = sulfate_preds, interval = "prediction")

sulfate$var <- rnorm(NROW(sulfate)) # add noise variable
sulfate_preds$var <- rnorm(NROW(sulfate_preds)) # add noise variable
sprfmod <- splmRF(sulfate ~ var, data = sulfate, spcov_type = "exponential")
predict(sprfmod, sulfate_preds)
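
The examples above cover `splm()` and `splmRF()` only. The following sketch shows the same workflow for a spatial generalized linear model and for an autoregressive model with `newdata` omitted. It assumes the `moose`, `moose_preds`, and `seal` data sets that ship with spmodel, and that `seal$log_trend` contains missing values; object names are illustrative.

```r
library(spmodel)

# Binomial spatial generalized linear model for moose presence
spgmod <- spglm(presence ~ elev,
  family = "binomial",
  data = moose,
  spcov_type = "exponential"
)
pred_link <- predict(spgmod, moose_preds)                     # link scale (default)
pred_resp <- predict(spgmod, moose_preds, type = "response")  # probability scale

# Spatial autoregressive model; with newdata omitted, predictions are
# made for the rows whose response is missing in the fitted data
spamod <- spautor(log_trend ~ 1, data = seal, spcov_type = "car")
pred_missing <- predict(spamod)
```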

[Package spmodel version 0.7.0 Index]