Bspatial {bmstdr}R Documentation

Bayesian regression model fitting for point referenced spatial data. Calculates parameter estimates, validation statistics, and estimated values of several Bayesian model choice criteria.

Description

Bayesian regression model fitting for point referenced spatial data. Calculates parameter estimates, validation statistics, and estimated values of several Bayesian model choice criteria.

Usage

Bspatial(
  formula,
  data,
  package = "none",
  model = "lm",
  coordtype = NULL,
  coords = NULL,
  validrows = NULL,
  scale.transform = "NONE",
  prior.beta0 = 0,
  prior.M = 1e-04,
  prior.sigma2 = c(2, 1),
  prior.tau2 = c(2, 0.1),
  phi = NULL,
  prior.phi.param = NULL,
  prior.range = c(1, 0.5),
  prior.sigma = c(1, 0.005),
  offset = c(10, 140),
  max.edge = c(50, 1000),
  cov.model = "exponential",
  N = 5000,
  burn.in = 1000,
  rseed = 44,
  n.report = 500,
  no.chains = 1,
  ad.delta = 0.99,
  s.size = 0.01,
  t.depth = 15,
  verbose = TRUE,
  plotit = TRUE,
  mchoice = FALSE,
  ...
)

Arguments

formula

An object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted.

data

The data frame for which the model formula is to be fitted. If a spatial model is to be fitted then the data frame should contain two columns containing the locations of the coordinates. See the coords argument below.

package

Which package is to be used in model fitting? Currently available packages are:

  • "spBayes": The model implemented is the marginal model with nugget effect using the spLM function.

  • "stan": The model implemented is the full spatial random effect model with nugget effect where the decay parameter has been assumed to be fixed.

  • "inla": The model fitted is the spatial random effect model with the nugget effect.

  • "none": In this case case, the argument model must be specified either as "lm" or "spat". See below.

Further details and more examples are provided in Chapter 6 of the book Sahu (2022).

model

Only used when the package has been chosen to be "none". It can take one of two values: either "lm" or "spat". The "lm" option is for an independent error regression model while the "spat" option fits a spatial model without any nugget effect.

coordtype

Type of coordinates: utm, lonlat or plain with utm (supplied in meters) as the default. Distance will be calculated in units of kilometer if this argument is either utm or lonlat. Euclidean distance will be calculated if this is given as the third type plain. If distance in meter is to be calculated then coordtype should be passed on as plain although the coords are supplied in UTM.

coords

A vector of size two identifying the two column numbers of the data frame to take as coordinates. Or this can be given as a matrix of number of sites by 2 providing the coordinates of all the data locations.

validrows

A vector of site indices which should be used for validation. This function does not allow some sites to be used for both fitting and validation. The remaining observations will be used for model fitting. The default NULL value instructs that validation will not be performed.

scale.transform

Transformation of the response variable. It can take three values: SQRT, LOG or NONE.

prior.beta0

A scalar value or a vector providing the prior mean for beta parameters.

prior.M

Prior precision value (or matrix) for beta. Defaults to a diagonal matrix with diagonal values 10^(-4).

prior.sigma2

Shape and scale parameter value for the gamma prior on 1/sigma^2, the precision.

prior.tau2

Shape and scale parameter value for the gamma prior on tau^2, the nugget effect.

phi

The spatial decay parameter for the exponential covariance function. Only used if the package is Stan or the model is a spatial model "spat" without nugget effect when the package is "none".

prior.phi.param

Lower and upper limits of the uniform prior distribution for \phi, the spatial decay parameter when the package is spBayes. If this is not specified the default values are chosen so that the effective range is uniformly distributed between 25% and 100% of the maximum distance between data locations.

prior.range

A length 2 vector, with (range0, Prange) specifying that P(\rho < \rho_0)=p_{\rho}, where \rho is the spatial range of the random field. If Prange is NA, then range0 is used as a fixed range value. If this parameter is unspecified then range0=0.90 * maximum distance and Prange =0.95. If instead a single value is specified then the range is set at the single value.

prior.sigma

A length 2 vector, with (sigma0, Psigma) specifying that P(\sigma > \sigma_0)=p_{\sigma}, where \sigma is the marginal standard deviation of the field. If Psigma is NA, then sigma0 is used as a fixed range value.

offset

Only used in INLA based modeling. Offset parameter. See documentation for inla.mesh.2d.

max.edge

Only used in INLA based modeling. See documentation for inla.mesh.2d.

cov.model

Only relevant for the spBayes package. Default is the exponential model. See the documentation for spLM in the package spBayes.

N

MCMC sample size. Default value 5000.

burn.in

How many initial iterations to discard. Default value 1000. Only relevant for MCMC based model fitting, i.e., when package is spBayes or Stan.

rseed

Random number seed that controls the starting point for the random number stream. A set value is required to help reproduce the results.

n.report

How many times to report in MCMC progress. This is used only when the package is spBayes.

no.chains

Number of parallel chains to run in Stan.

ad.delta

Adaptive delta controlling the behavior of Stan during fitting.

s.size

step size in the fitting process of Stan.

t.depth

Maximum allowed tree depth in the fitting process of Stan.

verbose

Logical scalar value: whether to print various estimates and statistics.

plotit

Logical scalar value: whether to plot the predictions against the observed values.

mchoice

Logical scalar value: whether model choice statistics should be calculated.

...

Any additional arguments that may be passed on to the fitting package.

Value

A list containing:

See Also

Bsptime for Bayesian spatio-temporal model fitting.

Examples


N <- 10
burn.in <- 5
n.report <- 2
a <- Bspatial(formula = mpg ~ wt, data = mtcars, package = "none", model = "lm",
    N = N)
summary(a)
plot(a)
print(a)
b <- Bspatial(formula = mpg ~ disp + wt + qsec + drat, data = mtcars,
    validrows = c(8, 11, 12, 14, 18, 21, 24, 28), N = N)
#' print(b)
summary(b)
## Illustration with the nyspatial data set
head(nyspatial)
## Linear regression model fitting
M1 <- Bspatial(formula = yo3 ~ xmaxtemp + xwdsp + xrh, data = nyspatial,
    mchoice = TRUE, N = N)
print(M1)
plot(M1)
a <- residuals(M1)
summary(M1)
## Spatial model fitting
M2 <- Bspatial(model = "spat", formula = yo3 ~ xmaxtemp + xwdsp +
    xrh, data = nyspatial, coordtype = "utm", coords = 4:5, phi = 0.4,
    mchoice = TRUE, N = N)
names(M2)
print(M2)
plot(M2)
a <- residuals(M2)
summary(M2)
## Fit model 2 on the square root scale
M2root <- Bspatial(model = "spat", formula = yo3 ~ xmaxtemp + xwdsp + xrh,
    data = nyspatial, coordtype = "utm", coords = 4:5, scale.transform = "SQRT")
summary(M2root)
## Spatial model fitting using spBayes
M3 <- Bspatial(package = "spBayes", formula = yo3 ~ xmaxtemp + xwdsp + xrh,
    data = nyspatial, coordtype = "utm", coords = 4:5, prior.phi = c(0.005,
        2), mchoice = TRUE, N = N, burn.in = burn.in, n.report = n.report)
summary(M3)

# Spatial model fitting using stan (with a small number of iterations)
M4 <- Bspatial(package = "stan", formula = yo3 ~ xmaxtemp + xwdsp + xrh,
    data = nyspatial, coordtype = "utm", coords = 4:5, phi = 0.4, N = N,
    burn.in = burn.in, mchoice = TRUE)
summary(M4)


## K fold cross-validation for M2 only
set.seed(44)
x <- runif(n = 28)
u <- order(x)
# Here are the four folds
s1 <- u[1:7]
s2 <- u[8:14]
s3 <- u[15:21]
s4 <- u[22:28]
summary((1:28) - sort(c(s1, s2, s3, s4)))  ## check
v1 <- Bspatial(model = "spat", formula = yo3 ~ xmaxtemp + xwdsp + xrh,
    data = nyspatial, coordtype = "utm", coords = 4:5, validrows = s1,
    phi = 0.4, N = N)
v2 <- Bspatial(model = "spat", formula = yo3 ~ xmaxtemp + xwdsp + xrh,
    data = nyspatial, coordtype = "utm", coords = 4:5, validrows = s2,
    phi = 0.4, N = N)
v3 <- Bspatial(model = "spat", formula = yo3 ~ xmaxtemp + xwdsp + xrh,
    data = nyspatial, coordtype = "utm", coords = 4:5, validrows = s3,
    phi = 0.4, N = N)
v4 <- Bspatial(model = "spat", formula = yo3 ~ xmaxtemp + xwdsp + xrh,
    data = nyspatial, coordtype = "utm", coords = 4:5, validrows = s4,
    phi = 0.4, N = N)
M2.val.table <- cbind(unlist(v1$stats), unlist(v2$stats), unlist(v3$stats),
    unlist(v4$stats))
dimnames(M2.val.table)[[2]] <- paste("Fold", 1:4, sep = "")
round(M2.val.table, 3)

## Model validation
s <- c(1, 5, 10)
M1.v <- Bspatial(model = "lm", formula = yo3 ~ xmaxtemp + xwdsp + xrh,
    data = nyspatial, coordtype = "utm", coords = 4:5, validrows = s, N = N,
    burn.in = burn.in)
M2.v <- Bspatial(model = "spat", formula = yo3 ~ xmaxtemp + xwdsp + xrh,
    data = nyspatial, coordtype = "utm", coords = 4:5, validrows = s, phi = 0.4,
    N = N, burn.in = burn.in)
M3.v <- Bspatial(package = "spBayes", formula = yo3 ~ xmaxtemp + xwdsp +
    xrh, data = nyspatial, coordtype = "utm", coords = 4:5, validrows = s,
    prior.phi = c(0.005, 2), n.report = 2, N = N, burn.in = burn.in)
# Collect all the results
Mall.table <- cbind(unlist(M1.v$stats), unlist(M2.v$stats), unlist(M3.v$stats))
colnames(Mall.table) <- paste("M", c(1:3), sep = "")
round(Mall.table, 3)


if (require(INLA) & require(inlabru)) {
    N <- 10
    burn.in <- 5
    # Spatial model fitting using INLA
    M5 <- Bspatial(package = "inla", formula = yo3 ~ xmaxtemp + xwdsp + xrh,
        data = nyspatial, coordtype = "utm", coords = 4:5, mchoice = TRUE,
        N = N, burn.in = burn.in)
    summary(M5)
}


[Package bmstdr version 0.7.9 Index]