metropolis.krige {krige} | R Documentation |
Sampling Technique Using Metropolis-Hastings
Description
This function performs Metropolis-Hastings sampling for a linear model specified over point-referenced geospatial data. It returns MCMC iterations, with which results of the geospatial linear model can be summarized.
Usage
metropolis.krige(
formula,
coords,
data,
n.iter = 100,
powered.exp = 2,
n.burnin = 0,
y,
X,
east,
north,
na.action = "na.fail",
spatial.share = 0.5,
range.share = 0.5,
beta.var = 10,
range.tol = 0.05,
b.tune = 1,
nugget.tune = 10,
psill.tune = 1,
distance.matrix = FALSE,
progress.bar = "message",
accept.rate.warning = TRUE
)
Arguments
formula |
An object of class |
coords |
A matrix of coordinates for all observations or a vector of variable
names indicating the coordinates variables in the data. Alternatively, the
coordinates can also be specified seperately using |
data |
An data frame containing the variables in the model. |
n.iter |
Number of MCMC iterations (defaults to 100). |
powered.exp |
This exponent, which must be greater than 0 and less than or equal to 2, specifies a powered exponential correlation structure for the data. One widely used specification is setting this to 1, which yields an exponential correlation structure. Another common specification is setting this to 2 (the default), which yields a Gaussian correlation structure. |
n.burnin |
Number of iterations that will be discarded for burnin (warmup).
The number of burnin should not be larger than |
y |
Alternative specification for the outcome variable that is used in the kriging model. If formula is used, this argument will be suppressed. |
X |
Alternative specification for the matrix of explanatory variables used in the kriging model. Different forms of the variables such as transformations and interactions also need to be specified accordingly beforehand. |
east |
Alternative specification for the vector of eastings for all observations. |
north |
Alternative specification for the vector of northing for all observations. |
na.action |
A function which indicates what should happen when the data contain NAs. The default is "na.fail." Another possible value is "na.omit." |
spatial.share |
Prior for proportion of unexplained variance that is spatial in nature. Must be greater than 0 and less than 1. Defaults to an even split, valued at 0.5. |
range.share |
Prior for the effective range term, as a proportion of the maximum distance in the data. Users should choose the proportion of distance at which they think the spatial correlation will become negligible. Must be greater than 0. Values greater than 1 are permitted, but users should recognize that this implies that meaningful spatial correlation would persist outside of the convex hull of data. Defaults to half the maximum distance, valued at 0.5. |
beta.var |
Prior for the variance on zero-meaned normal priors on the regression coefficients. Must be greater than 0. Defaults to 10. |
range.tol |
Tolerance term for setting the effective range. At the distance where the spatial correlation drops below this term, it is judged that the effective range has been met. The default value is the commonly-used 0.05. Must be greater than 0 and less than 1. |
b.tune |
Tuning parameter for candidate generation of regression coefficients that must be greater than 0. A value of 1 means that draws will be based on the variance-covariance matrix of coefficients from OLS. Larger steps are taken for values greater than 1, and smaller steps are taken for values from 0 to 1. Defaults to 1.0. |
nugget.tune |
Tuning parameter for candidate generation of the nugget term
( |
psill.tune |
Tuning parameter for candidate generation of the partial sill
term ( |
distance.matrix |
Logical value indicates whether to save the distance matrix
in the output object. Saving distance matrix can save time for furthur use such as
in |
progress.bar |
Types of progress bar. The default is "message" and will report variance terms. Other possible values are "TRUE" (simple percentage) and "FALSE" (suppress the progress bar). |
accept.rate.warning |
Logical values indicating whether to show the warnings
when the acceptance rates are too high or too low. Defaults to |
Details
Analysts should use this function if they want to estimate a linear regression model in which each observation can be located at points in geographic space. That is, each observation is observed for a set of coordinates in eastings & northings or longitude & latitude.
Researchers must specify their model in the following manner: formula
should be a symbolic description of the model to be fitted; it is similar to
R
model syntax as used in lm()
. In addition, a matrix of
coordinates must be specified for the geospatial model in coords
. coords
should be a matrix with two columns that specify west-east and north-south
coordinates, respectively (ideally eastings and northings but possibly longitude
and latitude). It can also be a vector of strings indicating the variables names
of the coordinates in the data
. data
should be a data frame
containing the variables in the model including both the formula and coordinates
(if only the names are provided). Alternatively, users can also specify the
variables using y
, X
, east
, and north
for outcome,
explanatory, west-east coordinates, and north-south coordinates variables,
respectively. This alternative specification is compatible with the one used
in the early version of this package.
n.iter
is the number of iterations to sample from the posterior distribution
using the Metropolis-Hastings algorithm. This defaults to 100 iterations, but
many more iterations would normally be preferred. n.burnin
is set to 0
by default to preserve all the iterations since the kriging model usually takes
a relatively long time to run. Users can set a number for burnin or use burnin
function afterwards to discard the burnin period. The output of the function
prints the proportion of candidate values for the coefficients and for the
variance terms accepted by the Metropolis-Hastings algorithm. Particularly
low or high acceptance rates respectively may indicate slow mixing (requiring
more iterations) or a transient state (leading to nonconvergence), so additional
messages will print for extreme acceptance rates. Users may want to adjust the
tuning parameters b.tune
, nugget.tune
, or psill.tune
,
or perhaps the tolerance parameter range.tol
if the acceptance rate
is too high or too low.
The function returns a "krige" list object including the output MCMC matrix
of sampled values from the posterior distribution as well as the record of
function arguments, model frame, acceptance rates, and standing parameters.
Users can use the generic summary
function to summarize the results or
extract the elements of the object for further use.
Value
An object of class krige
that includes the output MCMC matrix
of sampled values from the posterior distribution as well as the record of
function arguments, model frame, acceptance rates, and standing parameters.
References
Jeff Gill. 2020. Measuring Constituency Ideology Using Bayesian Universal Kriging.
State Politics & Policy Quarterly. doi:10.1177/1532440020930197
Examples
## Not run:
# Summarize example data
summary(ContrivedData)
# Initial OLS model
contrived.ols<-lm(y~x.1+x.2,data=ContrivedData)
# summary(contrived.ols)
# Set seed
set.seed(1241060320)
#For simple illustration, we set to few iterations.
#In this case, a 10,000-iteration run converges to the true parameters.
#If you have considerable time and hardware, delete the # on the next line.
#10,000 iterations took 39 min. with 8 GB RAM & a 1.5 GHz Quad-Core processor.
M <- 100
#M<-10000
contrived.run <- metropolis.krige(y ~ x.1 + x.2, coords = c("s.1","s.2"),
data = ContrivedData, n.iter = M, n.burnin=20, range.tol = 0.05)
# Alternatively, use burnin() after estimation
#contrived.run <- burnin(contrived.run, n.burnin=20)
# Summarize the results and examine results against true coefficients
summary(contrived.run)
(TRUTH<-c(0.5,2.5,0.5,0,1,2))
# Extract the MCMC matrix of the posterior distribution
contrived.run.mat <- mcmc.samples(contrived.run)
head(contrived.run.mat)
# Diagnostics
geweke(contrived.run, early.prop=0.5)
heidel.welch(contrived.run)
# Semivariogram
### Semivariance
semivariance(contrived.run)
### Plot semivariogram
semivariogram(contrived.run)
### Alternatively, use generic plot() on a krige object
plot(contrived.run)
## End(Not run)