gwr.sel {spgwr} | R Documentation |
Crossvalidation of bandwidth for geographically weighted regression
Description
The function finds a bandwidth for a given geographically weighted regression by optimzing a selected function. For cross-validation, this scores the root mean square prediction error for the geographically weighted regressions, choosing the bandwidth minimizing this quantity.
Usage
gwr.sel(formula, data=list(), coords, adapt=FALSE, gweight=gwr.Gauss,
method = "cv", verbose = TRUE, longlat=NULL, RMSE=FALSE, weights,
tol=.Machine$double.eps^0.25, show.error.messages = FALSE)
Arguments
formula |
regression model formula as in |
data |
model data frame as in |
coords |
matrix of coordinates of points representing the spatial positions of the observations |
adapt |
either TRUE: find the proportion between 0 and 1 of observations to include in weighting scheme (k-nearest neighbours), or FALSE — find global bandwidth |
gweight |
geographical weighting function, at present
|
method |
default "cv" for drop-1 cross-validation, or "aic" for AIC optimisation (depends on assumptions about AIC degrees of freedom) |
verbose |
if TRUE (default), reports the progress of search for bandwidth |
longlat |
TRUE if point coordinates are longitude-latitude decimal degrees, in which case distances are measured in kilometers; if x is a SpatialPoints object, the value is taken from the object itself |
RMSE |
default FALSE to correspond with CV scores in newer references (sum of squared CV errors), if TRUE the previous behaviour of scoring by LOO CV RMSE |
weights |
case weights used as in weighted least squares, beware of scaling issues — only used with the cross-validation method, probably unsafe |
tol |
the desired accuracy to be passed to |
show.error.messages |
default FALSE; may be set to TRUE to see error messages if |
Details
If the regression contains little pattern, the bandwidth will converge to the upper bound of the line search, which is the diagonal of the bounding box of the data point coordinates for “adapt=FALSE”, and 1 for “adapt=TRUE”; see the simulation block in the examples below.
Value
returns the cross-validation bandwidth.
Note
Use of method="aic" results in the creation of an n by n matrix, and should not be chosen when n is large.
Author(s)
Roger Bivand Roger.Bivand@nhh.no
References
Fotheringham, A.S., Brunsdon, C., and Charlton, M.E., 2002, Geographically Weighted Regression, Chichester: Wiley; Paez A, Farber S, Wheeler D, 2011, "A simulation-based study of geographically weighted regression as a method for investigating spatially varying relationships", Environment and Planning A 43(12) 2992-3010; http://gwr.nuim.ie/
See Also
Examples
data(columbus, package="spData")
gwr.sel(CRIME ~ INC + HOVAL, data=columbus,
coords=cbind(columbus$X, columbus$Y))
gwr.sel(CRIME ~ INC + HOVAL, data=columbus,
coords=cbind(columbus$X, columbus$Y), gweight=gwr.bisquare)
## Not run:
data(georgia)
set.seed(1)
X0 <- runif(nrow(gSRDF)*3)
X1 <- matrix(sample(X0), ncol=3)
X1 <- prcomp(X1, center=FALSE, scale.=FALSE)$x
gSRDF$X1 <- X1[,1]
gSRDF$X2 <- X1[,2]
gSRDF$X3 <- X1[,3]
yrn <- rnorm(nrow(gSRDF))
gSRDF$yrn <- sample(yrn)
bw <- gwr.sel(yrn ~ X1 + X2 + X3, data=gSRDF, method="cv", adapt=FALSE, verbose=FALSE)
bw
bw <- gwr.sel(yrn ~ X1 + X2 + X3, data=gSRDF, method="aic", adapt=FALSE, verbose=FALSE)
bw
bw <- gwr.sel(yrn ~ X1 + X2 + X3, data=gSRDF, method="cv", adapt=TRUE, verbose=FALSE)
bw
bw <- gwr.sel(yrn ~ X1 + X2 + X3, data=gSRDF, method="aic", adapt=TRUE, verbose=FALSE)
bw
## End(Not run)