GeoDistPSU {PracTools} | R Documentation |
Form PSUs based on geographic distances
Description
Combine geographic areas into primary sampling units to limit travel distances
Usage
GeoDistPSU(lat, long, dist.sw, max.dist, Input.ID = NULL)
Arguments
lat |
latitude variable in an input file. Must be in decimal format. |
long |
longitude variable in an input file. Must be in decimal format. |
dist.sw |
units for distance; either |
max.dist |
maximum distance allowed within a PSU between centroids of geographic units |
Input.ID |
ID field in the input file if present |
Details
GeoDistPSU
combines geographic secondary sampling units (SSUs), like cities or census block groups, into primary sampling units (PSUs) given a maximum distance allowed between the centroids of the SSUs within each grouped PSU. The input file must have one row for each geographic unit. If the input file does not have an ID field, the function will create a sequential ID that is appended to the output. The latitude and longitude input vectors define the centroid of each input SSU. The complete linkage method for clustering is used. GeoDistPSU
calls the functions distm
and distHaversine
from the geosphere
package to calculate the distances between centroids.
Value
A list with two components:
PSU.ID |
A data frame with the same number of rows as the input file. Column names are |
PSU.Info |
A data frame with the number of rows equal to the number of PSUs that are created. Column names are |
.
Author(s)
George Zipf, Richard Valliant
See Also
Examples
data(Test_Data_US)
g <- GeoDistPSU(Test_Data_US$lat,
Test_Data_US$long,
"miles", 100,
Input.ID = Test_Data_US$ID)
# Plot GeoDistPSU output
plot(g$PSU.Info$PSU.Mean.Longitude,
g$PSU.Info$PSU.Mean.Latitude,
col = 1:nrow(g$PSU.Info),
pch = 19,
main = "Plot of PSU Centers",
xlab = "Longitude",
ylab = "Latitude")
grid(col = "grey40")
# Plot GeoDistPSU output with map
## Not run:
# install package sf to run usmap_transform
library(ggplot2)
library(sp)
library(usmap)
# Transform PSUs into usmap projection
g.map <- cbind(long = g$PSU.Info$PSU.Mean.Longitude,
lat = g$PSU.Info$PSU.Mean.Latitude)
g.map <- as.data.frame(g.map)
g.proj <- usmap::usmap_transform(g.map,
input_names = c("long", "lat"),
output_names = c("Long", "Lat"))
usmap::plot_usmap(color = "gray") +
geom_point(data = g.proj,
aes(x = Long,
y = Lat))
# Create histogram of maximum distance
hist(g$PSU.Info$PSU.Max.Dist,
main = "Histogram of Maximum Within-PSU Distance",
xlab = "Distance",
ylab = "Frequency")
## End(Not run)