GeoDistMOS {PracTools}    R Documentation

Split geographic PSUs based on a measure of size threshold


Description

Split geographic PSUs into new, geographically contiguous PSUs based on a maximum measure of size for each PSU.


Usage

   GeoDistMOS(lat, long, psuID, n, MOS.var, MOS.takeall = 1, Input.ID = NULL)



Arguments

lat
Latitude variable in an input file. Must be in decimal format.

long
Longitude variable in an input file. Must be in decimal format.

psuID
PSU cluster ID from an input file.

n
Sample size of PSUs; may be a preliminary value used in the computation to identify certainty PSUs.

MOS.var
Variable used for probability proportional to size sampling.

MOS.takeall
Threshold relative measure of size value for certainties; must satisfy 0 < MOS.takeall <= 1.

Input.ID
ID variable from the input file.
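To make the roles of n and MOS.var concrete, here is a minimal base-R sketch of probability proportional to size inclusion probabilities with certainty units. The helper name pps_incl_prob and the toy data are hypothetical and not part of PracTools; the loop follows the usual rule of setting any unit whose probability reaches 1 to a certainty and recomputing the rest with the remaining sample size, which is not necessarily the exact code path GeoDistMOS uses.

```r
# Hypothetical helper (not part of PracTools): PPS inclusion probabilities
# for a sample of size n, with iterative handling of certainty units.
pps_incl_prob <- function(mos, n) {
  stopifnot(all(mos >= 0), n > 0, n <= sum(mos > 0))
  prob <- numeric(length(mos))
  certain <- rep(FALSE, length(mos))
  repeat {
    # Spread the remaining sample size over the non-certainty units
    n_left <- n - sum(certain)
    prob[!certain] <- n_left * mos[!certain] / sum(mos[!certain])
    # Units whose probability reaches 1 become certainties
    new_certain <- !certain & prob >= 1
    if (!any(new_certain)) break
    certain <- certain | new_certain
    prob[certain] <- 1
  }
  prob
}

# Toy measures of size for five units (made-up values)
mos <- c(50, 30, 10, 5, 5)
p <- pps_incl_prob(mos, n = 3)
print(p)
```

With these toy values the two largest units become certainties and the probabilities sum to n. A MOS.takeall value below 1 presumably lowers the cutoff at which a unit is treated as a certainty; that reading is an assumption and is not modeled in the sketch.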


Details

GeoDistMOS splits geographic primary sampling units (PSUs) in the input object based on the variable used to create the measure of size for each PSU (MOS.var). The goal is to create PSUs with similar measures of size. The input file should have one row for each geographic unit, i.e., secondary sampling unit (SSU), with a PSU ID assigned. The latitude and longitude input vectors define the centroid of each input SSU.

The complete linkage method is used for clustering. Accordingly, PSUs are split on a distance metric and not on the MOS threshold value. GeoDistMOS calls the function inclusionprobabilities from the sampling package to calculate the inclusion probability for each SSU within a PSU, and distHaversine from the geosphere package to calculate the distances between centroids.
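The distance-based clustering described above can be illustrated with a small base-R sketch. It is only an illustration under stated assumptions: the haversine_miles helper, the Earth radius of 3958.8 miles, the toy centroids, and the 10-mile cut height are all hypothetical, and the sketch does not reproduce how GeoDistMOS decides where to cut. It shows that complete linkage groups SSUs purely by the distances between their centroids.

```r
# Hypothetical helper: Haversine great-circle distance in miles between
# points given in decimal degrees (radius is an assumed mean Earth radius).
haversine_miles <- function(lat1, lon1, lat2, lon2, radius = 3958.8) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))
}

# Toy SSU centroids (made-up coordinates forming two well-separated groups)
ssu <- data.frame(
  id   = c("a", "b", "c", "d"),
  lat  = c(38.90, 38.95, 39.29, 39.33),
  long = c(-77.04, -77.00, -76.61, -76.60)
)

# Pairwise distance matrix between centroids
idx <- seq_len(nrow(ssu))
d <- outer(idx, idx, function(i, j) {
  haversine_miles(ssu$lat[i], ssu$long[i], ssu$lat[j], ssu$long[j])
})

# Complete linkage: the merge height is the largest pairwise distance in
# the merged cluster, so cutting at h bounds the within-cluster distance by h.
tree <- hclust(as.dist(d), method = "complete")
ssu$cluster <- cutree(tree, h = 10)
print(ssu)
```

Because the cut is made on distance alone, the resulting groups can still differ in their total measure of size, which is consistent with the note that PSUs are split on a distance metric rather than on the MOS threshold value.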


Value

A list with two components:


A data frame containing the SSU ID value in character format (Input.ID), the original PSU ID (psuID.orig), and the new PSU ID after splitting for the maximum measure of size.


A data frame containing the new PSU ID after splitting for the maximum measure of size, the inclusion probability of the PSU ID given the input sample size n (psuID.prob), the measure of size of the new PSU (MOS), the number of SSUs in the new PSU ID (Number.SSUs), and the means of the latitudes and longitudes of the SSUs that were combined to form the new PSU (PSU.Mean.Latitude and PSU.Mean.Longitude).


Author(s)

George Zipf, Richard Valliant

See Also

GeoDistPSU, GeoMinMOS



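Examples

   # Create PSU ID with GeoDistPSU
   # (long, dist.sw and max.dist are missing from the source; the values below are assumed)
g <- GeoDistPSU(Test_Data_US$lat, Test_Data_US$long,
                dist.sw = "miles", max.dist = 100,
                Input.ID = Test_Data_US$ID)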
   # Append PSU ID to input file
Test_Data_US <- dplyr::inner_join(Test_Data_US, g$PSU.ID, by=c("ID" = "Input.file.ID"))

   # Split PSUs with MOS above 0.80
m <- GeoDistMOS(lat         = Test_Data_US$lat,
                long        = Test_Data_US$long,
                psuID       = Test_Data_US$psuID,
                n           = 15,
                MOS.var     = Test_Data_US$Amount,
                MOS.takeall = 0.80,
                Input.ID    = Test_Data_US$ID)

   # Create histogram of PSU inclusion probabilities
   # (psuID.prob is in the second component of the returned list)
hist(m[[2]]$psuID.prob,
     breaks = seq(0, 1, 0.1),
     main = "Histogram of PSU Inclusion Probabilities (Certainties = 1)",
     xlab = "Inclusion Probability",
     ylab = "Frequency")

[Package PracTools version 1.5 Index]