| xwalk_ags {ags} | R Documentation |
Crosswalk Municipality or District Statistics
Description
This function constructs time series of counts for Germany's municipalities (Gemeinden) and districts (Kreise).
Usage
xwalk_ags(
data,
ags,
time,
xwalk,
variables = NULL,
strata = NULL,
weight = NULL,
fuzzy_time = FALSE,
verbose = TRUE
)
Arguments
data |
A data frame or a data frame extension (e.g. a tibble). |
ags |
Name of the character variable (quoted) with municipality AGS (Gemeinden, 8 digits) or district AGS (Kreise, 5 digits). |
time |
Name of the variable (quoted) identifying the year (YYYY format). Values will be coerced to integers. |
xwalk |
Name of the crosswalk. The following crosswalks are available:
|
variables |
Either a vector of names (quoted) for
variables to interpolate or |
strata |
Vector of variable names (quoted) or |
weight |
Name of the interpolation weight or
|
fuzzy_time |
If |
verbose |
If |
Details
This function facilitates the use of crosswalks constructed by the BBSR for municipalities and districts in Germany (Milbert 2010). The crosswalks map one year's set of district/municipality identifiers to a later year's identifiers and provide weights for area- or population-weighted interpolation.
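The idea behind weighted interpolation can be sketched in base R with a toy crosswalk. This is an illustration only, not the package's internal code: the data frames, the unit labels, and the column name pop_conv are all made up for the example.

```r
# Hypothetical toy crosswalk: old units A and B merge into X,
# old unit C is split between Y and Z.
xwalk_toy <- data.frame(
  ags_old  = c("A", "B", "C", "C"),
  ags_new  = c("X", "X", "Y", "Z"),
  pop_conv = c(1, 1, 0.25, 0.75),  # assumed share of the old unit's population
  stringsAsFactors = FALSE
)

# Hypothetical counts reported for the old units
counts <- data.frame(
  ags_old = c("A", "B", "C"),
  voters  = c(100, 200, 400),
  stringsAsFactors = FALSE
)

# Join each old unit to its successor(s) and scale the count by the weight
merged <- merge(counts, xwalk_toy, by = "ags_old")
merged$voters_w <- merged$voters * merged$pop_conv

# Sum the weighted counts within each new unit
result <- aggregate(voters_w ~ ags_new, data = merged, FUN = sum)
result  # X: 300, Y: 100, Z: 300
```

Because the weights of each old unit sum to 1 in this toy example, the total count is preserved by the interpolation.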
All data rows with NAs in either the ags or time
variable are excluded. The same applies to all rows with a value in
ags or time that never appears in the crosswalk.
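To see in advance which rows would be excluded under this rule, a check along the following lines may help. It is a rough sketch: the vectors known_ags and known_years stand in for the identifiers and years of a crosswalk and are assumptions, as is the example data.

```r
# Hypothetical input data with missing and unknown values
dat <- data.frame(
  district = c("14511", NA, "14521", "99999"),
  year     = c(2013L, 2013L, NA, 2013L),
  stringsAsFactors = FALSE
)

known_ags   <- c("14511", "14521")  # assumed AGS values present in the crosswalk
known_years <- 2010:2015            # assumed years present in the crosswalk

# Rows with NA in the ags or time variable
has_na <- is.na(dat$district) | is.na(dat$year)

# Rows whose ags or time value never appears in the crosswalk
unknown <- !(dat$district %in% known_ags) | !(dat$year %in% known_years)

dropped <- has_na | unknown
dat[dropped, ]  # rows that would be excluded in this sketch
```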
Fuzzy matching uses the absolute difference between the year reported in the data and a crosswalk year. If there is a tie, crosswalk years from before the year reported in the data are preferred.
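The matching rule described above can be sketched as follows. The helper nearest_xwalk_year() is hypothetical and not part of the package, and the crosswalk years are assumed values chosen to produce a tie.

```r
# Hypothetical helper: pick the crosswalk year closest to a data year;
# on a tie, prefer the earlier crosswalk year.
nearest_xwalk_year <- function(year, xwalk_years) {
  xwalk_years <- sort(unique(xwalk_years))  # ascending, so earlier years come first
  d <- abs(xwalk_years - year)              # absolute difference in years
  xwalk_years[which.min(d)]                 # which.min() returns the first minimum
}

xwalk_years <- c(1990L, 1994L, 1998L)  # assumed crosswalk years

nearest_xwalk_year(1995L, xwalk_years)  # 1994
nearest_xwalk_year(1996L, xwalk_years)  # tie between 1994 and 1998, gives 1994
nearest_xwalk_year(1998L, xwalk_years)  # exact match, 1998
```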
If area- or population-weighted interpolation is requested (i.e., when
variables are supplied), the combination of the variables named
in ags, time and strata must uniquely
identify a row in data.
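Whether the key columns identify rows uniquely can be checked before calling the function, for instance as in this sketch. The data frame and its column names are made up for illustration.

```r
# Hypothetical data in which one (ags, time, strata) combination occurs twice
dat <- data.frame(
  district = c("14511", "14511", "14521", "14521"),
  year     = c(2013L, 2013L, 2013L, 2013L),
  party    = c("a", "b", "a", "a"),
  votes    = c(10, 20, 30, 40),
  stringsAsFactors = FALSE
)

keys <- c("district", "year", "party")  # the ags, time and strata columns

# Flag every row whose key combination occurs more than once
is_dup <- duplicated(dat[keys]) | duplicated(dat[keys], fromLast = TRUE)

dat[is_dup, ]  # offending rows
any(is_dup)    # TRUE here, so these data would need to be aggregated first
```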
Caution: Data from https://www.regionalstatistik.de/ sometimes include
annual values both for merged units (e.g., Städteregion Aachen, 05334) and
for their former parts (Kreis Aachen, 05354 and Stadt Aachen, 05313).
When such data are crosswalked with fuzzy_time=TRUE and
interpolated, the final counts will be off by a factor of approximately 2.
The reason is that the final output is the sum of the interpolated counts
for the parts and the measured count of the merged unit.
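The double counting, and one possible way to avoid it, can be illustrated with made-up numbers. The counts and the year below are invented, and dropping the former parts is only one option, not a recommendation taken from the package.

```r
# Hypothetical counts for a single year in which the source reports both
# the merged unit and its former parts (all numbers are made up)
dat <- data.frame(
  ags   = c("05334", "05354", "05313"),
  year  = 2010L,
  count = c(500, 300, 200),
  stringsAsFactors = FALSE
)

# If the former parts are mapped onto the merged unit and everything is
# summed, the same quantity is counted twice
sum(dat$count)  # 1000, twice the merged unit's count of 500

# One possible remedy: drop the former parts for every year in which the
# merged unit itself is reported
parts       <- c("05354", "05313")
merged_unit <- "05334"
years_with_merged <- dat$year[dat$ags == merged_unit]
keep <- !(dat$ags %in% parts & dat$year %in% years_with_merged)

dat_clean <- dat[keep, ]
sum(dat_clean$count)  # 500
```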
Value
If interpolation is requested, the crosswalked and interpolated
data are returned. If interpolation is not requested, the data matched
with the crosswalk are returned. The following variables are added:
- row_id: row number of data before matching.
- ags[*]: the crosswalked AGS.
- year_xw: the matched year from the crosswalk.
- [*]_conv: the interpolation weight.
- diff: the absolute difference between year_xw and time.
References
Milbert, Antonia. 2010. "Gebietsreformen – politische Entscheidungen und Folgen für die Statistik." BBSR-Berichte kompakt 6/2010. Bundesinstitut für Bau-, Stadt- und Raumforschung.
Examples
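# Load the example data set
data(btw_sn)

# Crosswalk the district identifiers with the "xd20" crosswalk and
# interpolate voters and valid using the pop weight
btw_sn_ags20 <- xwalk_ags(
  data = btw_sn,
  ags = "district",
  time = "year",
  xwalk = "xd20",
  variables = c("voters", "valid"),
  weight = "pop"
)

# Inspect the first rows of the result
head(btw_sn_ags20)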