pop.aggregate {bayesPop} | R Documentation |
Aggregation of Population Projections
Description
Aggregation of existing countries' population projections into projections of given regions, and accessing such aggregations.
Usage
pop.aggregate(pop.pred, regions,
input.type = c("country", "region"), name = input.type,
inputs = list(e0F.sim.dir = NULL, e0M.sim.dir = "joint_", tfr.sim.dir = NULL),
my.location.file = NULL, verbose = FALSE, ...)
get.pop.aggregation(sim.dir = NULL, pop.pred = NULL, name = NULL,
write.to.cache = TRUE)
pop.aggregate.subnat(pop.pred, regions, locations, ..., verbose = FALSE)
Arguments
pop.pred |
Object of class |
regions |
Vector of numerical codes of regions. It should correspond to values in the column “country_code” in the |
input.type |
There are two methods for aggregating projections depending on the type of inputs, “country”- and “region”-based, see Details. |
name |
Name of the aggregation. It becomes a part of a directory name where aggregation results are stored. |
inputs |
This argument is only used when the “region”-based method is selected. It is a list of inputs of probabilistic components of the projection:
|
my.location.file |
User-defined location file that can contain aggregation groups other than those in the default UN location file. It should have the same structure as the |
verbose |
Logical switching log messages on and off. |
sim.dir |
Simulation directory where aggregation is stored. It is the same directory used for creating the |
write.to.cache |
Logical controlling if functions operating on this object are allowed to write into its cache (see Details of |
locations |
Name of a tab-delimited file that contains definitions of the sub-regions. It should be the same file as used for the |
... |
Additional arguments. For a country-type aggregation, it can be logical |
Details
Function pop.aggregate triggers an aggregation over countries, while function pop.aggregate.subnat is used for aggregating sub-regions into a country. The following details refer to the use of pop.aggregate. For sub-national aggregation, see the Example in pop.predict.subnat.
The dataset UNlocations or my.location.file is used to determine the countries to be aggregated, in particular the field “location_type” of the entries with “country_code” given in the regions argument. One can aggregate over the following location types: type 0 means aggregating all countries of the world (or in the file), type 2 means aggregating over continents, type 3 means aggregating over regions within continents, and any other integer (except 4) corresponds to user-defined aggregations. Note that type 4 is reserved as the location type of countries, and thus all aggregations are performed over entries of this type. For type 2, countries are matched using the “area_code” column; for type 3, the matching is done using the “reg_code” column of the UNlocations dataset. E.g., if regions=908 (Europe), which has location type 2 in the default UNlocations dataset, all countries are aggregated for which the value 908 is found in the “area_code” column. If the location type is other than 0, 2, 3 and 4, there must be a column in the file called “agcode_x”, with x being the location type. This column is then used to match the countries to be aggregated.
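The matching rules above can be sketched in base R. This is only an illustration of the described logic, not the package's implementation: the helper countries_in_region and the small location table are hypothetical, and the table contains only a few illustrative rows rather than the real UNlocations dataset.

```r
# Toy location table with a few of the columns described above.
# Values are illustrative; this is NOT the UNlocations dataset.
locs <- data.frame(
  country_code  = c(900, 908, 926, 276, 250, 826, 392),
  name          = c("World", "Europe", "Western Europe",
                    "Germany", "France", "United Kingdom", "Japan"),
  location_type = c(0, 2, 3, 4, 4, 4, 4),
  area_code     = c(-1, -1, 908, 908, 908, 908, 935),
  reg_code      = c(-1, -1, -1, 926, 926, 924, 906),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the codes of the countries (type 4)
# that belong to the given region code.
countries_in_region <- function(locs, region) {
  type <- locs$location_type[locs$country_code == region]
  stopifnot(length(type) == 1, type != 4)
  is.country <- locs$location_type == 4
  # Type 0: all countries in the table
  if (type == 0) return(locs$country_code[is.country])
  # Type 2 -> area_code, type 3 -> reg_code, otherwise agcode_<type>
  match.col <- if (type == 2) "area_code" else
               if (type == 3) "reg_code" else paste0("agcode_", type)
  stopifnot(match.col %in% names(locs))
  locs$country_code[is.country & locs[[match.col]] == region]
}

europe <- countries_in_region(locs, 908)  # matched via area_code
west   <- countries_in_region(locs, 926)  # matched via reg_code
world  <- countries_in_region(locs, 900)  # all entries of type 4
```

In this toy table, europe contains the three European countries, west only Germany and France, and world all four countries.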
Consider the following example. Say we want to pair four countries (Germany [DE], France [FR], Netherlands [NL], Italy [IT]) in two different ways, so we have two overlapping groupings, each of which has two groups (A,B):
group A = (DE, FR), group B = (NL, IT)
group A = (DE, NL), group B = (FR, IT)
Then, my.location.file should have the following entries:
country_code | name | location_type | agcode_98 | agcode_99 |
1001 | grouping1_groupA | 98 | -1 | -1 |
1002 | grouping1_groupB | 98 | -1 | -1 |
1003 | grouping2_groupA | 99 | -1 | -1 |
1004 | grouping2_groupB | 99 | -1 | -1 |
276 | Germany | 4 | 1001 | 1003 |
250 | France | 4 | 1001 | 1004 |
528 | Netherlands | 4 | 1002 | 1003 |
380 | Italy | 4 | 1002 | 1004 |
1005 | all | 0 | -1 | -1 |
The “country_code” of the groups is user-specific, but it must be unique within the file. Values of “country_code” for countries must match those in the prediction object. To run the aggregation for the four groups above, we set regions=1001:1004. Since “location_type” takes the values 98 and 99, the file is expected to have columns “agcode_98” and “agcode_99” containing the assignments to each of the two groupings. Values in these columns for the rows that define the groups are not used and thus can have any value. For aggregating over all four countries, set regions=1005, which has “location_type” equal to 0; the aggregation is thus performed over all entries with “location_type” equal to 4.
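As a sketch of how such a file could be prepared, the following base R code builds the four-country table, writes it to a temporary file and checks the group membership by hand. The tab-delimited layout is an assumption here, and the final pop.aggregate call is left commented out because it needs a prediction object (called pred in the comment) that is not created in this snippet.

```r
# Location table for the two overlapping groupings.
# Countries use UN numeric codes (528 = Netherlands).
my.locs <- data.frame(
  country_code  = c(1001, 1002, 1003, 1004, 276, 250, 528, 380, 1005),
  name          = c("grouping1_groupA", "grouping1_groupB",
                    "grouping2_groupA", "grouping2_groupB",
                    "Germany", "France", "Netherlands", "Italy", "all"),
  location_type = c(98, 98, 99, 99, 4, 4, 4, 4, 0),
  agcode_98     = c(-1, -1, -1, -1, 1001, 1001, 1002, 1002, -1),
  agcode_99     = c(-1, -1, -1, -1, 1003, 1004, 1003, 1004, -1)
)

# Write the table to a file (tab-delimited layout assumed)
loc.file <- tempfile(fileext = ".txt")
write.table(my.locs, loc.file, sep = "\t", quote = FALSE, row.names = FALSE)

# Read it back and list the members of each group via the agcode_x columns
chk <- read.delim(loc.file, stringsAsFactors = FALSE)
members <- lapply(1001:1004, function(g) {
  type <- chk$location_type[chk$country_code == g]
  col  <- paste0("agcode_", type)
  chk$name[chk$location_type == 4 & chk[[col]] == g]
})
names(members) <- 1001:1004

# With a prediction object `pred` covering these four countries, the
# aggregation would then be requested along the lines of:
# aggr <- pop.aggregate(pred, regions = 1001:1004, my.location.file = loc.file)
```

Inspecting members before running the aggregation is a cheap way to confirm that each country landed in the intended group.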
There are two methods available for generating aggregations of population projections:
- Country-based Method
Aggregations are created by summing trajectories over countries of the given region.
- Region-based Method
The aggregation is generated using the same algorithm as population projections for single countries (function pop.predict), but it operates on aggregated input components. These are created as follows. Here c denotes the countries over which we aggregate a region R; s \in \{m, f\}, a, and t denote sex, age category and time, respectively. t=P denotes the present year of the prediction. N_{s,a,t}^c and M_{s,a,t}^c denote the historical population count and the Bayesian predictive median of population, respectively, of sex s in age category a at time t for country c (refer to the links in parentheses for description of the data):
- Initial sex and age-specific population (popM, popF):
N_{s,a,t=P}^R = \sum_c N_{s,a,t=P}^c
- Sex and age-specific death rates (mxM, mxF):
mx_{s,a,t}^R = \frac{\sum_c(mx_{s,a,t}^c \cdot N_{s,a,t}^c)}{\sum_c N_{s,a,t}^c}
- Sex ratio at birth (srb):
SRB_t^R = \frac{\sum_c M_{s=m,a=1,t}^c}{\sum_c M_{s=f,a=1,t}^c}
- Percentage age-specific fertility rate (pasfr):
PASFR_{a,t}^R = \frac{\sum_c(PASFR_{a,t}^c \cdot M_{s=f,a,t}^c)}{\sum_c M_{s=f,a,t}^c}
- Migration code and start year (mig.type):
The aggregated migration code is the code with the maximum count over the aggregated countries, where counts are weighted by N_{t=P}^c. The migration start year is the maximum of the start years over the aggregated countries.
- Sex and age-specific migration (migM, migF):
mig_{s,a,t}^R = \sum_c mig_{s,a,t}^c
- Probabilistic projection of life expectancy:
We assume an aggregation of life expectancy for the given regions was generated prior to this call, using the run.e0.mcmc.extra and e0.predict.extra functions of the bayesLife package.
- Probabilistic projection of total fertility rate:
We assume an aggregation of total fertility for the given regions was generated prior to this call, using the run.tfr.mcmc.extra and tfr.predict.extra functions of the bayesTFR package.
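The deterministic inputs of the region-based method can be illustrated with a small numerical sketch. The numbers below are made up, the variable names are hypothetical, and the code only mirrors the formulas for the initial population, death rates, sex ratio at birth and migration for a single sex and time point; it is not the package's code.

```r
# Toy data for two countries (columns) and three age groups (rows).
N   <- cbind(c1 = c(100, 200, 300), c2 = c(300, 200, 100))   # population N^c
mx  <- cbind(c1 = c(0.010, 0.002, 0.020),
             c2 = c(0.030, 0.004, 0.010))                    # death rates mx^c
mig <- cbind(c1 = c(5, -2, 0), c2 = c(-1, 4, 1))             # net migration mig^c

# Initial population of the region: sum over countries
N.R <- rowSums(N)

# Death rates of the region: population-weighted mean over countries
mx.R <- rowSums(mx * N) / rowSums(N)

# Migration of the region: sum over countries
mig.R <- rowSums(mig)

# Sex ratio at birth: ratio of summed male and female medians
# in the first age group (a = 1)
M.m1 <- c(c1 = 105, c2 = 318)   # median male population, a = 1
M.f1 <- c(c1 = 100, c2 = 300)   # median female population, a = 1
SRB.R <- sum(M.m1) / sum(M.f1)

# PASFR follows the same weighted-mean pattern as mx.R, using the
# median female population M_{s=f,a,t}^c as weights.
```

Note how the first age group's regional death rate (0.025) sits closer to the rate of country c2, which has three times the population of c1 in that age group.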
Results of the aggregations are stored in the same top directory as the pop.pred object, in a subdirectory called ‘aggregations_name’, where name is the value of the name argument. They can be accessed using the function get.pop.aggregation. Note that multiple runs of pop.aggregate with the same name will overwrite previous aggregation results of the same name.
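To make the country-based method more concrete, here is a minimal base R sketch of summing trajectories. It assumes that trajectories are matched by index across countries (trajectory i of the region is the sum of trajectory i of each country); the toy matrices and variable names are hypothetical and do not reflect how the package stores its results.

```r
# Toy total-population trajectories for two countries:
# rows = projection years, columns = trajectories.
traj.c1 <- matrix(c(10, 11, 12,  10, 12, 14,  10, 10, 11,  10, 13, 15), nrow = 3)
traj.c2 <- matrix(c(20, 21, 22,  20, 22, 23,  20, 19, 18,  20, 21, 21), nrow = 3)

# Regional trajectories: element-wise sum over countries
traj.R <- traj.c1 + traj.c2

# Summaries of the aggregate are then computed from the summed trajectories,
# e.g. the median for each year
median.R <- apply(traj.R, 1, median)
```

Summing first and summarizing afterwards matters: the median of the summed trajectories is generally not the sum of the country medians.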
Value
Object of class bayesPop.prediction containing the aggregated results. In addition, it contains the elements aggregation.method, giving the input.type used, and aggregated.countries, which is a list of countries aggregated for each region.
Author(s)
Hana Sevcikova, Adrian Raftery
References
H. Sevcikova, A. E. Raftery (2016). bayesPop: Probabilistic Population Projections. Journal of Statistical Software, 75(5), 1-29. doi:10.18637/jss.v075.i05
See Also
pop.predict, tfr.predict.extra, e0.predict.extra
Examples
## Not run:
sim.dir <- tempfile()
pred <- pop.predict(countries=c(528,218,450), output.dir=sim.dir)
aggr <- pop.aggregate(pred, 900) # aggregating World (i.e. all countries available in pred)
pop.trajectories.plot(aggr, 900, sum.over.ages=TRUE)
# countries over which we aggregated:
subset(UNlocations, country_code %in% aggr$aggregated.countries[["900"]])
unlink(sim.dir, recursive=TRUE)
## End(Not run)