run.mig.mcmc {bayesMig} | R Documentation |
Run Markov chain Monte Carlo for parameters of net migration rate model
Description
Runs MCMCs for simulating the net migration rate of all countries of the world or for locations specified by users, using the Bayesian hierarchical model of Azose & Raftery (2015).
Usage
run.mig.mcmc(
output.dir,
nr.chains = 3,
iter = 50000,
thin = 1,
replace.output = FALSE,
annual = FALSE,
start.year = 1950,
present.year = 2020,
wpp.year = 2019,
my.mig.file = NULL,
sigma.c.min = 1e-04,
a.ini = NULL,
a.half.width = NULL,
mu.ini = NULL,
exclude.from.world = NULL,
pop.denom = 1,
seed = NULL,
parallel = FALSE,
nr.nodes = nr.chains,
buffer.size = 1000,
verbose = TRUE,
verbose.iter = 10,
...
)
Arguments
output.dir |
A file path pointing to the directory in which to store results. |
nr.chains |
An integer number of independent Markov chains to run. |
iter |
The number of iterations to run per Markov chain. |
thin |
Thinning interval. A chain with 1000 iterations thinned by 20 will return a final count of 50 iterations. |
replace.output |
If the specified output directory already exists, should it be overwritten? |
annual |
If |
start.year |
Start year for using historical data. |
present.year |
End year for using historical data. |
wpp.year |
Year for which WPP data is used if no user data is provided via |
my.mig.file |
File name containing user-specified historical time series of migration rates for all locations that should be included in the simulation. It should be a tab-separated file. For structure, see Details below. |
sigma.c.min , a.ini , mu.ini |
Settings for the parameters
of the model (see Azose & Raftery 2015), such as minimum value and initial values.
Initial values (*.ini) can be given as a vector of length |
a.half.width |
Half width for Metropolis proposals of the a parameter. This argument can greatly influence
the convergence and it is dependent on the scale of the data. By default it is set to 0.01 for 5-year data
defined as rate per population; to 0.03 for 5-year data defined as per 1000; to 0.3 for
annual data per population; to 0.5 for annual data per 1000. If the default does not
yield satisfactory results, use the function |
exclude.from.world |
Vector of location codes that should be excluded from estimating the hyperparameters. These would be for example small locations or locations with unusual patters. Note that location-specific parameters are generated for all locations, regardless of this setting. |
pop.denom |
Denominator used to generate the input migration rates. It is used to derive an appropriate scaler
for the priors and conditional distributions. Typically, this will be either 1 (default) if the rates are
defined as per population, or 1000, if the rates are per 1000 population.
Use this argument only if user-specified rates are supplied via the |
seed |
Seed of the random number generator. If |
parallel |
Whether to run code in parallel. |
nr.nodes |
Relevant only if |
buffer.size |
Buffer size (in number of iterations) for keeping data in the memory before flushing to disk. |
verbose |
Whether or not to print status updates to console window while the code is running. |
verbose.iter |
If verbose is TRUE, the number of iterations to wait between printing updates. |
... |
Additional parameters to be passed to the function |
Details
The function creates an object of class bayesMig.mcmc.meta
and
stores it in output.dir
. It launches nr.chains
MCMCs, either sequentially or
in parallel. Parameter traces of each chain are stored as ASCII files in a subdirectory
of output.dir
, called mc
x where x is the identifier of that chain.
There is one file per parameter, named after the parameter with the suffix “.txt”.
Location-specific parameters have the suffix _country
c where c is the location code.
In addition to the trace files, each mc
x directory contains the object
bayesMig.mcmc
in binary format.
All chain-specific files are written onto disk after the first, last and each
i
-th (thinned) iteration, where i
is given by the argument buffer.size
.
By default (if no data is passed via the my.mig.file
argument), the function
loads observed data (further denoted as WPP dataset), from the migration
and pop
datasets in the wppx
package where x
is
the wpp.year
. Net migration rates are computed as migration(t
) / (population(t_e
) - migration(t
))
where t_e
means the end of time period t
. For an annual simulation and
wpp.year
set to 2022, t = t_e
because the population in year t
is considered at the end of the year. If wpp.year
is smaller than 2022 and annual
is TRUE
the default dataset is interpolated from 5-year data.
The argument my.mig.file
can be used to overwrite the default data. It should be a tab-separated file.
If it is used, it should contain net migration rates for all locations to be used in the simulation, as no WPP data is used
in such a case. The structure of the file has the same format as the migration
dataset,
but the values should be rates (instead of counts). Use the argument pop.denom
to define the scale of the
denominator in these rates, i.e. if the rates are to be interpreted as per population (default) or some other scale.
Each row in the my.mig.file
file corresponds to a location. It does not have
to be necessarily a country - it can be for example a subnational unit. It must contain columns
“country_code” or “code” (unique identifier of the location), “name”, and columns representing
5-year time intervals (if annual
is FALSE
), e.g., “1995-2000”, “2000-2005” etc., or single years
(if annual
is TRUE
). An example dataset of annual net migration rates for US states is included in the package,
see example below.
Optionally, the my.mig.file
can contain columns called “first.observed” and/or “last.observed”, containing
for each location the year of the first and last observation, respectively. In such a case, any data
before and after those time points will be ignored. Furthermore, the function mig.predict
fills in the missing values
after the last observation, using the median of the BHM procedure.
If there are countries or locations that should be excluded from influencing the hyperparameters,
for example small countries or locations with unique migration patterns, their codes
should be included in the argument exclude.from.world
. These locations will still get
their parameters simulated and thus, will be included in a projection. Alternatively
if my.mig.file
is used, these locations can be determined using an additional column, called
“include_code”. Value 2 means the location is included in the BHM; value 1 means it's
excluded but location-specific parameters are generated; value 0 means the location is ignored.
Value
An object of class bayesMig.mcmc.set
which is a list with two components:
meta |
An object of class |
mcmc.list |
A list of objects of class |
References
Azose, J. J., & Raftery, A. E. (2015). Bayesian probabilistic projection of international migration. Demography, 52(5), 1627-1650.
See Also
get.mig.mcmc
, summary.bayesMig.mcmc.set
, mig.partraces.plot
,
mig.pardensity.plot
, mig.predict
Examples
# Toy simulation for US states
us.mig.file <- file.path(find.package("bayesMig"), "extdata", "USmigrates.txt")
sim.dir <- tempfile()
m <- run.mig.mcmc(nr.chains = 3, iter = 100, thin = 1, my.mig.file = us.mig.file,
annual = TRUE, output.dir = sim.dir)
summary(m)
summary(m, "Washington")
mig.partraces.plot(m)
mig.partraces.cs.plot("California", m)
# later one can access the object from disk
m <- get.mig.mcmc(sim.dir)
unlink(sim.dir, recursive = TRUE)
# For a country-level simulation, see example in ?bayesMig.