mig.predict {bayesMig}

R Documentation

Generate posterior trajectories of net migration rates

Description

Using the posterior parameter samples simulated by run.mig.mcmc, generate posterior trajectories for the net migration rates for all countries of the world, or all locations included in the estimation. This code does not adjust trajectories to ensure that net migration counts over all countries sum to zero.

Usage

mig.predict(
  mcmc.set = NULL,
  end.year = 2100,
  sim.dir = NULL,
  replace.output = FALSE,
  start.year = NULL,
  nr.traj = NULL,
  thin = NULL,
  burnin = 20000,
  use.cummulative.threshold = FALSE,
  ignore.gcc.in.threshold = FALSE,
  post.last.observed = c("obsdata", "alldata", "impute"),
  save.as.ascii = 0,
  output.dir = NULL,
  seed = NULL,
  verbose = TRUE,
  ...
)

Arguments

`mcmc.set`	Object of class `bayesMig.mcmc.set` corresponding to sampled parameter values for the net migration model. If it is `NULL`, the object is loaded from the directory specified in `sim.dir`.
`end.year`	End year of the prediction
`sim.dir`	Directory with MCMC simulation results. It should be the same as the `output.dir` argument in `run.mig.mcmc`
`replace.output`	Logical value. If `TRUE`, existing predictions in `output.dir` will be replaced by results of this run.
`start.year`	Start year of the prediction, i.e. the first predicted year. By default, the prediction starts at the time period following `present.year` set in the estimation step. If `start.year` is smaller than the default, the behavior is controlled by the `post.last.observed` argument: either data from `start.year` onwards are ignored (default), or the projection is set to the available data (`post.last.observed = "a"`).
`nr.traj`	Number of trajectories to be generated. If `NULL`, the argument `thin` is taken to determine the number of trajectories. If both are `NULL`, the number of trajectories corresponds to the size of the parameter sample.
`thin`	Thinning interval used for determining the number of trajectories. Only relevant if `nr.traj` is `NULL`.
`burnin`	Number of iterations to be discarded from the beginning of the parameter traces.
`use.cummulative.threshold`	If `TRUE`, historical cumulative thresholds are applied to avoid sampling rates that are too extreme. The thresholds are derived from prior rates of all locations. At each projected time point, the limits on projected rates are derived from the six preceding time periods in a 5-year simulation, or the 30 preceding years in an annual simulation. In a national simulation, prior rates of GCC countries (plus Western Sahara and Djibouti) are excluded when deriving thresholds for non-GCC countries. If this option is used in a non-country simulation, e.g. in a sub-national setting, set the `ignore.gcc.in.threshold` argument to `TRUE`.
`ignore.gcc.in.threshold`	If `use.cummulative.threshold` is `TRUE`, by default the GCC countries (plus Western Sahara and Djibouti), identified by their numerical country codes, are excluded from computing the historical cumulative thresholds for non-GCC countries. If this argument is `TRUE`, this distinction is not made. It is important to set it to `TRUE` in a sub-national simulation to avoid accidental overlaps between UN codes and user-defined codes.
`post.last.observed`	If a user-specific data file was used during estimation and the data contained the “last.observed” column, this argument determines how to treat the time periods between the last observed point and the start year of the prediction, for locations where there is a gap between them, or if short-term predictions were included in the file. It is also relevant if `start.year` is set to a smaller value than `present.year` in the estimation. Possible values are: “obsdata” or “o” (default) uses any non-missing observed data provided in the data file during estimation, up to the time point defined by the argument `start.year` (excluding the start year itself). “alldata” or “a” similarly uses the provided data, but uses all of it, even if it extends beyond the start year. This makes it possible to use short-term deterministic projections for locations where they are available. “impute” or “i” ignores all data beyond the last observed data point and imputes the missing time periods.
`save.as.ascii`	Either a number determining how many trajectories should be converted into an ASCII file, or 'all' in which case all trajectories are converted. It should be set to 0 if no conversion is desired. If this argument is larger than zero, the resulting file can be used as input into population projection via bayesPop, see Details.
`output.dir`	Directory into which the resulting prediction object and the trajectories are stored. If it is `NULL`, it is set to either `sim.dir`, or to `output.dir` of `mcmc.set$meta` if `mcmc.set` is given.
`seed`	Seed of the random number generator. If `NULL` no seed is set. Can be used to generate reproducible projections.
`verbose`	Logical value. Switches log messages on and off.
`...`	Further arguments passed to the underlying functions.
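
The one-letter abbreviations accepted by `post.last.observed` are consistent with R's standard partial matching against a choice vector. The following is a minimal base-R sketch of that behavior, assuming (this page does not confirm it) that the argument is resolved with `match.arg`. The helper name `resolve.post.last.observed` is hypothetical and not part of bayesMig.

```r
# Hypothetical helper (not a bayesMig function): resolves the argument the way
# a choice-vector default is commonly resolved in R, via match.arg().
resolve.post.last.observed <- function(
    post.last.observed = c("obsdata", "alldata", "impute")) {
  match.arg(post.last.observed)
}

resolve.post.last.observed()      # "obsdata" (first choice is the default)
resolve.post.last.observed("a")   # "alldata"
resolve.post.last.observed("i")   # "impute"
```

An abbreviation that matches none of the choices raises an error, which is a useful safeguard against typos.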

Details

The trajectories of net migration rates for each location are generated using the model of Azose & Raftery (2015). Parameter samples simulated via run.mig.mcmc are taken from all chains after the given burnin has been discarded. They are then thinned evenly, either to match nr.traj or according to the thin argument. The thinned parameter traces are collapsed into one chain and, if they do not already exist, stored on disk in the sub-directory ‘thinned_mcmc_t_b’, where t is the value of thin and b the value of burnin.
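
To make the relationship between nr.traj, thin and the size of the parameter sample concrete, here is a base-R sketch of even thinning over the pooled post-burnin draws. It is an illustration only: the helper `even.thin.index` is hypothetical, and the exact positions chosen by the package may differ from the rounding rule assumed here.

```r
# Hypothetical helper (not part of bayesMig): choose which of n.total pooled
# post-burnin draws to keep.
even.thin.index <- function(n.total, nr.traj = NULL, thin = NULL) {
  if (!is.null(nr.traj)) {
    # nr.traj takes precedence: spread the requested number evenly over the sample
    nr.traj <- min(nr.traj, n.total)
    return(unique(round(seq(1, n.total, length.out = nr.traj))))
  }
  if (!is.null(thin)) {
    # otherwise keep every thin-th draw
    return(seq(1, n.total, by = thin))
  }
  # both NULL: one trajectory per draw in the parameter sample
  seq_len(n.total)
}

# Numbers borrowed from the toy example: 2 chains x 30 iterations, burnin = 5
n.total <- 2 * (30 - 5)
idx <- even.thin.index(n.total, nr.traj = 10)
idx
```

With both arguments left at `NULL`, the sketch returns all 50 positions, matching the statement that the number of trajectories then equals the size of the parameter sample.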

The projection is run for all missing values before the present year, if any. Medians over the trajectories are used as imputed values and the trajectories are discarded. The process then continues by projecting the future values where all generated trajectories are kept.
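
As an illustration of the imputation step described above, the following base-R sketch collapses a set of trajectories for missing periods into per-period medians and then discards the trajectories. The matrix `imp.traj`, its dimensions, the year labels and the simulated values are all made up for illustration; the package's internal data structures are not shown on this page.

```r
set.seed(1)

# Hypothetical trajectories for three missing periods of one location:
# rows = time periods, columns = trajectories (simulated values, not real data)
imp.traj <- matrix(rnorm(3 * 100, mean = 0.001, sd = 0.005), nrow = 3,
                   dimnames = list(c("2015", "2016", "2017"), NULL))

# The median over trajectories becomes the imputed value for each period
imputed <- apply(imp.traj, 1, median)

# The trajectories themselves are not kept
rm(imp.traj)

imputed
```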

A special case arises when start.year is given and is smaller than or equal to the present year. In that case, imputed missing values before the present year are treated as ordinary predictions, i.e. their trajectories are kept. If post.last.observed is “a”, all historical data between the start year and the present year are used as projections.

The resulting prediction object is saved into ‘{output.dir}/predictions’. Trajectories for all locations are saved into the same directory in a binary format, one file per location. At the end of the projection, if save.as.ascii is larger than 0, the function converts the given number of trajectories into a CSV file, called ‘ascii_trajectories.csv’ also located in the ‘predictions’ directory. The converted trajectories are selected by equal spacing. In addition to the converted trajectories, two summary files are created: one in a user-friendly format, the other using a UN-specific coding, as described in mig.write.projection.summary.

To use these predictions as input to population projections in bayesPop, pass the full file path of the ‘ascii_trajectories.csv’ file as item migtraj of the inputs argument of bayesPop::pop.predict, and set the argument mig.is.rate appropriately.
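
A sketch of that hand-over, under several assumptions: the directory `mig.output.dir` is hypothetical and stands for the output.dir of a national mig.predict run made with save.as.ascii larger than 0; the value given to mig.is.rate is an assumed form that should be checked against ?bayesPop::pop.predict for the installed version; and end.year and output.dir are placeholder choices. The projection is only attempted if the file exists and bayesPop is installed.

```r
# Hypothetical location of a previous mig.predict run (replace with your own)
mig.output.dir <- file.path(tempdir(), "mig_sim")

# Full path of the ASCII trajectories written into the 'predictions' directory
migtraj.file <- file.path(mig.output.dir, "predictions", "ascii_trajectories.csv")

# The file path goes into the 'inputs' list as item 'migtraj'
pop.inputs <- list(migtraj = migtraj.file)

if (file.exists(migtraj.file) && requireNamespace("bayesPop", quietly = TRUE)) {
  pop.pred <- bayesPop::pop.predict(
    end.year = 2050,
    inputs = pop.inputs,
    # Assumption: trajectories are rates; verify the expected form of this
    # argument in ?bayesPop::pop.predict before use.
    mig.is.rate = c(FALSE, TRUE),
    output.dir = file.path(tempdir(), "pop_sim")
  )
} else {
  message("No ASCII trajectories found (or bayesPop not installed); skipping.")
}
```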

Value

Object of class bayesMig.prediction, which is a list with components containing details of the prediction. The key result component is an array of quantiles with dimensions (number of locations) x (number of computed quantiles) x (number of projected time points). The first time point in the sequence is not a projection but the last observed time period.

Other key result components include traj.mean.sd, a summary of means and standard deviations for each country at each time point. See bayesTFR.prediction for more detail.
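
The layout of the quantile array can be illustrated with a small stand-in built in base R. Everything in this sketch is hypothetical: the array `q`, its dimnames, the chosen quantile levels and the simulated values only mimic the (locations) x (quantiles) x (time points) layout described above and are not taken from a real prediction object.

```r
set.seed(42)

locations <- c("California", "Hawaii")
q.levels  <- c("0.1", "0.5", "0.9")     # assumed labels for the quantile dimension
periods   <- as.character(2017:2020)    # first entry = last observed period

# Stand-in array: (locations) x (quantiles) x (time points)
q <- array(NA_real_,
           dim = c(length(locations), length(q.levels), length(periods)),
           dimnames = list(locations, q.levels, periods))
for (loc in locations) {
  for (p in periods) {
    # sorted so that the 0.1, 0.5 and 0.9 entries are in increasing order
    q[loc, , p] <- sort(rnorm(length(q.levels), sd = 0.005))
  }
}

# Median path for one location, including the last observed period
med.hawaii <- q["Hawaii", "0.5", ]

# 80% interval for the projected periods only (drop the first time point)
pi80.hawaii <- q["Hawaii", c("0.1", "0.9"), -1]

dim(q)
```

Dropping the first time point, as done for the interval, reflects the note that this entry is the last observed period rather than a projection.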

References

Azose, J. J., & Raftery, A. E. (2015). Bayesian probabilistic projection of international migration. Demography, 52(5), 1627-1650. doi:10.1007/s13524-015-0415-0.

Azose, J. J., Ševčíková, H., & Raftery, A. E. (2016). Probabilistic population projections with migration uncertainty. Proceedings of the National Academy of Sciences, 113, 6460-6465. doi:10.1073/pnas.1606119113.

Examples

# Toy simulation for US states
us.mig.file <- file.path(find.package("bayesMig"), "extdata", "USmigrates.txt")
sim.dir <- tempfile()
m <- run.mig.mcmc(nr.chains = 2, iter = 30, thin = 1, my.mig.file = us.mig.file, 
        output.dir = sim.dir, present.year = 2017, annual = TRUE)

# Prediction
pred <- mig.predict(sim.dir = sim.dir, burnin = 5, end.year = 2050)
# results are unrealistic since this is a toy simulation
mig.trajectories.plot(pred, "Hawaii", pi = 80, ylim = c(-0.02, 0.02)) 
mig.trajectories.table(pred, "Hawaii")
summary(pred, "California")

# view locations included in the simulation
get.countries.table(pred)

unlink(sim.dir, recursive = TRUE)
# For projections on national level, see ?bayesMig.

[Package bayesMig version 0.4-6 Index]