R: estimate death registration coverage using the GGB method

ggb {DDM}

R Documentation

estimate death registration coverage using the GGB method

Description

Given two censuses and an average annual number of deaths in each age class between censuses, we can use stable population assumptions to estimate the degree of underregistration of deaths. The method is based on finding a best-fitting linear relationship between two modeled parameters (right term and left term), but the fit, and the resulting coverage estimate, depend on exactly which age range is used. This function either selects a suitable age range automatically or uses an exact vector of ages that you specify.

Usage

ggb(X, minA = 15, maxA = 75, minAges = 8, exact.ages = NULL,
  deaths.summed = FALSE)

Arguments

`X`	`data.frame` with columns `$pop1`, `$pop2`, `$deaths`, `$date1`, `$date2`, `$age`, and `$cod` (if there is more than one region/sex/intercensal period).
`minA`	the lowest age to be included in the search
`maxA`	the highest age to be included in the search (given as the lower bound of that age group)
`minAges`	the minimum number of adjacent ages to be used in estimating
`exact.ages`	optional. A user-specified vector of exact ages to use for coverage estimation
`deaths.summed`	logical. Is the deaths column given as the total per age over the intercensal period (`TRUE`)? Defaults to `FALSE`, i.e. the column is assumed to hold average annual deaths.
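
As an illustration of the input layout, the sketch below builds a `data.frame` with the columns listed for `X` and converts intercensal death totals to average annual deaths. All counts are invented, and the conversion (dividing by the interval length in 365.25-day years) is an assumption about what `deaths.summed = TRUE` implies, not a copy of the package's internal code.

```r
# Hypothetical single-population input; every count below is invented.
age          <- seq(0, 85, by = 5)
pop1         <- round(100000 * exp(-0.03 * age))
pop2         <- round(pop1 * 1.2)
deaths_total <- round(pop1 * 0.01 * exp(0.04 * age))  # totals over the whole period

X <- data.frame(
  pop1   = pop1,
  pop2   = pop2,
  deaths = deaths_total,
  date1  = as.Date("1997-08-01"),
  date2  = as.Date("2007-08-01"),
  age    = age,
  cod    = 1
)

# Length of the intercensal period in years (assumes 365.25-day years).
years <- as.numeric(difftime(X$date2[1], X$date1[1], units = "days")) / 365.25

# Convert totals to average annual deaths, the default form of $deaths.
X$deaths <- deaths_total / years
head(X)
```

If the deaths column is left as totals instead, `deaths.summed = TRUE` is the argument that tells `ggb()` so.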

Details

Census dates can be given in two ways: 1) as columns $date1 and $date2, holding either Date-class values or unambiguous character strings such as "1981-05-13", or 2) as integer columns named "day1", "month1", "year1", "day2", "month2", "year2". If only year1 and year2 are given, January 1 dates are assumed. If only year and month are given, the first of the month is assumed.

If you want coverage estimates for several intercensal periods, regions, or sexes, stack the data and add a variable called $cod with a unique value for each data chunk. Different values of $cod could indicate sexes, regions, intercensal periods, etc.

The $deaths column should hold the average annual deaths for each age class in the intercensal period (or intercensal totals if deaths.summed = TRUE). In practice this is often the arithmetic average of the deaths recorded at each age over the period, or simply the average of the deaths recorded around the time of census 1 and census 2.

To identify an age range in the traditional visual way when working with a single year/sex/region of data, see ggbChooseAges(). The automatic age-range selection in this function tries to follow the advice typically given for choosing ages visually: it minimizes the square root of the average squared residual (the root mean squared residual) between the fitted line and the right term.
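
The age-range search can be sketched roughly as follows. This is an illustration under stated assumptions, not the package's implementation: the left and right terms are invented, an ordinary least squares line from `lm()` stands in for whatever fit the package uses, and the helper `rms_for_range()` is hypothetical. It enumerates contiguous age ranges that respect `minA`, `maxA`, and `minAges`, and keeps the one with the smallest root mean squared residual in the right term.

```r
# Hypothetical left and right terms by age (invented numbers for illustration).
age   <- seq(5, 80, by = 5)
right <- 0.010 + 0.005 * (seq_along(age) - 1)
# Exactly linear for ages 15-70, distorted at the youngest and oldest ages.
bump  <- c(0.010, 0.006, rep(0, 12), -0.005, -0.009)
left  <- 0.005 + 1.25 * right + bump

# Root mean squared residual for one candidate age range.
# Assumption: an OLS line stands in for the package's own fitting procedure.
rms_for_range <- function(lower, upper) {
  keep <- age >= lower & age <= upper
  fit  <- lm(left[keep] ~ right[keep])
  a    <- unname(coef(fit)[1])
  b    <- unname(coef(fit)[2])
  right_hat <- (left[keep] - a) / b  # right term implied by the fitted line
  sqrt(mean((right_hat - right[keep])^2))
}

# Contiguous ranges within [minA, maxA] that contain at least minAges ages.
minA <- 15; maxA <- 75; minAges <- 8
cand <- expand.grid(lower = age, upper = age)
cand <- cand[cand$lower >= minA & cand$upper <= maxA, ]
cand$n <- (cand$upper - cand$lower) / 5 + 1
cand <- cand[cand$n >= minAges, ]
cand$rms <- mapply(rms_for_range, cand$lower, cand$upper)

# The range with the smallest RMS is the one an automatic search would keep.
best <- cand[which.min(cand$rms), ]
best
```

With these invented terms, any range that includes age 75 picks up the distortion there and has a visibly larger RMS, so the selected range stays within ages 15 to 70.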

Value

A data.frame with columns for the coverage coefficient $coverage and for the minimum $lower and maximum $upper of the age range on which it is based. $a and $b give the intercept and slope of the line on which the coverage estimate is based. $delta, $k1, and $k2 are further derived quantities that may be interesting for advanced users. Rows correspond to data partitions, as indicated by the optional $cod variable.
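
A common next step is to scale registered deaths by the estimated coverage. The sketch below shows that arithmetic on a hand-made table shaped like the output described above; the values are invented rather than produced by `ggb()`, and the presence of a `cod` column in the result is an assumption.

```r
# Hypothetical result rows (invented values, not actual ggb() output).
res <- data.frame(
  cod      = c("males", "females"),
  coverage = c(0.85, 0.78),
  lower    = c(15, 20),
  upper    = c(65, 70)
)

# Hypothetical registered death totals for the same partitions.
registered <- data.frame(
  cod    = c("males", "females"),
  deaths = c(12000, 9000)
)

# Coverage below 1 means underregistration, so dividing inflates the counts.
adj <- merge(registered, res[, c("cod", "coverage")], by = "cod")
adj$deaths_adjusted <- adj$deaths / adj$coverage
adj
```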

References

Hill K. Estimating census and death registration completeness. Asian and Pacific Population Forum. 1987; 1:1-13.

Brass, William, 1975. Methods for Estimating Fertility and Mortality from Limited and Defective Data, Carolina Population Center, Laboratory for Population Studies, University of North Carolina, Chapel Hill.

Examples

# The Mozambique data
res <- ggb(Moz)
res
# The Brasil data
BM <- ggb(BrasilMales)
BF <- ggb(BrasilFemales)
head(BM)
head(BF)

[Package DDM version 1.0-0 Index]