pclm2D {ungroup}    R Documentation

Two-Dimensional Penalized Composite Link Model (PCLM-2D)

Description

Fit a two-dimensional penalized composite link model (PCLM-2D), e.g. to simultaneously ungroup age-at-death distributions that are grouped in age classes for adjacent years. The PCLM can be extended to a two-dimensional regression problem, which is particularly suitable for mortality analysis: a mortality surface is estimated that captures both the age-specific trajectories of coarsely grouped distributions and their time trends (Rizzi et al. 2019).
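The grouping idea behind a composite link model can be sketched in base R for a single dimension. This is only an illustration of the concept, not the package's internal code: the names starts, widths, C and gamma are made up for the example, and the latent counts are invented.

```r
# Sketch of the "composite link" idea in one dimension (base R only).
# All object names and the latent counts below are illustrative assumptions.
starts <- c(0, 5, 10)                 # bin starting values, as in argument x
nlast  <- 5                           # length of the last bin
widths <- c(diff(starts), nlast)      # width of every coarse bin

# Unit-width fine grid covering the same range as the coarse bins
fine <- seq(min(starts), max(starts) + nlast - 1)

# Composition matrix: C[i, j] = 1 if fine cell j lies inside coarse bin i
bin_of_cell <- rep(seq_along(starts), times = widths)
C <- outer(seq_along(starts), bin_of_cell, FUN = "==") * 1

# An invented latent distribution on the fine grid
gamma <- c(9, 8, 7, 6, 5, 5, 4, 4, 3, 3, 2, 2, 1, 1, 1)

# Grouped counts are sums of latent cells: mu = C %*% gamma
mu <- as.vector(C %*% gamma)
print(mu)
```

Ungrouping runs this mapping in reverse: only the grouped counts are observed, and a smooth latent distribution consistent with them is estimated.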

Usage

pclm2D(
  x,
  y,
  nlast,
  offset = NULL,
  out.step = 1,
  ci.level = 95,
  verbose = TRUE,
  control = list()
)

Arguments

x

Vector containing the starting values of the input intervals/bins. For example, if we have 3 bins [0, 5), [5, 10) and [10, 15), x is defined by the vector c(0, 5, 10).

y

data.frame with counts to be ungrouped. The number of rows should be equal to the length of x.

nlast

Length of the last interval. In the example above nlast would be 5.
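How x and nlast together define the bins can be checked with a few lines of base R. The helper bin_limits() is hypothetical and written only for this illustration; it is not part of the ungroup package.

```r
# Hypothetical helper (not part of ungroup): derive bin limits from the
# starting values x and the length of the last interval nlast.
bin_limits <- function(x, nlast) {
  upper <- c(x[-1], x[length(x)] + nlast)   # each bin ends where the next starts
  data.frame(lower = x, upper = upper, width = upper - x)
}

# The three-bin example: [0, 5), [5, 10), [10, 15)
small <- bin_limits(c(0, 5, 10), nlast = 5)
print(small)

# The grid used in the Examples section: last bin starts at 85 and has length 26
big <- bin_limits(c(0, 1, seq(5, 85, by = 5)), nlast = 26)
print(tail(big, 3))
```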

offset

Optional offset term used to calculate smooth mortality rates. It should match y, with one row per element of x (the Examples pass a data.frame with the same dimensions as y). See Rizzi et al. (2015) for further details.

out.step

Length of estimated intervals in output. Values between 0.1 and 1 are accepted. Default: 1.
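As a rough illustration of what a given out.step implies, the starting values of the output intervals can be enumerated in base R. This sketch assumes that the output grid spans the same range as the input bins; the exact grid returned by the function should be read from the fitted object.

```r
# Sketch: starting values of output intervals for a given out.step.
# Assumption: the output covers the same range as the input bins.
x        <- c(0, 5, 10)
nlast    <- 5
out.step <- 0.5
stopifnot(out.step >= 0.1, out.step <= 1)   # accepted range per the docs

range_lo <- min(x)
range_hi <- max(x) + nlast

# Multiply by an integer index instead of accumulating, to limit rounding error
n_out      <- round((range_hi - range_lo) / out.step)
out_starts <- range_lo + out.step * (seq_len(n_out) - 1)

print(head(out_starts))
print(length(out_starts))
```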

ci.level

Confidence level, in percent, used for computing confidence intervals. Default: 95.
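The percent scale of ci.level can be related to the usual two-sided critical value as follows. This is a generic normal-approximation sketch with invented numbers; how the package constructs its intervals internally is not stated here.

```r
# Sketch: relate ci.level (in percent) to a two-sided normal critical value.
# The estimate and standard error are made-up numbers for illustration.
ci.level <- 95
alpha    <- 1 - ci.level / 100        # two-sided error probability
z        <- qnorm(1 - alpha / 2)      # critical value of the standard normal

estimate <- 100
std_err  <- 10
ci <- c(lower = estimate - z * std_err,
        upper = estimate + z * std_err)

print(round(z, 3))
print(round(ci, 1))
```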

verbose

Logical value. Indicates whether a progress bar should be shown or not. Default: TRUE.

control

List with additional parameters:

  • lambda – Smoothing parameters to be used in PCLM estimation. If lambda is NA (for pclm2D, lambda = c(NA, NA) as in the Examples), an algorithm searches for the optimal values.

  • kr – Knot ratio. Number of internal intervals used for defining 1 knot in B-spline basis construction. See MortSmooth_bbase.

  • deg – Degree of the splines used to create an equally spaced B-spline basis over an abscissa of data.

  • int.lambda – If lambda is optimized, the interval to be searched must be specified. Format: vector containing the end-points.

  • diff – An integer indicating the order of differences of the components of PCLM coefficients.

  • opt.method – Selection criterion of the model. Possible values are "AIC" and "BIC".

  • max.iter – Maximum number of iterations used in the fitting procedure.

  • tol – Relative tolerance in the PCLM fitting procedure.
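A control list only needs the entries that should differ from the defaults, as the Examples do with lambda. The sketch below builds such a list and then uses base R to show what an order of differences of 2 means for a penalty on neighbouring coefficients. The specific values (opt.method = "BIC", diff = 2, max.iter = 100, six coefficients) are arbitrary choices for illustration, and the difference matrix is a generic construction rather than the package's internal code.

```r
# Sketch: a partial control list using the parameter names documented above.
# The values are arbitrary illustrations, not recommended settings.
ctrl <- list(
  lambda     = c(NA, NA),   # ask for optimized smoothing parameters
  opt.method = "BIC",       # selection criterion: "AIC" or "BIC"
  diff       = 2,           # order of differences
  max.iter   = 100          # illustrative value
)
# It would be passed as: pclm2D(x, y, nlast, control = ctrl)

# What "order of differences" means: a second-order difference matrix
# acting on 6 (illustrative) coefficients.
n_coef <- 6
D <- diff(diag(n_coef), differences = ctrl$diff)
print(D)

# Second-order differences vanish for coefficients lying on a straight line,
# so a penalty built from D leaves linear trends unpenalized.
linear_coef <- 1:n_coef
print(as.vector(D %*% linear_coef))
```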

Value

The output is a list with the following components:

input

A list with arguments provided in input. Saved for convenience.

fitted

The fitted values of the PCLM model.

ci

Confidence intervals around fitted values.

goodness.of.fit

A list containing goodness of fit measures: standard errors, AIC and BIC.

smoothPar

Estimated smoothing parameters: lambda, kr and deg.

bins.definition

Additional values that identify the bin limits and their location in the input and output objects.

deep

A list of objects created in the fitting process. Useful in diagnosis of possible issues.

call

An unevaluated function call, that is, an unevaluated expression which consists of the named function applied to the given arguments.

References

Rizzi S, Gampe J, Eilers PHC (2015). “Efficient Estimation of Smooth Distributions From Coarsely Grouped Data.” American Journal of Epidemiology, 182(2), 138-147. doi:10.1093/aje/kwv020.

Rizzi S, Halekoh U, Thinggaard M, Engholm G, Christensen N, Johannesen TB, Lindahl-Jacobsen R (2019). “How to estimate mortality trends from grouped vital statistics.” International Journal of Epidemiology, 48(2), 571–582. doi:10.1093/ije/dyy183.

See Also

control.pclm2D, plot.pclm2D

Examples

# Load the package that provides pclm2D and ungroup.data
library(ungroup)

# Input data
Dx <- ungroup.data$Dx
Ex <- ungroup.data$Ex

# Aggregate data to be ungrouped in the examples below
# Keep columns 2:10 of the aggregate (column 1 is the grouping key)
x      <- c(0, 1, seq(5, 85, by = 5))
nlast  <- 26
n      <- c(diff(x), nlast)
group  <- rep(x, n)
y      <- aggregate(Dx, by = list(group), FUN = "sum")[, 2:10]
offset <- aggregate(Ex, by = list(group), FUN = "sum")[, 2:10]

# Example 1 ---------------------- 
# Fit model and ungroup data using PCLM-2D
P1 <- pclm2D(x, y, nlast)
summary(P1)

# Plot fitted values
plot(P1)

# Plot input data
plot(P1, "observed")

# NOTE: pclm2D does not search for optimal smoothing parameters by default
# (like pclm does) because it is more time consuming. If optimization is 
# required set lambda = c(NA, NA):

P1 <- pclm2D(x, y, nlast, control = list(lambda = c(NA, NA)))

# Example 2 ---------------------- 
# Ungroup and build a mortality surface
P2 <- pclm2D(x, y, nlast, offset)
summary(P2)

plot(P2, type = "observed")
plot(P2, type = "fitted")
plot(P2, type = "fitted", colors = c("blue", "red"))

[Package ungroup version 1.4.4]