pclm {ungroup} | R Documentation |
Univariate Penalized Composite Link Model (PCLM)
Description
Fit a univariate penalized composite link model (PCLM) to ungroup binned count data, e.g. age-at-death distributions grouped in age classes.
Usage
pclm(
x,
y,
nlast,
offset = NULL,
out.step = 1,
ci.level = 95,
verbose = FALSE,
control = list()
)
Arguments
x |
Vector containing the starting values of the input intervals/bins.
For example: if we have 3 bins |
y |
Vector with counts to be ungrouped. It must have the same length as x. |
nlast |
Length of the last interval. In the example above |
offset |
Optional offset term to calculate smooth mortality rates. A vector of the same length as x and y. See Rizzi et al. (2015) for further details. |
out.step |
Length of estimated intervals in output. Values between 0.1 and 1 are accepted. Default: 1. |
ci.level |
Confidence level, in percent, used to compute confidence intervals.
Default: 95. |
verbose |
Logical value indicating whether a progress bar should be shown.
Default: FALSE. |
control |
List with additional parameters:
|
Details
The PCLM method is based on the composite link model, which extends standard generalized linear models. It implements the idea that the observed counts, interpreted as realizations from Poisson distributions, are indirect observations of a finer (ungrouped) but latent sequence. This latent sequence represents the distribution of expected means at a fine resolution and has to be estimated from the aggregated data. Estimates are obtained by maximizing a penalized likelihood. This maximization is performed efficiently by a version of the iteratively reweighted least-squares algorithm. Optimal values of the smoothing parameter are chosen by minimizing the Bayesian Information Criterion (BIC) or Akaike's Information Criterion (AIC).
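The idea described above can be illustrated with a minimal base R sketch. It is not the package's internal code: the helper names make_C and pclm_sketch, the toy data, the fixed smoothing parameter lambda and the identity basis (one coefficient per fine cell, where a spline basis could be used instead) are all assumptions made for illustration. The composition matrix C sums the latent fine-resolution means into the observed bins, and the loop is a plain iteratively reweighted least-squares update with a difference penalty.

```r
# Hypothetical helper: composition matrix mapping fine cells to input bins.
make_C <- function(x, nlast, step = 1) {
  widths  <- c(diff(x), nlast)              # bin lengths; the last one comes from nlast
  n_cells <- round(widths / step)           # number of fine cells in each bin
  group   <- rep(seq_along(widths), times = n_cells)
  outer(seq_along(widths), group, "==") * 1 # row i has 1s for the cells of bin i
}

# Hypothetical helper: penalized IRLS with a fixed lambda (no AIC/BIC search).
pclm_sketch <- function(y, C, lambda = 10, pord = 2, max_it = 100, tol = 1e-8) {
  m <- ncol(C)
  D <- diff(diag(m), differences = pord)    # difference matrix for the roughness penalty
  P <- lambda * crossprod(D)
  b <- rep(log(sum(y) / m), m)              # start from a flat latent distribution
  for (it in seq_len(max_it)) {
    gam <- exp(b)                           # latent expected counts
    mu  <- as.vector(C %*% gam)             # expected grouped counts
    Q   <- C * rep(gam, each = nrow(C))     # C %*% diag(gam)
    W   <- 1 / mu                           # Poisson working weights
    lhs <- crossprod(Q, Q * W) + P
    rhs <- crossprod(Q, W * (y - mu + as.vector(Q %*% b)))
    b_new <- as.vector(solve(lhs, rhs))
    converged <- max(abs(b_new - b)) < tol
    b <- b_new
    if (converged) break
  }
  exp(b)                                    # estimated latent means on the fine grid
}

# Toy data (invented for illustration): four bins of length 5
x_toy <- c(0, 5, 10, 15)
y_toy <- c(10, 30, 40, 20)
C_toy <- make_C(x_toy, nlast = 5)
gamma_hat <- pclm_sketch(y_toy, C_toy)
```

Here gamma_hat holds one smoothed expected count per unit-length cell. In pclm() the smoothing parameter is selected by an information criterion rather than fixed by hand as in this sketch.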
Value
The output is a list with the following components:
input |
A list with arguments provided in input. Saved for convenience. |
fitted |
The fitted values of the PCLM model. |
ci |
Confidence intervals around fitted values. |
goodness.of.fit |
A list containing goodness of fit measures: standard errors, AIC and BIC. |
smoothPar |
Estimated smoothing parameters: |
bins.definition |
Additional values to identify the bins limits and location in input and output objects. |
deep |
A list of objects created in the fitting process. Useful in diagnosis of possible issues. |
call |
An unevaluated function call, that is, an unevaluated expression which consists of the named function applied to the given arguments. |
References
Rizzi S, Gampe J, Eilers PHC (2015). “Efficient Estimation of Smooth Distributions From Coarsely Grouped Data.” American Journal of Epidemiology, 182(2), 138-147. doi:10.1093/aje/kwv020.
See Also
Examples
# Data
x <- c(0, 1, seq(5, 85, by = 5))
y <- c(294, 66, 32, 44, 170, 284, 287, 293, 361, 600, 998,
1572, 2529, 4637, 6161, 7369, 10481, 15293, 39016)
offset <- c(114, 440, 509, 492, 628, 618, 576, 580, 634, 657,
631, 584, 573, 619, 530, 384, 303, 245, 249) * 1000
nlast <- 26 # the size of the last interval
# Example 1 ----------------------
M1 <- pclm(x, y, nlast)
ls(M1)
summary(M1)
fitted(M1)
plot(M1)
# Example 2 ----------------------
# ungroup even in smaller intervals
M2 <- pclm(x, y, nlast, out.step = 0.5)
head(fitted(M2))
plot(M2, type = "s")
# Note, in example 1 we are estimating intervals of length 1. In example 2
# we are estimating intervals of length 0.5 using the same aggregate data.
# Example 3 ----------------------
# Do not optimise smoothing parameters; choose your own. Faster.
M3 <- pclm(x, y, nlast, out.step = 0.5,
control = list(lambda = 100, kr = 10, deg = 10))
plot(M3)
summary(M2)
summary(M3) # not the smallest BIC here, but sometimes that is not important.
# Example 4 -----------------------
# Grouped x & grouped offset (estimate death rates)
M4 <- pclm(x, y, nlast, offset)
plot(M4, type = "s")
# Example 5 -----------------------
# Grouped x & ungrouped offset (estimate death rates)
ungroupped_Ex <- pclm(x, y = offset, nlast, offset = NULL)$fitted # ungrouped offset data
M5 <- pclm(x, y, nlast, offset = ungroupped_Ex)