ageutils {ageutils} | R Documentation |
This help page documents the utility functions provided for working with individual ages and associated intervals:
breaks_to_interval() takes a specified set of breaks representing the
left-hand limits of closed-open intervals, i.e. [x, y), and returns the
corresponding intervals and their lower and upper bounds. The resulting
intervals span from the minimum break through to Inf.
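The mapping from breaks to intervals can be illustrated in base R. This is a minimal sketch only, not the package's implementation: `breaks_sketch()` is a hypothetical helper, and the label format is assumed from the `[x, y)` notation used on this page.

```r
# Hypothetical helper (not part of ageutils): each break is the closed
# left-hand limit of an interval; the last interval is open-ended.
breaks_sketch <- function(breaks) {
  lower <- as.integer(breaks)
  upper <- c(lower[-1L], Inf)
  labels <- sprintf("[%d, %s)", lower, as.character(upper))
  data.frame(
    interval = factor(labels, levels = labels, ordered = TRUE),
    lower_bound = lower,
    upper_bound = upper
  )
}

res <- breaks_sketch(c(0L, 1L, 10L, 30L))
res
```

With these breaks the sketch yields four rows, the last of which has an upper bound of Inf.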
cut_ages() provides categorisation of ages based on specified breaks which
represent the left-hand interval limits. The resultant groupings will span
from the minimum break through to Inf and will always be closed on the left
and open on the right. Ages below the minimum break will be returned as NA.
As an example, if breaks = c(0, 1, 10, 30) the possible groupings would be
[0, 1), [1, 10), [10, 30) and [30, Inf). This is roughly comparable
to a call of cut(ages, right = FALSE, breaks = c(breaks, Inf)) but with
both the resultant interval and the start and end points returned as entries
in a list.
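The base R comparison mentioned above can be run directly. The sketch below uses only `cut()`; the `ages` and `breaks` values are illustrative, and the result is a plain factor rather than the structure returned by cut_ages().

```r
ages <- c(0L, 1L, 9L, 10L, 29L, 30L, 85L)
breaks <- c(0, 1, 10, 30)

# Left-closed, right-open groups spanning from the minimum break to Inf.
groups <- cut(ages, breaks = c(breaks, Inf), right = FALSE)
as.integer(groups)  # 1 2 2 3 3 4 4

# Ages below the minimum break fall outside every interval and become NA.
below <- cut(c(2L, 7L), breaks = c(5, Inf), right = FALSE)
is.na(below)        # TRUE FALSE
```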
split_interval_counts() splits counts of a given age interval into counts
for individual years based on a given weighting. Age intervals are specified
by their lower (closed) and upper (open) bounds, i.e. intervals of the form
[lower, upper).
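To make the idea of splitting concrete, here is a rough base R sketch. It assumes, purely for illustration, that each interval's count is shared evenly across the years it covers; the weighting that split_interval_counts() actually applies is governed by its weights argument and is not reproduced here.

```r
lower_bounds <- c(0L, 5L, 10L)
upper_bounds <- c(5L, 10L, 20L)
counts <- c(5, 10, 30)

# Number of single years covered by each [lower, upper) interval.
widths <- upper_bounds - lower_bounds

# Every individual year of age, interval by interval.
age <- unlist(Map(function(lo, hi) seq.int(lo, hi - 1L), lower_bounds, upper_bounds))

# Assumed even split: each year receives count / width.
count <- rep(counts / widths, times = widths)

split_sketch <- data.frame(age = age, count = count)
sum(split_sketch$count)  # total is preserved: 45
```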
aggregate_age_counts() provides aggregation of counts across ages (in
years). It is similar to a cut() and tapply() pattern but optimised for
speed over flexibility. Groupings are the same as in cut_ages()
and counts will be provided across all natural numbers greater than or equal
to the minimum break. Missing values, and those less than the minimum break,
are grouped as NA.
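The cut() and tapply() pattern referred to above looks roughly like the following in base R. It is shown for comparison only; the data mirror the examples on this page, and the output is a named array rather than a data frame.

```r
counts <- 1:65
ages <- 0:(length(counts) - 1L)
breaks <- c(0L, 1L, 5L, 15L, 25L, 45L, 65L)

# Assign each age to a left-closed, right-open group, then sum within groups.
groups <- cut(ages, breaks = c(breaks, Inf), right = FALSE)
totals <- tapply(counts, groups, sum)

as.vector(totals)  # 1 14 105 205 710 1110 NA
```

Note that tapply() reports NA for the final group here because no age falls at or above 65.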
reaggregate_interval_counts() is equivalent to, but more efficient than,
a call to split_interval_counts() followed by aggregate_age_counts().
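The two-step composition can be sketched in base R as a split followed by a regrouping. As before, the even split across years is an assumption made for illustration and the code does not call the package functions themselves.

```r
lower_bounds <- c(0L, 5L, 10L)
upper_bounds <- c(5L, 10L, 20L)
counts <- c(5, 10, 30)
breaks <- c(0L, 1L, 5L, 15L, 25L, 45L, 65L)

# Step 1: split each interval's count across its individual years (assumed even).
widths <- upper_bounds - lower_bounds
age <- unlist(Map(function(lo, hi) seq.int(lo, hi - 1L), lower_bounds, upper_bounds))
count <- rep(counts / widths, times = widths)

# Step 2: regroup the yearly counts using the new breaks.
groups <- cut(age, breaks = c(breaks, Inf), right = FALSE)
totals <- tapply(count, groups, sum)

as.vector(totals)[1:4]  # 1 4 25 15
```

The regrouped totals still sum to the original 45, which is a useful sanity check on any reaggregation.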
breaks_to_interval(breaks)
cut_ages(ages, breaks)
split_interval_counts(
  lower_bounds,
  upper_bounds,
  counts,
  max_upper = 100L,
  weights = NULL
)
aggregate_age_counts(counts, ages = 0:(length(counts) - 1L), breaks)
reaggregate_interval_counts(
  lower_bounds,
  upper_bounds,
  counts,
  breaks,
  max_upper = 100L,
  weights = NULL
)
breaks |
1 or more non-negative cut points in strictly increasing order. These correspond to the left-hand side of the desired intervals (e.g. the closed side of [x, y)). Double values are coerced to integer prior to categorisation. |
ages |
Vector of ages in years. Double values are coerced to integer prior to categorisation / aggregation. For … |
lower_bounds, upper_bounds |
A pair of vectors representing the bounds of the intervals.
Missing (NA) bounds are not permitted. Double vectors will be coerced to integer. |
counts |
Vector of counts to be aggregated. |
max_upper |
Represents the maximum upper bound permitted upon splitting the data. Used to replace … If any … Double vectors will be coerced to integer. |
weights |
Population weightings to apply for individual years. If … If specified, must be of length … |
breaks_to_interval() and cut_ages():
A data frame with an ordered factor column (interval), as well as columns
corresponding to the explicit bounds (lower_bound and upper_bound).
split_interval_counts():
A data frame with entries age (in years) and count.
aggregate_age_counts() and reaggregate_interval_counts():
A data frame with 4 entries: interval, lower_bound, upper_bound and an
associated count.
cut_ages(ages = 0:9, breaks = c(0L, 3L, 5L, 10L))
cut_ages(ages = 0:9, breaks = 5L)
split_interval_counts(
  lower_bounds = c(0, 5, 10),
  upper_bounds = c(5, 10, 20),
  counts = c(5, 10, 30)
)
# default ages generated if only counts provided (here ages will be 0:64)
aggregate_age_counts(counts = 1:65, breaks = c(0L, 1L, 5L, 15L, 25L, 45L, 65L))
aggregate_age_counts(counts = 1:65, breaks = 50L)
# NA ages are handled with their own grouping
ages <- 1:65
ages[1:44] <- NA
aggregate_age_counts(
  counts = 1:65,
  ages = ages,
  breaks = c(0L, 1L, 5L, 15L, 25L, 45L, 65L)
)
reaggregate_interval_counts(
  lower_bounds = c(0, 5, 10),
  upper_bounds = c(5, 10, 20),
  counts = c(5, 10, 30),
  breaks = c(0L, 1L, 5L, 15L, 25L, 45L, 65L)
)