weighted_quantile {ggdist} | R Documentation |
Weighted sample quantiles
Description
A variation of quantile() that can be applied to weighted samples.
Usage
weighted_quantile(
x,
probs = seq(0, 1, 0.25),
weights = NULL,
n = NULL,
na.rm = FALSE,
names = TRUE,
type = 7,
digits = 7
)
weighted_quantile_fun(x, weights = NULL, n = NULL, na.rm = FALSE, type = 7)
Arguments
x |
numeric vector: sample values |
probs |
numeric vector: probabilities in [0, 1] |
weights |
Weights for the sample. One of:
|
n |
Presumed effective sample size. If this is greater than 1 and
continuous quantiles (
|
na.rm |
logical: if |
names |
logical: If |
type |
integer between 1 and 9: determines the type of quantile estimator to be used. Types 1 to 3 are for discontinuous quantiles, types 4 to 9 are for continuous quantiles. See Details. |
digits |
numeric: the number of digits to use to format percentages
when |
Details
Calculates weighted quantiles using variations of the quantile types in quantile(), generalized to weighted samples.
Type 1–3 (discontinuous) quantiles are directly a function of the inverse CDF as a step function, and so can be directly translated to the weighted case using the natural definition of the weighted ECDF as the cumulative sum of the normalized weights.
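As a rough illustration of the discontinuous case, the following sketch computes a type 1 quantile as the inverse of the weighted ECDF. The function name sketch_weighted_quantile_type1() is hypothetical and is not part of ggdist; it assumes strictly positive weights and is not necessarily how weighted_quantile() is implemented.

```r
# Hypothetical sketch (not the ggdist implementation): type 1 weighted quantile
# as the inverse of the weighted ECDF step function.
sketch_weighted_quantile_type1 <- function(x, probs, weights = rep(1, length(x))) {
  ord <- order(x)
  x <- x[ord]
  # weighted ECDF at each sorted x: cumulative sum of the normalized weights
  cum_w <- cumsum(weights[ord]) / sum(weights)
  # for each probability, the smallest x whose cumulative weight reaches it
  vapply(probs, function(p) x[which(cum_w >= p)[1]], numeric(1))
}

probs <- c(0.1, 0.3, 0.6, 0.9)
# a weight of 2 on the value 3 behaves like observing 3 twice
sketch_weighted_quantile_type1(c(1, 2, 3), probs, weights = c(1, 1, 2))
# 1 2 3 3
unname(quantile(c(1, 2, 3, 3), probs, type = 1))
# 1 2 3 3
```

In this example the weighted result agrees with quantile(type = 1) applied to the sample with the value 3 repeated.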
Type 4–9 (continuous) quantiles require some translation from the definitions in quantile(). quantile() defines continuous estimators in terms of x_k, which is the kth order statistic, and p_k, which is a function of k and n (the sample size). In the weighted case, we instead take x_k as the kth smallest value of x in the weighted sample (not necessarily an order statistic, because of the weights). Then we can re-write the formulas for p_k in terms of F(x_k) (the empirical CDF at x_k, i.e. the cumulative sum of normalized weights) and f(x_k) (the normalized weight at x_k), by using the fact that, in the unweighted case, k = F(x_k) \cdot n and 1/n = f(x_k):
- Type 4: p_k = \frac{k}{n} = F(x_k)
- Type 5: p_k = \frac{k - 0.5}{n} = F(x_k) - \frac{f(x_k)}{2}
- Type 6: p_k = \frac{k}{n + 1} = \frac{F(x_k)}{1 + f(x_k)}
- Type 7: p_k = \frac{k - 1}{n - 1} = \frac{F(x_k) - f(x_k)}{1 - f(x_k)}
- Type 8: p_k = \frac{k - 1/3}{n + 1/3} = \frac{F(x_k) - f(x_k)/3}{1 + f(x_k)/3}
- Type 9: p_k = \frac{k - 3/8}{n + 1/4} = \frac{F(x_k) - f(x_k) \cdot 3/8}{1 + f(x_k)/4}
Then the quantile function (inverse CDF) is the piece-wise linear function defined by the points (p_k, x_k).
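The formulas for types 4 to 9 can be sketched in a few lines of base R. The helper names weighted_p_k() and sketch_weighted_quantile() are hypothetical and not part of ggdist. The sketch assumes strictly positive weights and more than one distinct value, ignores the n adjustment, and assumes that probabilities outside the range of p_k are clamped to the smallest or largest x (as quantile() does in the unweighted case). It is not necessarily how weighted_quantile() is implemented.

```r
# Hypothetical helper: p_k for continuous types 4-9, given weights in sorted-x order.
weighted_p_k <- function(weights, type = 7) {
  f <- weights / sum(weights)  # f(x_k): normalized weight at x_k
  cdf <- cumsum(f)             # F(x_k): weighted ECDF at x_k
  switch(as.character(type),
    "4" = cdf,
    "5" = cdf - f / 2,
    "6" = cdf / (1 + f),
    "7" = (cdf - f) / (1 - f),
    "8" = (cdf - f / 3) / (1 + f / 3),
    "9" = (cdf - f * 3 / 8) / (1 + f / 4),
    stop("type must be between 4 and 9")
  )
}

# Hypothetical sketch: piece-wise linear interpolation through the points (p_k, x_k).
sketch_weighted_quantile <- function(x, probs, weights = rep(1, length(x)), type = 7) {
  ord <- order(x)
  p_k <- weighted_p_k(weights[ord], type)
  # rule = 2 clamps probabilities outside range(p_k) to the nearest end value of x
  approx(p_k, x[ord], xout = probs, rule = 2)$y
}

x <- c(2.5, 1, 7, 4, 10)
probs <- c(0.1, 0.25, 0.5, 0.9)

# With equal weights, each type should reduce to the corresponding quantile() type
for (type in 4:9) {
  print(all.equal(
    sketch_weighted_quantile(x, probs, type = type),
    unname(quantile(x, probs, type = type))
  ))
}
```

With equal weights, f(x_k) = 1/n and F(x_k) = k/n, so each branch of the switch reduces to the unweighted p_k on the left-hand side of the corresponding formula.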
Value
weighted_quantile() returns a numeric vector of length(probs) with the estimate of the corresponding quantile from probs.
weighted_quantile_fun() returns a function that takes a single argument, a vector of probabilities, which itself returns the corresponding quantile estimates. It may be useful when weighted_quantile() needs to be called repeatedly for the same sample, re-using some pre-computation.
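A short usage sketch of the two functions, based only on the signatures shown under Usage. The sample values and weights are made up for illustration, and the ggdist package is assumed to be installed.

```r
library(ggdist)

# illustrative data: the largest value carries half of the total weight
x <- c(1, 2, 3, 4, 10)
w <- c(1, 1, 1, 1, 4)
probs <- c(0.25, 0.5, 0.75)

# one-off call
weighted_quantile(x, probs = probs, weights = w)

# build the quantile function once, then evaluate it as often as needed
q <- weighted_quantile_fun(x, weights = w)
q(probs)
q(0.9)
```

Both calls use the default type = 7, so q(probs) should give the same estimates as the one-off call.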