R: Compute frequencies from (3 essential) probabilities.

comp_freq {riskyr}

R Documentation

Compute frequencies from (3 essential) probabilities.

Description

comp_freq computes frequencies (typically as rounded integers) given 3 basic probabilities – prev, sens, and spec – for a population of N individuals. It returns a list of 11 key frequencies freq as its output.

Usage

comp_freq(
  prev = num$prev,
  sens = num$sens,
  spec = num$spec,
  N = num$N,
  round = TRUE,
  sample = FALSE
)

Arguments

`prev`	The condition's prevalence `prev` (i.e., the probability of the condition being `TRUE`).
`sens`	The decision's sensitivity `sens` (i.e., the conditional probability of a positive decision provided that the condition is `TRUE`).
`spec`	The decision's specificity value `spec` (i.e., the conditional probability of a negative decision provided that the condition is `FALSE`).
`N`	The number of individuals in the population. If `N` is unknown (`NA`), a suitable minimum value is computed by `comp_min_N`.
`round`	Boolean value that determines whether frequency values are rounded to the nearest integer. Default: `round = TRUE`. Note: The former `n_digits` parameter (the number of digits to which frequency values were rounded when `round = FALSE`, with default `n_digits = 5`) has been removed.
`sample`	Boolean value that determines whether frequency values are sampled from `N`, given the probability values of `prev`, `sens`, and `spec`. Default: `sample = FALSE`. Note: Sampling uses `sample()` and returns integer values.

Details

In addition to prev, both sens and spec are necessary arguments. If only their complements mirt or fart are known, use the wrapper function comp_freq_prob which also accepts mirt and fart as inputs (but requires that the entire set of provided probabilities is sufficient and consistent). Alternatively, use comp_complement, comp_comp_pair, or comp_complete_prob_set to obtain the 3 essential probabilities.

comp_freq is the frequency counterpart to the probability function comp_prob.

By default, comp_freq and its wrapper function comp_freq_prob round frequencies to the nearest integer to avoid decimal values in freq (i.e., round = TRUE by default). When frequencies are rounded, probabilities computed from freq may differ from the exact probabilities. Setting round = FALSE turns off rounding.
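
The effect of rounding can be illustrated with a minimal base-R sketch. The helper sketch_freq() is hypothetical: it is not part of riskyr and is not claimed to be its implementation. It assumes that cond_true is rounded first and that the remaining frequencies are derived from it, so that the 4 cell frequencies always sum to N.

```r
# Hypothetical helper (NOT riskyr code): derive the 4 cell frequencies
# from prev, sens, spec, and N, with optional rounding.
# Assumption: rounding is applied step by step (cond_true, then hi and cr),
# and complements are obtained by subtraction.
sketch_freq <- function(prev, sens, spec, N, round = TRUE) {
  cond_true <- N * prev
  if (round) cond_true <- round(cond_true)
  cond_false <- N - cond_true

  hi <- sens * cond_true
  if (round) hi <- round(hi)
  mi <- cond_true - hi

  cr <- spec * cond_false
  if (round) cr <- round(cr)
  fa <- cond_false - cr

  list(hi = hi, mi = mi, fa = fa, cr = cr)
}

f_round <- sketch_freq(prev = .1, sens = .9, spec = .8, N = 10)
f_exact <- sketch_freq(prev = .1, sens = .9, spec = .8, N = 10, round = FALSE)

# Sensitivity recomputed from frequencies: sens = hi / (hi + mi)
sens_round <- f_round$hi / (f_round$hi + f_round$mi)  # no longer equal to .9
sens_exact <- f_exact$hi / (f_exact$hi + f_exact$mi)  # recovers .9
```

Under these assumptions, sketch_freq(prev = .5, sens = .5, spec = .5, N = 1) yields fa = 1, because round(0.5) is 0 in R (see ?round).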

Key relationships between probabilities and frequencies:

Three perspectives on a population:

A population of N individuals can be split into 2 subsets of frequencies in 3 different ways:
1. by condition:
  
  N = cond_true + cond_false
  
  The frequency cond_true depends on the prevalence prev and the frequency cond_false depends on the prevalence's complement 1 - prev.
2. by decision:
  
  N = dec_pos + dec_neg
  
  The frequency dec_pos depends on the proportion of positive decisions ppod and the frequency dec_neg depends on the proportion of negative decisions 1 - ppod.
3. by accuracy (i.e., correspondence of decision to condition):
  
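  N = dec_cor + dec_err
  
  The frequency dec_cor depends on the accuracy acc and the frequency dec_err depends on the accuracy's complement 1 - acc.

Each perspective combines 2 pairs of the 4 essential frequencies (hi, mi, fa, cr).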
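
The three perspectives can be checked with a short base-R sketch. The cell frequencies are made-up illustrative values (not riskyr output). The subset definitions follow the frequency ratios listed in this section.

```r
# Illustrative cell frequencies (made-up values, not riskyr output):
hi <- 9; mi <- 1; fa <- 18; cr <- 72
N  <- hi + mi + fa + cr

# 1. by condition:
cond_true  <- hi + mi
cond_false <- fa + cr

# 2. by decision:
dec_pos <- hi + fa
dec_neg <- mi + cr

# 3. by accuracy:
dec_cor <- hi + cr
dec_err <- mi + fa

# Each perspective splits the same population of N individuals:
stopifnot(
  N == cond_true + cond_false,
  N == dec_pos + dec_neg,
  N == dec_cor + dec_err
)
```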

When providing probabilities, the population size N is a free parameter (independent of the essential probabilities prev, sens, and spec).

If N is unknown (NA), a suitable minimum value can be computed by comp_min_N.

Defining probabilities in terms of frequencies:

Probabilities determine, describe, or are defined as relationships between frequencies. Thus, they can be computed as ratios between frequencies:
1. prevalence prev:
  
  prev = cond_true/N = (hi + mi) / (hi + mi + fa + cr)
2. sensitivity sens:
  
  sens = hi/cond_true = hi / (hi + mi) = (1 - mirt)
3. miss rate mirt:
  
  mirt = mi/cond_true = mi / (hi + mi) = (1 - sens)
4. specificity spec:
  
  spec = cr/cond_false = cr / (fa + cr) = (1 - fart)
5. false alarm rate fart:
  
  fart = fa/cond_false = fa / (fa + cr) = (1 - spec)
6. proportion of positive decisions ppod:
  
  ppod = dec_pos/N = (hi + fa) / (hi + mi + fa + cr)
7. positive predictive value PPV:
  
  PPV = hi/dec_pos = hi / (hi + fa) = (1 - FDR)
8. negative predictive value NPV:
  
  NPV = cr/dec_neg = cr / (mi + cr) = (1 - FOR)
9. false detection rate FDR:
  
  FDR = fa/dec_pos = fa / (hi + fa) = (1 - PPV)
10. false omission rate FOR:
  
  FOR = mi/dec_neg = mi / (mi + cr) = (1 - NPV)
11. accuracy acc:
  
  acc = dec_cor/N = (hi + cr) / (hi + mi + fa + cr)
12. rate of hits, given accuracy p_acc_hi:
  
  p_acc_hi = hi/dec_cor = (1 - cr/dec_cor)
13. rate of false alarms, given inaccuracy p_err_fa:
  
  p_err_fa = fa/dec_err = (1 - mi/dec_err)
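
The ratios listed above can be sketched in base R as follows. The cell frequencies are made-up illustrative values, and the vector name prob is a hypothetical choice for this example rather than a riskyr object.

```r
# Illustrative cell frequencies (made-up values, not riskyr output):
hi <- 9; mi <- 1; fa <- 18; cr <- 72

N          <- hi + mi + fa + cr
cond_true  <- hi + mi;  cond_false <- fa + cr
dec_pos    <- hi + fa;  dec_neg    <- mi + cr
dec_cor    <- hi + cr;  dec_err    <- mi + fa

# The 13 probabilities, each as a ratio of two frequencies:
prob <- c(
  prev = cond_true / N,
  sens = hi / cond_true,   mirt = mi / cond_true,
  spec = cr / cond_false,  fart = fa / cond_false,
  ppod = dec_pos / N,
  PPV  = hi / dec_pos,     FDR  = fa / dec_pos,
  NPV  = cr / dec_neg,     FOR  = mi / dec_neg,
  acc  = dec_cor / N,
  p_acc_hi = hi / dec_cor,
  p_err_fa = fa / dec_err
)

# Complementary pairs sum to 1 (up to floating-point tolerance):
stopifnot(
  isTRUE(all.equal(prob[["sens"]] + prob[["mirt"]], 1)),
  isTRUE(all.equal(prob[["spec"]] + prob[["fart"]], 1)),
  isTRUE(all.equal(prob[["PPV"]]  + prob[["FDR"]],  1)),
  isTRUE(all.equal(prob[["NPV"]]  + prob[["FOR"]],  1))
)
```
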
Beware of rounding and sampling issues! If frequencies are rounded (by round = TRUE in comp_freq) or sampled from probabilities (by sample = TRUE), then any probabilities computed from freq may differ from the original, exact probabilities.
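
Why sampled frequencies need not reproduce the original probabilities can be seen in a minimal base-R sketch. This is an assumption-laden illustration, not riskyr's sampling procedure: the documentation only states that sampling uses sample() and returns integer values. The two-stage scheme (condition first, then decisions within each subset) and the seed are choices made for this example.

```r
set.seed(1)  # arbitrary seed, for reproducibility of this sketch only
N <- 100; prev <- .5; sens <- .5; spec <- .5

# Stage 1 (assumed scheme): sample the condition for each of N individuals.
cond <- sample(c(TRUE, FALSE), size = N, replace = TRUE,
               prob = c(prev, 1 - prev))
n_true  <- sum(cond)
n_false <- N - n_true

# Stage 2: sample decisions within each condition subset.
hi <- sum(sample(c(TRUE, FALSE), size = n_true, replace = TRUE,
                 prob = c(sens, 1 - sens)))
mi <- n_true - hi
cr <- sum(sample(c(TRUE, FALSE), size = n_false, replace = TRUE,
                 prob = c(spec, 1 - spec)))
fa <- n_false - cr

# Prevalence recomputed from the sampled frequencies; it is typically
# close to, but need not equal, the original prev:
prev_sampled <- (hi + mi) / N
```

The sampled frequencies are integers that sum to N, but ratios such as prev_sampled vary from sample to sample.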

Functions translating between representational formats: comp_prob_prob, comp_prob_freq, comp_freq_prob, comp_freq_freq (see documentation of comp_prob_prob for details).

Value

A list freq containing 11 key frequency values.

Examples

comp_freq()          # ok, using current defaults
length(comp_freq())  # 11 key frequencies

# Rounding:
comp_freq(prev = .5, sens = .5, spec = .5, N = 1)   # yields fa = 1 (see ?round for reason)
comp_freq(prev = .1, sens = .9, spec = .8, N = 10)  # 1 hit (TP, rounded)
comp_freq(prev = .1, sens = .9, spec = .8, N = 10, round = FALSE)    # hi = .9
comp_freq(prev = 1/3, sens = 6/7, spec = 2/3, N = 1, round = FALSE)  # hi = 0.2857143

# Sampling (from probabilistic description):
comp_freq_prob(prev = .5, sens = .5, spec = .5, N = 100, sample = TRUE)  # freq values vary

# Extreme cases:
comp_freq(prev = 1, sens = 1, spec = 1, 100)  # ok, N hits (TP)
comp_freq(prev = 1, sens = 1, spec = 0, 100)  # ok, N hits
comp_freq(prev = 1, sens = 0, spec = 1, 100)  # ok, N misses (FN)
comp_freq(prev = 1, sens = 0, spec = 0, 100)  # ok, N misses
comp_freq(prev = 0, sens = 1, spec = 1, 100)  # ok, N correct rejections (TN)
comp_freq(prev = 0, sens = 1, spec = 0, 100)  # ok, N false alarms (FP)

# Watch out for:
comp_freq(prev = 1, sens = 1, spec = 1, N = NA)  # ok, but warning that N = 1 was computed
comp_freq(prev = 1, sens = 1, spec = 1, N =  0)  # ok, but all 0 + warning (extreme case: N hits)
comp_freq(prev = .5, sens = .5, spec = .5, N = 10, round = TRUE)   # ok, rounded (see mi and fa)
comp_freq(prev = .5, sens = .5, spec = .5, N = 10, round = FALSE)  # ok, not rounded

# Ways to fail:
comp_freq(prev = NA,  sens = 1, spec = 1,  100)   # NAs + warning (prev NA)
comp_freq(prev = 1,  sens = NA, spec = 1,  100)   # NAs + warning (sens NA)
comp_freq(prev = 1,  sens = 1,  spec = NA, 100)   # NAs + warning (spec NA)
comp_freq(prev = 8,  sens = 1,  spec = 1,  100)   # NAs + warning (prev beyond range)
comp_freq(prev = 1,  sens = 8,  spec = 1,  100)   # NAs + warning (sens beyond range)

[Package riskyr version 0.4.0 Index]