R: Calculate disproportionate impact per the percentage point...

di_ppg {DisImpact}

R Documentation

Calculate disproportionate impact per the percentage point gap (PPG) method.

Description

Calculate disproportionate impact per the percentage point gap (PPG) method.

Usage

di_ppg(
  success,
  group,
  cohort,
  weight,
  reference = c("overall", "hpg", "all but current", unique(group)),
  data,
  min_moe = 0.03,
  use_prop_in_moe = FALSE,
  prop_sub_0 = 0.5,
  prop_sub_1 = 0.5,
  check_valid_reference = TRUE
)

Arguments

`success`	A vector of success indicators (`1`/`0` or `TRUE`/`FALSE`) or an unquoted reference (name) to a column in `data` if it is specified. It could also be a vector of counts, in which case `weight` (group size) should also be specified.
`group`	A vector of group names of the same length as `success` or an unquoted reference (name) to a column in `data` if it is specified.
`cohort`	(Optional) A vector of cohort names of the same length as `success` or an unquoted reference (name) to a column in `data` if it is specified. Disproportionate impact is calculated for every group within each cohort. When `cohort` is not specified, then the analysis assumes a single cohort.
`weight`	(Optional) A vector of case weights of the same length as `success` or an unquoted reference (name) to a column in `data` if it is specified. If `success` consists of counts instead of success indicators (1/0), then `weight` should also be specified to indicate the group size.
`reference`	Either `'overall'` (default), `'hpg'` (highest performing group), `'all but current'` (success rate of everyone excluding the comparison group; also known as 'ppg minus 1'), a value from `group` (specifying a reference group), a single proportion (eg, 0.50), or a vector of proportions (one for each cohort). The reference is the point of comparison used to determine disproportionate impact for each group. When `cohort` is specified: `'overall'` will use the overall success rate of each cohort as the reference; `'hpg'` will use the highest performing group in each cohort as the reference; `'all but current'` will use the success rate of each cohort calculated excluding the comparison group; a specified reference group from `group` will use that group's success rate in each cohort; a single specified proportion will be used for all cohorts; and a specified vector of proportions will supply the reference point for each cohort in alphabetical order (so the number of proportions should equal the number of unique cohorts).
`data`	(Optional) A data frame containing the variables of interest. If `data` is specified, then `success`, `group`, and `cohort` will be searched within it.
`min_moe`	The minimum margin of error (MOE) to be used in the calculation of disproportionate impact; it is passed to ppg_moe. Defaults to `0.03`.
`use_prop_in_moe`	A logical value indicating whether or not the MOE formula should use the observed success rates (`TRUE`). Defaults to `FALSE`, which uses 0.50 as the proportion in the MOE formula. If `TRUE`, the success rates are passed to the `proportion` argument of ppg_moe.
`prop_sub_0`	For cases where `proportion` is 0, substitute with `prop_sub_0` (defaults to 0.5) to account for the zero MOE. This is relevant only when `use_prop_in_moe=TRUE`.
`prop_sub_1`	For cases where `proportion` is 1, substitute with `prop_sub_1` (defaults to 0.5) to account for the zero MOE. This is relevant only when `use_prop_in_moe=TRUE`.
`check_valid_reference`	Check whether `reference` is a valid value; defaults to `TRUE`. This argument exists for use in di_iterate: when iterating DI calculations, there may be scenarios where a specified reference group does not contain any students.
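
To make the `reference` options concrete, the following base R sketch computes the reference value each option describes for a single cohort. The data frame `toy` and every variable name are hypothetical, and this illustrates the definitions given above rather than the package's internal code.

```r
# Hypothetical toy data: one row per student, a single cohort.
toy <- data.frame(
  group   = c("A", "A", "A", "A", "B", "B", "B", "B", "C", "C"),
  success = c(1, 1, 1, 0, 1, 0, 0, 0, 1, 1)
)

# Per-group sample size, number of successes, and success rate.
by_group   <- split(toy$success, toy$group)
n_by_group <- sapply(by_group, length)
s_by_group <- sapply(by_group, sum)
pct        <- s_by_group / n_by_group

# 'overall': success rate across everyone in the cohort.
ref_overall <- sum(toy$success) / nrow(toy)

# 'hpg': success rate of the highest performing group.
ref_hpg <- max(pct)

# 'all but current': for each group, the success rate of everyone
# outside that group (also known as 'ppg minus 1').
ref_all_but_current <-
  (sum(s_by_group) - s_by_group) / (sum(n_by_group) - n_by_group)

# A value from `group` as the reference, here group "A".
ref_group_a <- pct[["A"]]
```

With `'all but current'`, each group gets its own reference value, whereas the other options yield one reference value per cohort.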

Details

This function determines disproportionate impact based on the percentage point gap (PPG) method, as described in the Percentage Point Gap Method reference from the California Community Colleges Chancellor's Office (see References). It assumes that a higher rate is good ("success"). For rates where high is bad (eg, drop-out rate), analyze the converse instead (eg, rate of non drop-outs, where high is good) so that the function applies properly. Note that the margin of error (MOE) is calculated using 1.96*sqrt(0.25/n), where 0.25 = 0.50*(1 - 0.50) is the variance term when the proportion is 0.50, with min_moe used as the minimum by default.
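
As a rough illustration of how the MOE and the `min_moe` floor interact, here is a minimal base R sketch. It assumes the standard binomial form of the MOE with the proportion fixed at 0.50, as the `use_prop_in_moe` argument describes. The helper `moe_sketch` and the input values are hypothetical and are not part of DisImpact.

```r
# Hypothetical helper, not a DisImpact function: binomial margin of error
# with a floor, assuming a 95% confidence level (z = 1.96).
moe_sketch <- function(n, proportion = 0.5, min_moe = 0.03) {
  moe <- 1.96 * sqrt(proportion * (1 - proportion) / n)
  pmax(moe, min_moe)  # never use an MOE below the floor
}

# Two hypothetical groups compared against a reference rate of 0.55.
n         <- c(100, 5000)
pct       <- c(0.40, 0.54)
reference <- 0.55

moe    <- moe_sketch(n)   # small group: formula value; large group: floor
pct_hi <- pct + moe       # upper confidence limit for each group

# Disproportionate impact is flagged when pct_hi <= reference.
di_indicator <- as.integer(pct_hi <= reference)
```

In this sketch the large group's formula MOE (about 0.014) falls below `min_moe`, so 0.03 is used instead. That group is therefore not flagged even though its rate is below the reference.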

Value

A data frame consisting of:

cohort (if used),
group,
n (sample size),
success (number of successes for the cohort-group),
pct (proportion of successes for the cohort-group),
reference_group (reference group used in DI calculation),
reference (reference value used in DI calculation),
moe (margin of error),
pct_lo (lower 95% confidence limit for pct),
pct_hi (upper 95% confidence limit for pct),
di_indicator (1 if there is disproportionate impact, ie, when pct_hi <= reference),
success_needed_not_di (the number of additional successes needed in order to no longer be considered disproportionately impacted as compared to the reference), and
success_needed_full_parity (the number of additional successes needed in order to achieve full parity with the reference).
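
The two `success_needed_*` columns can be read as follows. This base R sketch works from the definitions above for a single hypothetical group, holding `n` and `moe` fixed. Rounding up to whole successes with `ceiling` is an assumption here; the package's exact rounding, particularly when the result lands exactly on the reference, is not stated in this documentation.

```r
# Hypothetical values for one group.
n         <- 60
successes <- 15
moe       <- 0.125   # supplied directly for illustration
reference <- 0.5

pct    <- successes / n   # observed success rate
pct_hi <- pct + moe       # upper confidence limit
di_indicator <- as.integer(pct_hi <= reference)

# Assumption: additional successes are rounded up to whole students.
# Close the gap between pct_hi and the reference to stop being flagged.
success_needed_not_di <-
  if (di_indicator == 1) ceiling((reference - pct_hi) * n) else 0

# Close the gap between pct itself and the reference to reach parity.
success_needed_full_parity <-
  if (pct < reference) ceiling((reference - pct) * n) else 0
```

Reaching full parity always takes at least as many additional successes as no longer being flagged, because the MOE gives the group a buffer in the second case.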

References

California Community Colleges Chancellor's Office (2017). Percentage Point Gap Method.

Examples

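library(DisImpact)  # provides di_ppg and the student_equity data set
library(dplyr)      # provides the %>% pipe used below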
data(student_equity)
# Vector
di_ppg(success=student_equity$Transfer
  , group=student_equity$Ethnicity) %>% as.data.frame
# Tidy and column reference
di_ppg(success=Transfer, group=Ethnicity, data=student_equity) %>%
  as.data.frame
# Cohort
di_ppg(success=Transfer, group=Ethnicity, cohort=Cohort
 , data=student_equity) %>%
  as.data.frame
# With custom reference (single)
di_ppg(success=Transfer, group=Ethnicity, reference=0.54
  , data=student_equity) %>%
  as.data.frame
# With custom reference (multiple)
di_ppg(success=Transfer, group=Ethnicity, cohort=Cohort
  , reference=c(0.5, 0.55), data=student_equity) %>%
  as.data.frame
# min_moe
di_ppg(success=Transfer, group=Ethnicity, data=student_equity
  , min_moe=0.02) %>%
  as.data.frame
# use_prop_in_moe
di_ppg(success=Transfer, group=Ethnicity, data=student_equity
  , min_moe=0.02
  , use_prop_in_moe=TRUE) %>%
  as.data.frame

[Package DisImpact version 0.0.21 Index]