di_iterate_sql {DisImpact}  R Documentation 
Iteratively calculate disproportionate impact using multiple methods for many variables, using SQL.
Description
Iteratively calculate disproportionate impact via the percentage point gap (PPG), proportionality index, and 80% index methods for many success variables, disaggregation variables, and scenarios, using SQL (for data stored in a database or in a parquet data file).
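As a rough illustration of the three methods named above, the following base R sketch computes each metric by hand for a small set of hypothetical counts. The data, the column names, the margin-of-error formula, and the strict `<` comparison against the cutoffs are assumptions made for illustration. They follow the argument defaults shown under Usage (overall rate as the PPG reference, highest performing group as the 80% index reference, 0.5 as the proportion in the margin of error) and are not a reproduction of the package's internal code.

```r
# Hypothetical summarized counts for three groups (illustration only)
d <- data.frame(
  group   = c("A", "B", "C"),
  n       = c(100, 50, 50),
  success = c(60, 20, 35),
  stringsAsFactors = FALSE
)

d$rate  <- d$success / d$n
overall <- sum(d$success) / sum(d$n)

# Percentage point gap (PPG): group rate minus the overall rate
d$ppg <- d$rate - overall
# Assumed margin of error: 1.96 * sqrt(p * (1 - p) / n) with p = 0.5,
# floored at min_moe = 0.03
d$moe <- pmax(1.96 * sqrt(0.5 * (1 - 0.5) / d$n), 0.03)
d$di_ppg <- d$rate <= overall - d$moe

# Proportionality index: share of successes divided by share of the cohort
d$prop_index    <- (d$success / sum(d$success)) / (d$n / sum(d$n))
d$di_prop_index <- d$prop_index < 0.8

# 80% index: group rate divided by the rate of the highest performing group
d$index_80 <- d$rate / max(d$rate)
d$di_80    <- d$index_80 < 0.8

print(d)
```

In this toy data only group B falls below each threshold. di_iterate_sql repeats calculations of this kind for every requested combination of variables.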
Usage
di_iterate_sql(
db_conn,
db_table_name,
success_vars,
group_vars,
cohort_vars = NULL,
scenario_repeat_by_vars = NULL,
exclude_scenario_df = NULL,
weight_var = NULL,
include_non_disagg_results = TRUE,
ppg_reference_groups = "overall",
min_moe = 0.03,
use_prop_in_moe = FALSE,
prop_sub_0 = 0.5,
prop_sub_1 = 0.5,
di_prop_index_cutoff = 0.8,
di_80_index_cutoff = 0.8,
di_80_index_reference_groups = "hpg",
check_valid_reference = TRUE,
parallel = FALSE,
parallel_n_cores = parallel::detectCores()/2,
mssql_flag = FALSE,
return_what = "data",
staging_table = paste0("DisImpact_Staging_", paste0(sample(1:9, size = 5, replace =
TRUE), collapse = "")),
drop_staging_table = TRUE
)
Arguments
db_conn 
A database connection object, returned by dbConnect. 
db_table_name 
A character value specifying a database table name. 
success_vars 
A character vector of success variable names to iterate across. 
group_vars 
A character vector of group (disaggregation) variable names to iterate across. 
cohort_vars 
(Optional) A character vector of the same length as 
scenario_repeat_by_vars 
(Optional) A character vector of variables to repeat DI calculations for across all combinations of these variables. For example, the following variables could be specified:
Each combination of these variables (e.g., full time, first time college students with an ed goal of degree/transfer as one combination) would constitute an iteration / sample for which to calculate disproportionate impact for outcomes listed in 
exclude_scenario_df 
(Optional) A data frame with variables that match 
weight_var 
(Optional) A character variable specifying the weight variable if the input data set is summarized (i.e., the success variables specified in 
include_non_disagg_results 
A logical variable specifying whether or not the non-disaggregated results should be returned; defaults to TRUE. 
ppg_reference_groups 
Either 
min_moe 
The minimum margin of error to be used in the PPG calculation; see di_ppg. 
use_prop_in_moe 
( 
prop_sub_0 
Default is 0.50; see di_ppg. 
prop_sub_1 
Default is 0.50; see di_ppg. 
di_prop_index_cutoff 
Threshold used for determining disproportionate impact using the proportionality index; see di_prop_index; defaults to 0.80. 
di_80_index_cutoff 
Threshold used for determining disproportionate impact using the 80% index; see di_80_index; defaults to 0.80. 
di_80_index_reference_groups 
Either 
check_valid_reference 
( 
parallel 
If 
parallel_n_cores 
The number of CPU cores to use if parallel is TRUE; defaults to parallel::detectCores()/2. 
mssql_flag 
User-specified logical flag ( 
return_what 
A character value specifying the return value for the function call. For 
staging_table 
A character value indicating the name of the staging or results table in the database for storing the disproportionate impact calculations. 
drop_staging_table 

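The default staging_table value under Usage appends five random digits, each between 1 and 9, to a fixed prefix. The sketch below evaluates that same expression in base R to show what a generated name looks like. The helper name make_staging_name is hypothetical and is not part of DisImpact.

```r
# Hypothetical helper wrapping the default expression shown under Usage
make_staging_name <- function() {
  paste0(
    "DisImpact_Staging_",
    paste0(sample(1:9, size = 5, replace = TRUE), collapse = "")
  )
}

staging_name <- make_staging_name()
print(staging_name)  # the prefix followed by five digits drawn from 1 to 9
```

Passing an explicit staging_table value instead of relying on the random default gives a predictable table name, which may be easier to locate in the database.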
Details
Iteratively calculate disproportionate impact via the percentage point gap (PPG), proportionality index, and 80% index methods for all combinations of success_vars, group_vars, and cohort_vars, for each combination of subgroups specified by scenario_repeat_by_vars, using SQL (calculations are done on the database engine, or in duckdb for parquet files).
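A minimal sketch of a call, assuming the DisImpact, DBI, and duckdb packages are installed. The toy data frame, its column names, and the table name "toy" are hypothetical. The call is guarded so that nothing runs when a package is missing, and the result is only inspected with str() rather than compared to an expected output.

```r
# Hypothetical student-level data: one row per student, 0/1 success flag
set.seed(123)
toy <- data.frame(
  cohort    = rep(c(2019, 2020), each = 100),
  ethnicity = sample(c("A", "B", "C"), 200, replace = TRUE),
  gender    = sample(c("F", "M"), 200, replace = TRUE),
  transfer  = rbinom(200, size = 1, prob = 0.5),
  stringsAsFactors = FALSE
)

# Run only when all three packages are available
have_pkgs <- all(vapply(
  c("DisImpact", "DBI", "duckdb"),
  requireNamespace, logical(1), quietly = TRUE
))

if (have_pkgs) {
  con <- DBI::dbConnect(duckdb::duckdb())  # in-memory duckdb database
  DBI::dbWriteTable(con, "toy", toy)

  res <- DisImpact::di_iterate_sql(
    db_conn       = con,
    db_table_name = "toy",
    success_vars  = "transfer",
    group_vars    = c("ethnicity", "gender"),
    cohort_vars   = "cohort"
  )
  str(res)

  DBI::dbDisconnect(con, shutdown = TRUE)
}
```

With the default return_what = "data", res is expected to be a long data frame covering each success variable and group variable pair.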
Value
When return_what='data' (default), a long data frame is returned (see the return value for di_iterate). When return_what='SQL', a list object is returned in which each element is a query (character value).