R: Calculates several imbalance measures

imbalance {cem}

R Documentation

Calculates several imbalance measures

Description

Calculates several imbalance measures for the original and matched data sets.

Usage

imbalance(group, data, drop=NULL, breaks = NULL, weights, grouping = NULL)

Arguments

`group`	the group variable
`data`	the data
`drop`	a vector of variable names in the data frame to ignore
`breaks`	a list of vectors of cutpoints used to calculate the L1 measure. See Details.
`weights`	observation weights
`grouping`	named list, each element of which is a list of groupings for a single categorical variable. See Details.

Details

This function calculates several imbalance measures. For numeric variables, the difference in means (reported in the column statistic), the difference in quantiles, and the L1 measure are calculated. For categorical variables, the L1 measure and the Chi-squared distance (reported in the column statistic) are calculated. The column type reports either (diff) or (Chi2) to indicate which statistic was calculated.
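As a rough illustration of two of the statistics named above, the following base R sketch computes a difference in means and a univariate L1 measure, taken here as half the sum of absolute differences between the binned relative frequencies of the two groups. The helper `l1_univariate`, the toy data, the 0/1 coding of the group variable, and the sign convention of the difference (group 1 minus group 0) are all assumptions made for this example; this is not the code used inside cem.

```r
# Hypothetical helper, not part of cem: univariate L1 distance between the
# binned distributions of x in group 1 and group 0.
l1_univariate <- function(x, group, breaks) {
  # cut() returns a factor, so both tables share the same set of bins
  bins <- cut(x, breaks = breaks, include.lowest = TRUE)
  f <- prop.table(table(bins[group == 1]))  # relative frequencies, group 1
  g <- prop.table(table(bins[group == 0]))  # relative frequencies, group 0
  0.5 * sum(abs(as.numeric(f) - as.numeric(g)))
}

# Toy data (illustrative only)
x     <- c(1, 2, 3, 4, 5, 6)
group <- c(1, 1, 1, 0, 0, 0)

# Difference in means, assumed here to be group 1 minus group 0
diff_means <- mean(x[group == 1]) - mean(x[group == 0])

# The L1 value depends on the cutpoints that define the bins
l1_two_bins <- l1_univariate(x, group, breaks = c(0, 3, 6))
l1_one_bin  <- l1_univariate(x, group, breaks = c(0, 6))
```

With two bins the groups fall into different cells, while a single bin cannot distinguish them, which is why the choice of breaks matters for the L1 measure.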

If breaks is not specified, Scott's automated bin calculation is used (which coarsens less than Sturges' rule, which is used in cem). Please refer to the cem help page. The resulting breaks are used to calculate the L1 measure.

This function also calculates the global L1 imbalance measure. If breaks is missing, the default rule used to calculate cutpoints is Scott's rule.
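To get a feel for how the two binning rules mentioned above differ, the sketch below compares the number of bins suggested by base R's `nclass.Sturges` and `nclass.scott` (both in grDevices) on simulated data. The data are made up for this example, and whether imbalance calls these exact functions internally is an assumption not confirmed here.

```r
# Simulated data (illustrative only, not from the cem package)
set.seed(1)
x <- rnorm(500)

# Sturges' rule: number of bins grows with log2 of the sample size
n_sturges <- grDevices::nclass.Sturges(x)

# Scott's rule: bin width depends on the standard deviation and sample size
n_scott <- grDevices::nclass.scott(x)

c(Sturges = n_sturges, Scott = n_scott)
```

A rule that yields more bins coarsens the data less, so comparing the two counts on your own variables shows how much finer the default L1 binning is likely to be than the one used for matching.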

The grouping option is a list in which each element is itself a list. For example, suppose that variable quest1 has the possible levels "no answer", NA, "negative", "neutral", "positive" and you want to collect ("no answer", NA, "neutral") into a single group; then the grouping argument should contain list(quest1=list(c("no answer", NA, "neutral"))). Or, if you have a discrete variable elements with values 1:10 and you want to collect it into the groups "1:3,NA", "4", "5:9", "10", specify the following list in grouping: list(elements=list(c(1:3,NA), 5:9)). Values not defined in the grouping are left as they are (here, 4 and 10). If cutpoints and groupings are defined for the same variable, the groupings take precedence and the corresponding cutpoints are set to NULL.
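The effect of a grouping list can be pictured with a small base R sketch. The list itself is the one from the text above; the helper `collapse_groups` and the labels it produces are hypothetical and only mimic the idea of merging values into groups, they are not how cem implements it.

```r
# Grouping list as described in the text above
grouping <- list(quest1 = list(c("no answer", NA, "neutral")))

# Hypothetical helper (not part of cem): give every value that belongs to
# the i-th group the shared label "group<i>"; leave other values unchanged.
collapse_groups <- function(x, groups) {
  x_chr <- as.character(x)
  out <- x_chr
  for (i in seq_along(groups)) {
    members <- as.character(groups[[i]])
    # %in% matches NA against NA, so missing values can be grouped too
    out[x_chr %in% members] <- paste0("group", i)
  }
  out
}

quest1 <- c("no answer", NA, "negative", "neutral", "positive")
collapsed <- collapse_groups(quest1, grouping$quest1)

# Second example from the text: values 1:10, groups c(1:3, NA) and 5:9
elements <- collapse_groups(1:10, list(c(1:3, NA), 5:9))
```

In the second call, 4 and 10 are not named in any group and therefore stay as they are, matching the rule that undefined values are left untouched.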

See L1.meas help page for details.

Value

An object of class imbalance, which is a list with the following two elements:

`tab`	Table of imbalance measures
`L1`	The global L1 measure of imbalance

Author(s)

Stefano Iacus, Gary King, and Giuseppe Porro

References

Iacus, King, Porro (2011) doi:10.1198/jasa.2011.tm09599

Iacus, King, Porro (2012) doi:10.1093/pan/mpr013

Iacus, King, Porro (2019) doi:10.1017/pan.2018.29

Examples

 


data(LL)

# variables to ignore when computing imbalance
todrop <- c("treated","re78")

# imbalance in the original (unmatched) data
imbalance(LL$treated, LL, drop=todrop)

# cem match: automatic bin choice
mat <- cem(treatment="treated", data=LL, drop="re78")

[Package cem version 1.1.31 Index]