L1.meas {cem} | R Documentation |
Evaluates L1 distance between multidimensional histograms
Description
Evaluates L1 distance between multidimensional histograms
Usage
L1.meas(group, data, drop=NULL, breaks = NULL, weights, grouping = NULL)
Arguments
group |
the group variable |
data |
the data |
drop |
a vector of variable names in the data frame to ignore |
breaks |
a list of vectors of cutpoints; if not specified, automatic choice will be made |
weights |
weights |
grouping |
named list, each element of which is a list of groupings for a single categorical variable. See Details. |
Details
This function calculates the L1 distance on the k-dimensional histogram in order to measure the level of imbalance in a matching solution.
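As a rough illustration of the quantity described above, the following base-R sketch coarsens each variable, cross-tabulates the resulting cells within each group, and takes half the sum of absolute differences between the two sets of relative frequencies. The helper name l1_sketch and the toy data are hypothetical; this is not the code used inside cem, only a minimal sketch assuming two groups and numeric covariates with user-supplied cutpoints.

```r
# Minimal base-R sketch of an L1 imbalance measure (NOT the cem implementation).
# Toy data: a binary group indicator and two continuous covariates.
set.seed(1)
n  <- 200
g  <- rep(c(0, 1), each = n / 2)
x1 <- rnorm(n, mean = 0.5 * g)
x2 <- runif(n)

# Hypothetical helper: bin every variable with its cutpoints, build the
# k-dimensional cell for each observation, and compare the relative
# frequencies of the cells between the two groups.
l1_sketch <- function(group, vars, breaks) {
  binned <- Map(function(v, b) cut(v, breaks = b, include.lowest = TRUE),
                vars, breaks)
  cell <- interaction(binned, drop = FALSE)      # one level per k-dim cell
  grp  <- unique(group)
  f <- prop.table(table(cell[group == grp[1]]))  # relative frequencies, group 1
  h <- prop.table(table(cell[group == grp[2]]))  # relative frequencies, group 2
  sum(abs(f - h)) / 2                            # 0 = identical, 1 = no overlap
}

breaks <- list(x1 = seq(min(x1), max(x1), length.out = 5),
               x2 = seq(0, 1, length.out = 5))
l1_sketch(g, list(x1 = x1, x2 = x2), breaks)
```

The result always lies between 0 (identical histograms) and 1 (histograms with no cell in common).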
If breaks is not specified, the Scott automated bin calculation is used (which coarsens less than Sturges, which is used in cem). Please refer to the cem help page. In this case, the resulting breaks are used to calculate the L1 measure.
When choosing breaks for L1, a very fine coarsening (many cutpoints) produces values of L1 close to 1. A very mild coarsening (very few cutpoints) is not able to discriminate, i.e., L1 is close to 0 (particularly true when the number of observations is small with respect to the number of continuous variables).
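The sensitivity to the number of cutpoints can be seen in a small one-variable experiment. The sketch below is an assumption-laden toy, not part of cem: the helper l1_one_var is hypothetical, it uses equally spaced bins, and both groups are drawn from the same distribution so that any measured imbalance is due to the coarsening alone.

```r
# Toy experiment: how the number of bins drives a one-variable L1 value.
# Both groups come from the SAME distribution (hypothetical data).
set.seed(42)
x <- rnorm(60)
g <- rep(c(0, 1), each = 30)

# Hypothetical helper: L1 between the two groups using n_bins equal-width bins.
l1_one_var <- function(x, g, n_bins) {
  b    <- seq(min(x), max(x), length.out = n_bins + 1)
  cell <- cut(x, breaks = b, include.lowest = TRUE)
  tab  <- prop.table(table(cell, g), margin = 2)  # column-wise relative freqs
  sum(abs(tab[, 1] - tab[, 2])) / 2
}

# With a single bin L1 is exactly 0; with thousands of bins nearly every
# observation sits alone in its cell, which pushes L1 toward 1.
sapply(c(1, 5, 50, 5000), function(k) l1_one_var(x, g, k))
```

Because the value depends so strongly on the bins, comparisons between matching solutions are only meaningful when the same breaks are used for each.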
The grouping option is a list where each element is itself a list. For example, suppose for variable quest1 you have the following possible levels: "no answer", NA, "negative", "neutral", "positive", and you want to collect ("no answer", NA, "neutral") into a single group; then the grouping argument should contain list(quest1=list(c("no answer", NA, "neutral"))). Or, if you have a discrete variable elements with values 1:10 and you want to collect it into groups “1:3,NA”, “4”, “5:9”, “10”, you specify in grouping the following list: list(elements=list(c(1:3,NA), 5:9)). Values not defined in the grouping are left as they are. If cutpoints and groupings are defined for the same variable, the groupings take precedence and the corresponding cutpoints are set to NULL.
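To make the effect of a grouping concrete, the following base-R sketch collapses the listed values into a single level and leaves everything else untouched. The helper collapse_levels and the label it assigns to a merged group are hypothetical; how cem names or stores the merged levels internally is not stated here, so treat this only as an illustration of the semantics.

```r
# Illustration of what a grouping specification means (not cem internals).
quest1   <- c("no answer", NA, "negative", "neutral", "positive", "neutral")
grouping <- list(quest1 = list(c("no answer", NA, "neutral")))

# Hypothetical helper: every value listed in a group is replaced by one
# common label; values not mentioned in any group are left as they are.
collapse_levels <- function(x, groups) {
  x <- as.character(x)
  for (grp in groups) {
    x[x %in% grp] <- paste(grp, collapse = "/")  # %in% also matches NA
  }
  x
}

collapse_levels(quest1, grouping$quest1)

# The discrete example: 1:3 and NA form one group, 5:9 another,
# while 4 and 10 stay as separate values.
collapse_levels(c(1:10, NA), list(c(1:3, NA), 5:9))
```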
The L1.profile function shows how to compare matching solutions for any level of (i.e., without regard to) coarsening.
This code also calculates the Local Common Support (LCS) measure, which is the proportion of non-empty k-dimensional cells of the histogram that contain at least one observation per group.
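The LCS definition can be checked by hand on a tiny example. The sketch below uses made-up cell labels and group indicators (both hypothetical) and follows the definition literally: count the non-empty cells, then count how many of those hold observations from every group.

```r
# Hand computation of a Local Common Support proportion on toy data.
# "e" is a cell of the histogram that contains no observations at all.
cell <- factor(c("a", "a", "b", "c", "c", "d"),
               levels = c("a", "b", "c", "d", "e"))
g    <- c(0, 1, 0, 0, 1, 1)

tab       <- table(cell, g)                 # cells in rows, groups in columns
non_empty <- rowSums(tab) > 0               # cells with at least one observation
all_grps  <- rowSums(tab > 0) == ncol(tab)  # cells populated by every group

lcs <- sum(all_grps) / sum(non_empty)
lcs  # 0.5: cells "a" and "c" out of the four non-empty cells
```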
Value
An object of class L1.meas, which is a list with the following fields:
L1 |
The numerical value of the L1 measure |
breaks |
A list of cutpoints used to calculate the L1 measure |
LCS |
The numerical value of the Local Common Support proportion |
Author(s)
Stefano Iacus, Gary King, and Giuseppe Porro
References
Iacus, King, Porro (2011) doi:10.1198/jasa.2011.tm09599
Iacus, King, Porro (2012) doi:10.1093/pan/mpr013
Iacus, King, Porro (2019) doi:10.1017/pan.2018.29
Examples
data(LL)
set.seed(123)
L1.meas(LL$treated, LL, drop = c("treated", "re78"))