Use a Penalty to Obtain a Near Exact Match

Description

A near-exact match (also known as an almost exact match) for a nominal covariate maximizes the number of pairs that are exactly matched for the nominal covariate. This is done by adding a large penalty to the covariate distance between individuals with different values of the nominal covariate. The topic is discussed in Section 9.2 of Design of Observational Studies (2010).

Usage

addalmostexact(dmat, z, f, mult = 10)


Arguments

 dmat A distance matrix with one row for each treated individual and one column for each control. Often, this is either the Mahalanobis distance based on covariates, mahal(), or else a robust variant produced by smahal(). z z is a vector that is 1 for a treated individual and 0 for a control. length(z) must equal sum(dim(dmat)). f A vector or factor giving the levels of the nominal covariate. length(f) must equal length(z). mult Let mx be the largest distance in dmat. Then mult*mx is added to each distance in dmat for which the treated and control individuals have different values of f. For large enough values of mult, this will ensure that a minimum distance match maximizes the number of individuals who are exactly matched for f.

Details

Very large values of mx may result in numerical instability or slow computations.

A small value of mx, say 1 or 1/10, may not maximize the number of exactly matched pairs, but it may usefully increase the number of exactly matched pairs.

Value

Returns the penalized distance matrix.

Author(s)

Paul R. Rosenbaum

References

Rosenbaum, P. R. (2010). Design of Observational Studies. New York: Springer. The method and example are discussed in Section 9.2.

Zubizarreta, J. R., Reinke, C. E., Kelz, R. R., Silber, J. H. and Rosenbaum, P. R. (2011). Matching for several sparse nominal variables in a case control study of readmission following surgery. The American Statistician, 65(4), 229-238. Combines near exact matching with fine balance for the same covariate.

Examples

data(costa)
z<-1*(costa$welder=="Y") aa<-1*(costa$race=="A")
smoker=1*(costa$smoker=="Y") age<-costa$age
x<-cbind(age,aa,smoker)
dmat<-mahal(z,x)
# Mahalanobis distances
round(dmat[,1:6],2)
# Mahalanobis distanced penalized for mismatching on smoking.
dmat<-addalmostexact(dmat, z, smoker, mult = 10)
# The first treated subject (row labeled 27) is a nonsmoker, but the
# third control (column 3) is a smoker, so there is a big penalty.
round(dmat[,1:6],2)


[Package DOS version 1.0.0 Index]