matchMulti {matchMulti} | R Documentation |
A function that performs multilevel matching.
Description
This is the workhorse function of the package. It matches groups and the units within those groups: for example, it will match both schools and the students in those schools. The goal is to make treated and control units more comparable so that treatment effects can be estimated.
Usage
matchMulti(
data,
treatment,
school.id,
match.students = TRUE,
student.vars = NULL,
school.caliper = NULL,
school.fb = NULL,
verbose = FALSE,
keep.target = NULL,
student.penalty.qtile = 0.05,
min.keep.pctg = 0.8,
school.penalty = NULL,
save.first.stage = TRUE,
tol = 10,
solver = "rlemon"
)
Arguments
data |
A data frame for use in matching. |
treatment |
Name of covariate that defines treated and control groups. |
school.id |
Identifier for groups (for example schools) |
match.students |
Logical flag for whether units within groups should
also be matched. If set to |
student.vars |
Names of student-level covariates on which to measure
balance. School-level distances will be penalized when student matches are
imbalanced on these variables. In addition, when |
school.caliper |
matrix with one row for each treated school and one
column for each control school, containing zeroes for pairings allowed by
the caliper and |
school.fb |
A list of discrete group-level covariates on which to enforce fine balance, i.e., to ensure that marginal distributions are balanced. The first group is the most important, the second is the next most important, and so on. If a simple list of variable names is given, one group is assumed; a list of lists gives this hierarchy. |
verbose |
Logical flag for whether to give detailed output. |
keep.target |
an optional numeric value specifying the number of treated schools desired in the final match. |
student.penalty.qtile |
This helps exclude students who are difficult to match. The default is 0.05, which means that the match prefers to exclude students rather than match them at distances larger than this quantile of the overall student-student robust Mahalanobis distance distribution. |
min.keep.pctg |
Minimum percentage of students (from smaller school) to keep when matching students in each school pair. |
school.penalty |
A penalty to remove groups (schools) in the group (school) match. |
save.first.stage |
Logical flag for whether first-stage matches should be saved. |
tol |
a numeric tolerance value for comparing distances, used in the school match. It may need to be raised above the default when matching with many levels of refined balance or in very large problems (when these distances will often be at least on the order of the tens of thousands). |
solver |
Name of package used to solve underlying network flow problem for the school match, one of 'rlemon' and 'rrelaxiv'. rrelaxiv carries an academic license and is not hosted on CRAN so it must be installed separately. |
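As a small illustration of the two forms of school.fb described above, the sketch below builds a single balance group and a two-level hierarchy in base R. The variable names size_large and minority_mean_large are taken from the Examples section; the object names fb.single and fb.hier are hypothetical, and treating a plain character vector as the "simple list of variable names" is an assumption.

```r
# One balance group: both variables are balanced jointly at a single
# priority level (assumes a character vector is accepted as the
# "simple list of variable names").
fb.single <- c("size_large", "minority_mean_large")

# Two priority levels: the first list element is balanced first,
# the second element is balanced next.
fb.hier <- list(c("size_large"), c("minority_mean_large"))

# Inspect the structure of each form
str(fb.single)
str(fb.hier)
```

Either object would then be passed as the school.fb argument of matchMulti, as in the last call of the Examples section.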
Details
matchMulti first matches students (or other individual units) within each pairwise combination of schools (or other groups); based on these matches, a distance matrix is generated for the schools. Then schools are matched on this distance matrix, and the student matches for the selected school pairs are combined into a single matched sample.
School covariates are not used to compute the distance matrix for schools (since it is generated from the student match). Instead, imbalances in school covariates should be addressed through the school.fb argument, which encodes a refined covariate balance constraint. School covariates in school.fb should be given in order of priority for balance, since the matching algorithm optimally balances the variables in the first list element, then attempts to further balance those in the second element, and so on.
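The two-stage idea can be pictured with a toy base-R sketch. This is an illustration under loud assumptions, not the package's implementation: the data are simulated, the school-pair distance is a simple difference in mean ses rather than a distance derived from an actual student match, and the school match is a greedy assignment rather than the network flow solution the package uses. All object names (students, pair.dist, school.dist, pairs) are hypothetical.

```r
# Toy illustration only: NOT how matchMulti is implemented.
set.seed(1)

# Simulated students in two treated schools and three control schools
students <- data.frame(
  school = rep(c("T1", "T2", "C1", "C2", "C3"), each = 20),
  treat  = rep(c(1, 1, 0, 0, 0), each = 20),
  ses    = rnorm(100, mean = rep(c(0.5, -0.5, 0.4, -0.6, 0), each = 20))
)
treated  <- unique(students$school[students$treat == 1])
controls <- unique(students$school[students$treat == 0])

# Stage 1 (stand-in): summarize how dissimilar the students are
# for every treated-control school pair.
pair.dist <- function(a, b) {
  abs(mean(students$ses[students$school == a]) -
      mean(students$ses[students$school == b]))
}
school.dist <- outer(treated, controls, Vectorize(pair.dist))
dimnames(school.dist) <- list(treated, controls)

# Stage 2 (stand-in): assign each treated school its closest
# still-available control school (greedy, without replacement).
available <- controls
pairs <- character(0)
for (t in treated) {
  best <- available[which.min(school.dist[t, available])]
  pairs[t] <- best
  available <- setdiff(available, best)
}

print(round(school.dist, 2))
print(pairs)
```

The point of the sketch is only the data flow: student-level comparisons produce a treated-by-control school distance matrix, and the school match is then chosen from that matrix.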
Value
raw |
The unmatched data before matching. |
matched |
The matched dataset of both units and groups. Outcome analysis and balance checks are performed on this item. |
school.match |
Object with two parts. The first lists which treated groups (schools) are matched to which control groups. The second lists the population of groups used in the match. |
school.id |
Name of school identifier |
treatment |
Name of treatment variable |
Author(s)
Luke Keele, Penn State University, ljk20@psu.edu
Sam Pimentel, University of California, Berkeley, spi@berkeley.edu
See Also
See also matchMulti, matchMultisens, balanceMulti, matchMultioutcome, rematchSchools
Examples
#toy example with short runtime
library(matchMulti)
#Load Catholic school data
data(catholic_schools)
# Trim data to speed up example
catholic_schools <- catholic_schools[catholic_schools$female_mean >.45 &
catholic_schools$female_mean < .60,]
#match on a single covariate
student.cov <- c('minority')
match.simple <-
matchMulti(catholic_schools, treatment = 'sector',
school.id = 'school', match.students = FALSE,
student.vars = student.cov, verbose=TRUE, tol=.01)
#Check balance after matching - this checks both student and school balance
balanceMulti(match.simple, student.cov = student.cov)
## Not run:
#larger example
data(catholic_schools)
student.cov <- c('minority','female','ses')
# Check student balance before matching
balanceTable(catholic_schools[c(student.cov,'sector')], treatment = 'sector')
#Match schools but not students within schools
match.simple <- matchMulti(catholic_schools, treatment = 'sector',
school.id = 'school', match.students = FALSE)
#Check balance after matching - this checks both student and school balance
balanceMulti(match.simple, student.cov = student.cov)
#Estimate treatment effect
output <- matchMultioutcome(match.simple, out.name = "mathach",
schl_id_name = "school", treat.name = "sector")
# Perform sensitivity analysis using Rosenbaum bound -- increase Gamma to increase effect of
# possible hidden confounder
matchMultisens(match.simple, out.name = "mathach",
schl_id_name = "school",
treat.name = "sector", Gamma = 1.3)
# Now match both schools and students within schools
match.out <- matchMulti(catholic_schools, treatment = 'sector',
school.id = 'school', match.students = TRUE, student.vars = student.cov)
# Check balance again
bal.tab <- balanceMulti(match.out, student.cov = student.cov)
# Now match with fine balance constraints on whether the school is large
# or has a high percentage of minority students
match.fb <- matchMulti(catholic_schools, treatment = 'sector', school.id = 'school',
match.students = TRUE, student.vars = student.cov,
school.fb = list( c('size_large'), c('minority_mean_large') ) )
# Estimate treatment effects
matchMultioutcome(match.fb, out.name = "mathach", schl_id_name = "school", treat.name = "sector")
#Check Balance
balanceMulti(match.fb, student.cov = student.cov)
## End(Not run)