OSDR {OSDR} | R Documentation |
Finds an Optimal System of Distinct Representatives from a family of subsets.
Description
The function finds an Optimal Set of Distinct Representatives (OSDR) from an ordered family of subsets. Optimality is order-wise in the sense specified by Gale: any other SDR cannot be larger and its elements cannot be chosen from more important sets (see details).
Usage
OSDR(M)
Arguments
M |
An ordered family of subsets. In assignment problems the list specifying the feasible applicants for each job. In matching problems the list specifying the controls matchable with each treated unit. The list is assumed ordered by decreasing assignment/matching priority. |
Details
Consider a set of jobs and a set of suitable applicants for each job. It was shown by D.Gale that there exists an optimally assignable subset of the jobs in the sense that if {a_1, \ldots a_k}
is optimal and {b_1, \ldots b_h}
is another assignable set of jobs then: a) k \ge h
and b) a_i \le b_i
for all i
. The latter means that job a_i
has as higher a priority as job b_i
. From a graph perspective, if J
is the set of jobs and A
is the set of applicants the OSDR is an order wise maximum size matching of the bipartite graph with vertex set J union A
.
Function OSDR
finds the optimal matching using an algorithmic proof of Hall's theorem due to logician D.J. Shoesmith. The algorithm assigns greedily until possible and correct backward when necessary. A message is given when a correction is attempted (when it fails another message warns that Hall's condition is not satisfied).
Value
A list containing three elements: the OSDR (a zero in position i indicates impossibility of assigning job i), the index of optimally assignable jobs and the index of unassignable jobs. If the latter is not empty M
does not meet Hall's condition, i.e., a complete SDR is not possible.
Note
In statistical matching problems the sets J
and A
are the sets of treated and control units and it is usually required to find the minimum covariate distance matching of treated versus control units. When a complete matching does not exist it can be convenient to find either an order optimal matching or a minimum cost matching on the OSDR
subset (see the gender gap example).
Author(s)
Massimo Cannas <massimo.cannas@unica.it>
References
Gale, D. (1968) Optimal matching in an ordered set: an application of matroid theory. Journal of Combinatorial Theory 4: 176-180.
Anderson, I. (1989) A first course in Combinatorial Mathematics. Oxford University Press.
See Also
Examples
### example 1
# M is the list of suitable applicants for five jobs
M1 <- c("A" , "B")
M2 <- c("B" , "C")
M3 <- c("B")
M4 <- c("A" , "C")
M5 <- c("B" , "C" , "D")
M <- list(M1 , M2 , M3 , M4 , M5)
OSDR(M)
# $OSDR
# [1] "A" "C" "B" "0" "D"
# $matched
# [1] 1 2 3 5
# $unmatched
# [1] 4
# job 4 unmatched so Hall's condition is not satisfied: it's impossible to fill all the jobs
# note that there are (order-\emph{suboptimal}) assignments of the same length of the optimal:
# eg: 0CBAD , BC0AD
#### example 2: sligthly modified: more than one order optimal matching
M1<-c("A","B","C")
M2<-c("A","C")
M3<-c("B")
M4<-c("A","C")
M5<-c("A","D")
M <-list(M1,M2,M3,M4,M5)
OSDR(M)
# note there are other order optimal matchings: ACB0D or CAB0D
# note there are also other maximum size matchings (not order optimal):
# e.g. 0CBAD or BC0AD