R: Estimate Ciscato, Galichon and Gousse's model

estimate.affinity.matrix.unipartite {affinitymatrix}

R Documentation

Estimate Ciscato, Galichon and Gousse's model

Description

This function estimates the affinity matrix of the matching model of Ciscato, Galichon and Gousse (2020), and performs the saliency analysis and the rank tests. The user must supply a matched sample that is treated as the equilibrium matching of a unipartite one-to-one matching model without frictions and with Transferable Utility. The model differs from the original Dupuy and Galichon (2014) model in that all agents are pooled in one group and can match within the group. For the sake of clarity, the documentation takes the example of the same-sex marriage market and refers to the "first partner" and the "second partner" to reflect the arbitrary order of partners in a database (e.g., survey respondent and partner of the respondent). Note that in this case the variable "sex" is treated as a matching variable rather than as a criterion to assign partners to one side of the market, as in the bipartite case. Other applications may include matching between coworkers, roommates or teammates.
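As a rough illustration of the "first partner" / "second partner" convention, the sketch below builds `X` and `Y` from a couple-level data frame. The data frame `couples`, its column names and the `r_`/`p_` prefixes are hypothetical and chosen only for this example; they are not part of the package.

```r
# Hypothetical couple-level data: one row per couple, with respondent ("r_")
# and partner ("p_") traits. Names and values are illustrative only.
couples <- data.frame(
  r_sex = c(1, 0, 1, 0), r_age = c(34, 29, 41, 52), r_educ = c(16, 12, 18, 14),
  p_sex = c(1, 0, 0, 1), p_age = c(36, 27, 39, 50), p_educ = c(14, 12, 16, 16)
)

traits <- c("sex", "age", "educ")

# First partner = respondent, second partner = partner of the respondent.
# "sex" enters as an ordinary matching variable in both matrices.
X <- as.matrix(couples[, paste0("r_", traits)])
Y <- as.matrix(couples[, paste0("p_", traits)])

# Same variables, in the same order, in both matrices.
colnames(X) <- colnames(Y) <- traits
```

Row i of `X` and row i of `Y` then describe the two members of the same couple, which is the layout the function expects.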

Usage

estimate.affinity.matrix.unipartite(
  X,
  Y,
  w = rep(1, N),
  A0 = matrix(0, nrow = K, ncol = K),
  lb = matrix(-Inf, nrow = K, ncol = K),
  ub = matrix(Inf, nrow = K, ncol = K),
  pr = 0.05,
  max_iter = 10000,
  tol_level = 1e-06,
  scale = 1,
  nB = 2000,
  verbose = TRUE
)

Arguments

`X`	The matrix of traits of the first partner. Its rows must be ordered so that the i-th individual in `X` is matched with the i-th partner in `Y`: this means that `nrow(X)` must be equal to `nrow(Y)`. Its columns correspond to the different matching variables: `ncol(X)` must be equal to `ncol(Y)` and the variables must be sorted in the same way in both matrices. The matrix is demeaned and rescaled before the start of the estimation algorithm.
`Y`	The matrix of traits of the second partner. Its rows must be ordered so that the i-th individual in `Y` is matched with the i-th partner in `X`: this means that `nrow(Y)` must be equal to `nrow(X)`. Its columns correspond to the different matching variables: `ncol(Y)` must be equal to `ncol(X)` and the variables must be sorted in the same way in both matrices. The matrix is demeaned and rescaled before the start of the estimation algorithm.
`w`	A vector of sample weights with length `nrow(X)`. Defaults to uniform weights.
`A0`	A vector or matrix with `ncol(X)*ncol(Y)` elements corresponding to the initial values of the affinity matrix to be fed to the estimation algorithm. Optional. Defaults to a matrix of zeros.
`lb`	A vector or matrix with `ncol(X)*ncol(Y)` elements corresponding to the lower bounds of the elements of the affinity matrix. Defaults to `-Inf` for all parameters.
`ub`	A vector or matrix with `ncol(X)*ncol(Y)` elements corresponding to the upper bounds of the elements of the affinity matrix. Defaults to `Inf` for all parameters.
`pr`	A probability indicating the significance level used to compute bootstrap two-sided confidence intervals for `U`, `V` and `lambda`. Defaults to 0.05.
`max_iter`	An integer indicating the maximum number of iterations in the Maximum Likelihood Estimation. See `optim` for the `"L-BFGS-B"` method. Defaults to 10000.
`tol_level`	A positive real number indicating the tolerance level in the Maximum Likelihood Estimation. See `optim` for the `"L-BFGS-B"` method. Defaults to 1e-6.
`scale`	A positive real number indicating the scale of the model. Defaults to 1.
`nB`	An integer indicating the number of bootstrap replications used to compute the confidence intervals of `U`, `V` and `lambda`. Defaults to 2000.
`verbose`	If `TRUE`, the function displays messages to keep track of its progress. Defaults to `TRUE`.
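The dimension requirements listed above can be checked before calling the estimator. The helper below is a minimal sketch: `check.unipartite.inputs` is a hypothetical name, not a function of the package, and the final check that lower bounds do not exceed upper bounds is an added sanity check rather than something this page states.

```r
# Hypothetical helper (not part of affinitymatrix): verifies the dimension
# constraints described in the Arguments section.
check.unipartite.inputs <- function(X, Y, w = rep(1, nrow(X)),
                                    A0 = NULL, lb = NULL, ub = NULL) {
  X <- as.matrix(X)
  Y <- as.matrix(Y)
  K <- ncol(X)

  if (nrow(X) != nrow(Y)) stop("X and Y must have the same number of rows")
  if (ncol(Y) != K) stop("X and Y must have the same number of columns")
  if (length(w) != nrow(X)) stop("w must have length nrow(X)")

  # A0, lb and ub are optional; when supplied they need K * K elements.
  params <- list(A0 = A0, lb = lb, ub = ub)
  for (name in names(params)) {
    value <- params[[name]]
    if (!is.null(value) && length(value) != K * K)
      stop(name, " must have ncol(X)*ncol(Y) elements")
  }

  # Added sanity check (assumption): bounds must be ordered.
  if (!is.null(lb) && !is.null(ub) && any(lb > ub))
    stop("every lower bound must not exceed its upper bound")

  invisible(TRUE)
}
```

Calling `check.unipartite.inputs(X, Y, w)` stops with an informative message on the first violated condition and returns `TRUE` invisibly otherwise.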

Value

The function returns a list with elements:

`X`	The demeaned and rescaled matrix of traits of the first partner.
`Y`	The demeaned and rescaled matrix of traits of the second partner.
`fx`	The empirical marginal distribution of first partners.
`fy`	The empirical marginal distribution of second partners.
`Aopt`	The estimated affinity matrix.
`sdA`	The standard errors of `Aopt`.
`tA`	The Z-test statistics of `Aopt`.
`VarCovA`	The full variance-covariance matrix of `Aopt`.
`rank.tests`	A list with all the summaries of the rank tests on `Aopt`.
`U`	A matrix whose columns are the left-singular vectors of `Aopt`.
`V`	A matrix whose columns are the right-singular vectors of `Aopt`.
`lambda`	A vector whose elements are the singular values of `Aopt`.
`UCI`	A matrix whose columns are the lower and the upper bounds of the confidence intervals of `U`.
`VCI`	A matrix whose columns are the lower and the upper bounds of the confidence intervals of `V`.
`lambdaCI`	A matrix whose columns are the lower and the upper bounds of the confidence intervals of `lambda`.
`df.bootstrap`	A data frame resulting from the `nB` bootstrap replications and used to infer the empirical distribution of the estimated objects.
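To make the relationship between the affinity matrix and its singular vectors and singular values concrete, the sketch below decomposes a small made-up matrix with base R's `svd`. The matrix `A` is illustrative and not an estimate, and the package may apply sign or scale normalizations that this sketch does not reproduce.

```r
# Illustrative 3x3 affinity matrix (made-up numbers, not estimates).
A <- matrix(c(0.9, 0.1, 0.0,
              0.1, 0.5, 0.2,
              0.0, 0.2, 0.3), nrow = 3, byrow = TRUE)

dec <- svd(A)      # base R singular value decomposition
U <- dec$u         # columns: left-singular vectors
V <- dec$v         # columns: right-singular vectors
lambda <- dec$d    # singular values, sorted in decreasing order

# The decomposition reproduces A up to floating-point error.
A_rebuilt <- U %*% diag(lambda) %*% t(V)
max(abs(A - A_rebuilt))

# Relative weight of each dimension in the decomposition.
lambda / sum(lambda)
```

Dimensions with larger singular values account for more of the matrix, which is the idea behind restricting attention to the first few dimensions in a saliency analysis.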

Examples


# Parameters
K = 4 # number of matching variables
N = 100 # sample size
mu = rep(0, 2*K) # means of the data generating process
Sigma = matrix(c(1, -0.0992, 0.0443, -0.0246, -0.8145, 0.083, -0.0438,
    0.0357, -0.0992, 1, 0.0699, -0.0043, 0.083, 0.8463, 0.0699, -0.0129, 0.0443,
    0.0699, 1, -0.0434, -0.0438, 0.0699, 0.5127, -0.0383, -0.0246, -0.0043,
    -0.0434, 1, 0.0357, -0.0129, -0.0383, 0.6259, -0.8145, 0.083, -0.0438,
    0.0357, 1, -0.0992, 0.0443, -0.0246, 0.083, 0.8463, 0.0699, -0.0129, -0.0992,
    1, 0.0699, -0.0043, -0.0438, 0.0699, 0.5127, -0.0383, 0.0443, 0.0699, 1,
    -0.0434, 0.0357, -0.0129, -0.0383, 0.6259, -0.0246, -0.0043, -0.0434, 1),
               nrow=K+K) # (normalized) variance-covariance matrix of the
               # data generating process with a block symmetric structure
labels = c("Sex", "Age", "Educ.", "Black") # labels for matching variables

# Sample
data = MASS::mvrnorm(N, mu, Sigma) # generating sample
X = data[,1:K]; Y = data[,(K+1):(2*K)] # first and second partners' sample data
w = sort(runif(N-1)); w = c(w,1) - c(0,w) # random sample weights summing to 1

# Main estimation
res = estimate.affinity.matrix.unipartite(X, Y, w = w, nB = 500)

# Summarize results
show.affinity.matrix(res, labels_x = labels, labels_y = labels)
show.diagonal(res, labels = labels)
show.test(res)
show.saliency(res, labels_x = labels, labels_y = labels,
              ncol_x = 2, ncol_y = 2)
show.correlations(res, labels_x = labels, labels_y = labels,
                  label_x_axis = "First partner",
                  label_y_axis = "Second partner", ndims = 2)