R: Calculate goodness-of-fit statistics for Revealed Preference...

gof {rpm}

R Documentation

Calculate goodness-of-fit statistics for Revealed Preference Matchings Model based on observed data

Description

gof.rpm ... It is typically based on the estimate from an rpm() call.

Usage

gof(object, ...)

## S3 method for class 'rpm'
gof(
  object,
  ...,
  empirical_p = TRUE,
  compare_sim = "sim-est",
  control = object$control,
  reboot = FALSE,
  verbose = FALSE
)

## S3 method for class 'gofrpm'
plot(x, ..., cex.axis = 0.7, main = "Goodness-of-fit diagnostics")

Arguments

`object`	list; an object of class `rpm` that is typically the result of a call to `rpm()`.
`...`	Additional arguments, to be passed to lower-level functions.
`empirical_p`	logical; (Optional) If `TRUE`, the function returns the empirical p-value of the sample statistic based on `nsim` simulations.
`compare_sim`	string; describes which two objects are compared to compute simulated goodness-of-fit statistics. Valid values are `"sim-est"`, which compares the marginal distribution of pairings in a simulated sample to the `rpm` model estimate of the marginal distribution based on that same simulated sample, and `"mod-est"`, which compares the marginal distribution of pairings in a simulated sample to the `rpm` model estimate used to generate the sample.
`control`	A list of control parameters for algorithm tuning. Constructed using `control.rpm`, which should be consulted for specifics.
`reboot`	logical; if this is `TRUE`, the program will rerun the bootstrap at the coefficient values, rather than expect the object to contain a `bs.results` component with the bootstrap results run at the solution values. The latter is the default for `rpm` fits.
`verbose`	logical; if this is `TRUE`, the program will print out additional information, including data summary statistics.
`x`	a list, usually an object of class `gofrpm`.
`cex.axis`	the magnification of the text used in axis notation.
`main`	Title for the goodness-of-fit plots.
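
To make the `empirical_p` and `compare_sim = "mod-est"` ideas concrete, here is a minimal base-R sketch. It is not the code used inside gof.rpm: the pmf, the counts, the value of `nsim` and the helper `chi_sq()` are all hypothetical, and the chi-square formula shown is only one common count-based form.

```r
# Illustrative sketch only; not the code used inside gof.rpm.
set.seed(1)

# Hypothetical model pmf over four household types and observed counts.
model_pmf  <- c(0.25, 0.25, 0.20, 0.30)
obs_counts <- c(60, 40, 30, 70)
n    <- sum(obs_counts)
nsim <- 1000  # assumed number of simulations

# Hypothetical helper: count-based chi-square of counts against a pmf.
chi_sq <- function(counts, pmf) {
  expected <- sum(counts) * pmf
  sum((counts - expected)^2 / expected)
}

obs_chi_sq <- chi_sq(obs_counts, model_pmf)

# Simulate samples from the model pmf and compare each one to the pmf
# that generated it (the idea behind compare_sim = "mod-est").
sims <- rmultinom(nsim, size = n, prob = model_pmf)
chi_sq_simulated <- apply(sims, 2, chi_sq, pmf = model_pmf)

# Empirical p-value: share of simulated statistics >= the observed one.
empirical_p_chi_sq <- mean(chi_sq_simulated >= obs_chi_sq)
```

A `"sim-est"`-style comparison would additionally require re-estimating the model on each simulated sample, which this sketch does not attempt.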

Details

The function rpm is used to fit a revealed preference model of the preferences of men and women for certain characteristics (or shared characteristics) of people of the opposite sex. The model assumes a one-to-one stable matching and uses an observed set of matchings and a set of (possibly dyadic) covariates to estimate the parameters of linear utility equations. It does this using a large-population likelihood based on ideas from Dagsvik (2000), Menzel (2015) and Goyal et al. (2023).

The model represents the dyadic utility functions as deterministic linear utility functions of dyadic variables. These utility functions are functions of observed characteristics of the women and men. These functions are entered as terms in the function call to rpm. This function simulates from such a model.

Value

gof.rpm returns a list consisting of the following elements:

`observed_pmf`	numeric matrix giving observed probability mass distribution over different household types
`model_pmf`	numeric matrix giving expected probability mass distribution from `rpm` model
`obs_chi_sq`	the count-based observed chi-square statistic comparing the marginal distributions of the population in the data and in the model estimate
`obs_chi_sq_cell`	the contribution to the observed chi-squared statistic by household type
`obs_kl`	the Kullback-Leibler (KL) divergence computed by comparing the observed marginal distributions to the expected marginal distribution based on the `rpm` model estimate
`obs_kl_cell`	the contribution to the observed KL divergence by household type
`empirical_p_chi_sq`	the proportion of simulated chi-square statistics that are greater than or equal to the observed chi-square statistic
`empirical_p_kl`	the proportion of simulated KL divergences that are greater than or equal to the observed KL divergence
`chi_sq_simulated`	vector of size `nsim` storing all simulated chi-square statistics
`kl_simulated`	vector of size `nsim` storing all simulated KL divergences
`chi_sq_cell_mean`	Mean contributions of each household type to the simulated chi_sq statistics
`chi_sq_cell_sd`	Standard deviation of the contributions of each household type to the simulated chi_sq statistics
`chi_sq_cell_median`	Median contributions of each household type to the simulated chi_sq statistics
`chi_sq_cell_iqr`	Interquartile range of the contributions of each household type to the simulated chi_sq statistics
`kl_cell_mean`	Mean contributions of each household type to the simulated KL divergences
`kl_cell_sd`	Standard deviation of the contributions of each household type to the simulated KL divergences
`kl_cell_median`	Median contributions of each household type to the simulated KL divergences
`kl_cell_iqr`	Interquartile range of the contributions of each household type to the simulated KL divergences
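
As a rough guide to how the total and per-cell quantities above relate, the following base-R sketch computes a chi-square statistic and a KL divergence, each with its cell-by-cell contributions, from two small pmf matrices. The matrices, the sample size `n` and the exact formulas are assumptions made for illustration; the package's internal computation may differ.

```r
# Illustrative sketch only; not the code used inside gof.rpm.
# Hypothetical inputs: 2 x 2 tables of household-type probabilities.
observed_pmf <- matrix(c(0.30, 0.20, 0.15, 0.35), nrow = 2)
model_pmf    <- matrix(c(0.25, 0.25, 0.20, 0.30), nrow = 2)
n <- 200  # assumed number of sampled households

# Count-based chi-square: contribution of each cell, then the total.
chi_sq_cell <- n * (observed_pmf - model_pmf)^2 / model_pmf
chi_sq <- sum(chi_sq_cell)

# KL divergence of the observed pmf from the model pmf, cell by cell.
# Cells with zero observed mass contribute zero by convention.
kl_cell <- ifelse(observed_pmf > 0,
                  observed_pmf * log(observed_pmf / model_pmf), 0)
kl <- sum(kl_cell)
```

Large entries in `chi_sq_cell` or `kl_cell` point to the household types where the observed and model distributions disagree most, which is how the `*_cell` elements of the returned list can be read.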

Methods (by class)

gof(rpm): Calculate goodness-of-fit statistics for Revealed Preference Matchings Model based on observed data

Functions

plot(gofrpm): plot.gofrpm plots diagnostics such as empirical p-values based on chi-square statistics and KL divergences. See rpm for more information on these models.

References

Goyal, Shuchi; Handcock, Mark S.; Jackson, Heide M.; Rendall, Michael S. and Yeung, Fiona C. (2023). A Practical Revealed Preference Model for Separating Preferences and Availability Effects in Marriage Formation. Journal of the Royal Statistical Society, A. doi:10.1093/jrsssa/qnad031

Dagsvik, John K. (2000). Aggregation in Matching Markets. International Economic Review, Vol. 41, 27-57. JSTOR: https://www.jstor.org/stable/2648822, doi:10.1111/1468-2354.00054

Menzel, Konrad (2015). Large Matching Markets as Two-Sided Demand Systems. Econometrica, Vol. 83, No. 3 (May, 2015), 897-941. doi:10.3982/ECTA12299

Examples

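library(rpm)

# Load the example data set used below
data(fauxmatching)
# Fit a revealed preference matchings model
fit <- rpm(~match("edu") + WtoM_diff("edu",3),
          Xdata=fauxmatching$Xdata, Zdata=fauxmatching$Zdata,
          X_w="X_w", Z_w="Z_w",
          pair_w="pair_w", pair_id="pair_id", Xid="pid", Zid="pid",
          sampled="sampled")
# Compute goodness-of-fit statistics for the fitted model
a <- gof(fit)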

[Package rpm version 0.7-3 Index]