BiProbitPartial {BiProbitPartial} R Documentation

## Bivariate probit with partial observability

### Description

`BiProbitPartial` estimates a bivariate probit with partial observability model.

The bivariate probit with partial observability model is defined as follows. Let i denote the ith observation which takes values from 1 to N, X[1] be a covariate matrix of dimension N x k[1], X[2] be a covariate matrix of dimension N x k[2], X[1i] be the ith row of X[1], X[2i] be the ith row of X[2], β[1] be a coefficient vector of length k[1] and β[2] be a coefficient vector of length k[2]. Define the latent response for stage one to be

y[1i]* = X[1i] β[1] + ε[1i]

and stage two to be

y[2i]* = X[2i] β[2] + ε[2i].

Note the stages do not need to occur sequentially. Define the outcome of the first stage to be y[1i] = 1 if y[1i]* > 0 and y[1i] = 0 if y[1i]* <= 0. Define the outcome of the second stage to be y[2i] = 1 if y[2i]* > 0 and y[2i] = 0 if y[2i]* <= 0. The observed outcome is the product of the outcomes from the two stages

z[i] = y[1i] y[2i].

The pair (ε[1i],ε[2i]) is distributed independently and identically multivariate normal with means E(ε[1i]) = E(ε[2i]) = 0, variances Var(ε[1i]) = Var(ε[2i]) = 1, and correlation (or equivalently covariance) Corr(ε[1i],ε[2i]) = ρ. A more general structural representation is presented in Poirier (1980).

The model can be estimated by Bayesian Markov Chain Monte Carlo (MCMC) or frequentist maximum likelihood methods. The correlation parameter ρ can be estimated or fixed. The MCMC algorithm used is a block Gibbs sampler within Metropolis-Hastings scheme developed by Rajbhandari (2014). The default maximum likelihood method is based off the Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm. A modification of the algorithm is used to include box constraints for when ρ is estimated. See optimr for details.

### Usage

```BiProbitPartial(formula, data, subset, na.action,
philosophy = "bayesian", control = list())
```

### Arguments

 `formula` an object of class Formula: a symbolic description of the model to be fitted. The details of model specification are given under 'Details'. `data` an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in `data`, the variables are taken from `environment(formula)`, typically the environment from which `BiProbitPartial` is called. `subset` an optional vector specifying a subset of observations to be used in the fitting process. `na.action` a function which indicates what should happen when the data contain `NA` observations. The default is set by the `na.action` setting of options, and is na.fail if that is unset. The 'factory-fresh' default is na.omit. Another possible value is `NULL`, no action. Value na.exclude can be useful. `philosophy` a character string indicating the philosophy to be used for estimation. For Bayesian MCMC estimation ```philosophy = "bayesian"``` should be used. For frequentist maximum likelihood estimation `philosophy = "frequentist"` should be used. The default is Bayesian MCMC estimation. `control` a list of control parameters. See 'Details'.

### Details

Models for `BiProbitPartial` are specified symbolically. A typical model has the form `response ~ terms1 | terms2` where `response` is the name of the (numeric binary) response vector and `terms1` and `terms2` are each a series of terms which specifies a linear predictor for latent response equations 1 and 2. A `terms1` specification of the form `first + second` indicates all the terms in `first` together with all the terms in `second` with duplicates removed. A specification of the form `first:second` indicates the set of terms obtained by taking the interactions of all terms in `first` with all terms in `second`. The specification `first*second` indicates the cross of `first` and `second`. This is the same as `first + second + first:second`. Likewise for `terms2`.

A Formula has an implied intercept term for both equations. To remove the intercept from equation 1 use either `response ~ terms1 - 1 | terms2` or `response ~ 0 + terms1 | terms2`. It is analgous to remove the intercept from the equation 2.

If `philosophy = "bayesian"` is specified then the model is estimated by MCMC methods based on Rajbhandari (2014). The prior for the parameters in equations 1 and 2 is multivariate normal with mean `beta0` and covariance `B0`. The prior for ρ is truncated normal on the interval [-1,1] with mean parameter `rho0` and variance parameter `v0` and is assumed to be apriori independent of the parameters in equations 1 and 2.

If `philosophy = "frequentist"` then the model is estimated by frequentist maximum likelihood using `optimr` from the package optimr.

The `control` argument is a list that can supply the tuning parameters of the Bayesian MCMC estimation and frequentist maximum likelihood estimation algorithms. For frequentist maximum likelihood the `control` argument is passed directly to `control` in the function `optimr` from the package optimr. If one wants to specify the `method` for the function `optimr` then `method` must be passed as an element of `control`. See optimr for further details.

The `control` argument can supply any of the following components for Bayesian MCMC estimation.

beta

Numeric vector or list of `nchains` elements each a numeric vector supplying starting values for the coefficients in equations 1 and 2. For each vector, the first k[1] values are for the coefficients in the first equation. The second k[2] values are for the coefficients in the second equation. Default is `beta = numeric( k1 + k2 )`, a vector of zeros.

rho

Numeric or list of `nchains` elements each a numeric starting value for ρ. Default is `rho = 0`.

fixrho

Logical value to determine if ρ is estimated. If `fixrho = TRUE` then ρ is fixed at value `rho`. Default is `fixrho = FALSE`.

S

Number of MCMC iterations. Default is `S = 1000`. For `philosophy = "bayesian"` only.

burn

Number of initial pre-thinning MCMC iterations to remove after estimation. Default is `burn = floor(S/2)`, the floor of the number of MCMC iterations divided by 2. For `philosophy = "bayesian"` only.

thin

Positive integer to keep every `thin` post-burn in MCMC draw and drop all others. Default is `thin = 1`, keep all post burn-in draws. For `philosophy = "bayesian"` only.

seed

Positive integer for `nchains = 1` or list of `nchains` elements each a positive integer fixing the seed of the random number generator. Typically used for replication. Default is `seed = NULL`, no seed. For `philosophy = "bayesian"` only.

nchains

Positive integer specifying the number of MCMC chains. Default is `nchains = 1`. For `philosophy = "bayesian"` only.

beta0

Numeric vector supplying the prior mean for the coefficients of equations 1 and 2. The first k[1] components are for the coefficients of equation 1. The second k[2] components are for the coefficients of equation 2. Default is `beta0 = numeric( k1 + k2 )`, a vector of zeros. For `philosophy = "bayesian"` only.

B0

Numeric matrix supplying the prior covariace of the parameters of equations 1 and 2. The first k[1] rows are for the parameters of equation 1. The second k[2] rows are for the parameters of equation 2. Likewise for columns. If unspecified the default is set such that the inverse of B0 is a zero matrix of dimension (k[1]+k[2]) x (k[1]+k[2]), a 'flat' prior. For `philosophy = "bayesian"` only.

rho0

Numeric value supplying a prior parameter for ρ which is the mean of a normal distribution that is truncated to the interval [-1,1]. Default is `rho0 = 0`. For `philosophy = "bayesian"` only.

v0

Numeric value supplying a prior parameter for ρ which is the variance of a normal distribution that is truncated to the interval [-1,1]. Default is `v0 = 1`. For `philosophy = "bayesian"` only.

nu

Numeric degrees of freedom parameter for setting the degrees of freedom for ρ's proposal t-distribution. Default is `nu = 10`.

tauSq

Numeric scaling parameter for scaling ρ's proposal t-distribution. Default is `tauSq = 1`.

P

Determines how aggressive proposal draws for ρ are. Set to `P = 0` normal or `P = -1` for aggresive. See Rajbhandari (2014) and for details. Default is `P = 0`. For ```philosophy = "bayesian"``` only.

trace

Numeric value determining the value of intermediate reporting. A negative value is no reporting, larger positive values provide higher degrees of reporting.

Note: If the Bayesian MCMC chains appear to not be converging and/or frequentist maximum likelihood produces errors with `summary`, the model may be unidentified. One possible solution is to add regressors to the first equation that are exluded from the second equation or visa-versa. See Poirier (1980) for more details.

### Value

`BiProbitPartial` returns an S x (k[1]+k[2]+1) x nchains array of MCMC draws of primary class `mcmc.list` and secondary class `BiProbitPartialb`, if `philosophy = "bayesian"`. Each element in the first dimension represents a MCMC draw. The first k[1] elements in the second dimension are draws for the coefficientss in the first equation. The next k[2] elements of the second dimension are draws for the coefficients in the second equation. The last element of the second dimension are draws for the correlation parameter. The elements of the third dimension are the chains. If ρ was fixed (`fixrho = TRUE`) then each draw for the last element in the second dimension is returned as the value it was fixed at (the starting value, `rho`).

If `philosophy = "frequentist"` a list equivalent to the output optimr with primary class `optimrml` and secondary class `BiProbitPartialf`.

### Author(s)

`BiProbitPartial` was written by Michael Guggisberg. The majority of the MCMC estimation was written by Amrit Romana based on Rajbhandari (2014). The development of this package was partially funded by the Institude for Defense Analyses (IDA).

### References

Poirier, Dale J. (1980). "Partial Observability in bivariate probit models" Journal of Econometrics 12, 209-217. (Identification)

Rajbhandari, Ashish (2014). "Identification and MCMC estimation of bivariate probit model with partial observability." Bayesian Inference in Social Sciences (eds I. Jeliazkov and X. Yang). (MCMC algorithm)

### Examples

```data('Mroz87',package = 'sampleSelection')
Mroz87\$Z = Mroz87\$lfp*(Mroz87\$wage >= 5)

f1 = BiProbitPartial(Z ~ educ + age + kids5 + kids618 + nwifeinc | educ + exper + city,
data = Mroz87, philosophy = "frequentist")
summary(f1)

# Use the estimates from the frequenist philosophy as starting values
b1 = BiProbitPartial(Z ~ educ + age + kids5 + kids618 + nwifeinc | educ + exper + city,
data = Mroz87, philosophy = "bayesian",
control = list(beta = f1\$par[1:(length(f1\$par)-1)], rho = tail(f1\$par,1)))
summary(b1)

## Not run: #The example used in the package sampleSelection is likely unidentified for
this model
f2 = BiProbitPartial(Z ~ educ + age + kids5 + kids618 + nwifeinc | educ,
data = Mroz87, philosophy = "frequentist") #crashes
summary(f2) #crashes (f2 non-existent)

# Bayesian methods typically still work for unidentified models
b2 = BiProbitPartial(Z ~ educ + age + kids5 + kids618 + nwifeinc | educ,
data = Mroz87, philosophy = "bayesian",
control = list(beta = f1\$par[1:(length(f1\$par)-3)], rho = tail(f1\$par,1)))
summary(b2)

## End(Not run)

```

[Package BiProbitPartial version 1.0.3 Index]