mvProbit {mvProbit}R Documentation

Estimation of Multivariate Probit Models

Description

Estimating multivariate probit models by the maximum likelihood method.

WARNING: this function is experimental and extremely (perhaps even unusably) slow!

Usage

mvProbit( formula, data, start = NULL, startSigma = NULL, 
   method = "BHHH", finalHessian = "BHHH", 
   algorithm = "GHK", nGHK = 1000, 
   intGrad = TRUE, oneSidedGrad = FALSE, eps = 1e-6, 
   random.seed = 123, ... )

## S3 method for class 'mvProbit'
print( x, digits = 4, ... )

Arguments

formula

a "formula": a symbolic description of the model (currently, all binary outcome variables must have the same regressors).

data

a data.frame containing the data.

start

an optional numeric vector specifying the starting values for the model coefficients; if argument startSigma is not specified, this vector can also include the correlation coefficients; the order of elements is explained in the section “details”; if this argument is not specified, coefficients estimated by univariate probit models are used as starting values for the model coefficients.

startSigma

optional starting values for the covariance/correlation matrix of the residuals (must be symmetric and have ones on its diagonal); if this argument is not specified and the starting values for the correlation coefficients are not included in argument start, the correlation matrix of the ‘response’ residuals, i.e. y - pnorm( X' beta ), is used as starting values for sigma.

method

maximisation method / algorithm (see maxLik).

finalHessian

Calculation of the final Hessian: either FALSE (no calculation of Hessian), TRUE (finite-distance calculation of Hessian), or "BHHH" (calculation based on information equality approach and finite-distance gradients, the default).

algorithm

algorithm for computing integrals of the multivariate normal distribution, either function GenzBretz(), Miwa(), or TVPACK() (see documentation of pmvnorm) or character string "GHK" (see documentation of ghkvec).

nGHK

numeric value specifying the number of simulation draws of the GHK algorithm for computing integrals of the multivariate normal distribution.

intGrad

logical. If TRUE, the computation of the gradients with respect to the estimated parameters is done internally in function mvProbitLogLik when it computes the log-likelihood values. If the optimization method requires gradients and this argument is FALSE, maxLik computes the gradients by numericGradient, which is usually slower than the calculation in function mvProbitLogLik. This argument should be set to FALSE if an optimisation algorithm is used that is not based on gradients.

oneSidedGrad

logical. If this argument and argument intGrad are both TRUE, the gradients of the log-likelihood function with respect to the estimated parameters are obtained by one-sided numeric finit-difference differentiation, which is faster but less precise than two-sided numeric finit-difference differentiation.

eps

numeric. The step size for the one-sided numeric finit-distance differentiation. Unfortunately, it is currently not possible to set the step size for the two-sided numeric finit-distance differentiation.

random.seed

an integer used to seed R's random number generator; this is to ensure replicability when computing (cumulative) probabilities of the multivariate normal distribution which is required to calculate the log likelihood values; set.seed( random.seed ) is called each time before a (cumulative) probability of the multivariate normal distribution is computed; defaults to 123.

x

object of class mvProbit (returned by mvProbit).

digits

positive integer specifiying the minimum number of significant digits to be printed (see print.default).

...

additional arguments to mvProbit are passed to maxLik and pmvnorm; additional arguments to print.mvProbit are currently ignored.

Details

It is possible to specify starting values (a) both for the model coefficients and the correlation coefficients (using argument start alone or arguments start and startSigma together), (b) only for the model coefficients (using argument start alone), or (c) only for the correlation coefficients (using argument startSigma alone).

If the model has n dependent variables (equations) and k explanatory variables in each equation, the order of the starting values in argument start must be as follows: b_{1,1}, ..., b_{1,k}, b_{2,1}, ..., b_{2,k}, ..., b_{n,1}, ..., b_{n,k}, where b_{i,j} is the coefficient of the jth explanatory variable in the ith equation. If argument startSigma is not specified, argument start can additionally include following elements: R_{1,2}, R_{1,3}, R_{1,4}, ..., R_{1,n}, R_{2,3}, R_{2,4}, ..., R_{2,n}, ..., R_{n-1,n}, where R_{i,j} is the correlation coefficient corresponding to the ith and jth equation.

The ‘state’ (or ‘seed’) of R's random number generator is saved at the beginning of the mvProbit function and restored at the end of this function so that this function does not affect the generation of random numbers outside this function although the random seed is set to argument random.seed and the calculation of the (cumulative) multivariate normal distribution uses random numbers.

Value

mvProbit returns an object of class "mvProbit" inheriting from class "maxLik". The returned object contains the same components as objects returned by maxLik and additionally the following components:

call

the matched call.

start

the vector of starting values.

nDep

the number of dependent variables.

nReg

the number of explanatory variables (regressors).

nObs

the number of observations.

dummyVars

vector of character strings indicating the names of explanatory variables that contain only zeros and ones or only TRUE and FALSE. It is NULL, if no explanatory variable is indentified as a dummy variable.

Author(s)

Arne Henningsen

References

Greene, W.H. (1996): Marginal Effects in the Bivariate Probit Model, NYU Working Paper No. EC-96-11. Available at https://www.ssrn.com/abstract=1293106.

See Also

mvProbitLogLik, mvProbitMargEff, probit, glm

Examples

## generate a simulated data set
set.seed( 123 )
# number of observations
nObs <- 50

# generate explanatory variables
xMat <- cbind( 
   const = rep( 1, nObs ),
   x1 = as.numeric( rnorm( nObs ) > 0 ),
   x2 = rnorm( nObs ) )

# model coefficients
beta <- cbind( c(  0.8,  1.2, -0.8 ),
               c( -0.6,  1.0, -1.6 ),
               c(  0.5, -0.6,  1.2 ) )

# covariance matrix of error terms
library( miscTools )
sigma <- symMatrix( c( 1, 0.2, 0.4, 1, -0.1, 1 ) )

# generate dependent variables
yMatLin <- xMat %*% beta 
yMat <- ( yMatLin + rmvnorm( nObs, sigma = sigma ) ) > 0
colnames( yMat ) <- paste( "y", 1:3, sep = "" )

# estimation (BHHH optimizer and GHK algorithm)
estResult <- mvProbit( cbind( y1, y2, y3 ) ~ x1 + x2,
   data = as.data.frame( cbind( xMat, yMat ) ), iterlim = 1, nGHK = 50 )
summary( estResult )

# same estimation with user-defined starting values
estResultStart <- mvProbit( cbind( y1, y2, y3 ) ~ x1 + x2,
   start = c( beta ), startSigma = sigma, 
   data = as.data.frame( cbind( xMat, yMat ) ), iterlim = 1, nGHK = 50 )
summary( estResultStart )

[Package mvProbit version 0.1-10 Index]