R: Fit a spatial probit model.

ProbitSpatialFit {ProbitSpatial}

R Documentation

Fit a spatial probit model.

Description

Approximate likelihood estimation of the probit model with spatial autoregressive (SAR), spatial error (SEM), spatial autoregressive with autoregressive disturbances (SARAR).

Usage

ProbitSpatialFit(formula,data,W,
         DGP='SAR',method="conditional",varcov="varcov",
         M=NULL,control=list())

Arguments

`formula`	an object of class `formula`: a symbolic description of the model to be fitted.
`data`	the data set containing the variables of the model.
`W`	the spatial weight matrix of class `"dgCMatrix"`.
`DGP`	the data generating process of `data`: SAR, SEM, SARAR (Default is SAR).
`method`	the optimisation method: `"conditional"` or `"full-lik"` (Defaul is `"conditional"`, see Details).
`varcov`	the likelihood function is computed using the variance-covariance matrix (`"varcov"`) or the precision matrix (`"precision"`)? Default is `"varcov"`.
`M`	the second spatial weight matrix for SARAR models. Same class as W.
`control`	a list of control parameters. See Details.

Details

The estimation is based on the approximate value of the true likelihood of spatial probit models. The DGP of the spatial autoregressive model (SAR) model is the following

y = \rho Wy + X\beta + \epsilon,

where the disturbances \epsilon are iid standard normally distributed, W is a sparse spatial weight matrix and \rho is the spatial lag parameter. The variance of the error term is equal to \Sigma=\sigma^2((I_n-\rho W)^{-1}((I_n-\rho W)^{-1})^{t}). The DGP of the spatial error model (SEM) is as follows

y = X\beta+u,

u = \rho W u + \epsilon,

where the disturbances \epsilon are iid standard normally distributed, W is a sparse spatial weight matrix and \rho is the spatial error parameter. The variance of the error term is equal to \Sigma=\sigma^2((I_n-\rho W)^{-1}((I_n-\rho W )^{-1})^{t}). The DGP of the spatial autoregressive model with autoregressive disturbances (SARAR) is as follows

y = \rho Wy + X\beta + u,

u = \lambda M u + \epsilon,

where the disturbances \epsilon are iid standard normally distributed, W and M are two sparse spatial weight matrix, while \rho and \lambda are the spatial lag and spatial error parameters, respectively. The variance of the error term is equal to \Sigma=\sigma^2((I_n-\rho W)^{-1}(I_n-\lambda M)^{-1}((I_n-\lambda M)^{-1})^{t}((I_n-\rho W)^{-1})^{t}).

The approximation is inspired by the Mendell-Elston approximation of the multivariante normal probabilities (see References). It makes use of the Cholesky decomposition of the variance-covariance matrix \Sigma.

The ProbitSpatialFit command estimates the model by maximising the approximate log-likelihood. We propose two optimisation method:

"conditional":: it relies on a standard probit estimation which applies to the model estimated conditional on \rho.
"full-lik":: it minimises the full-log-likelihood using the analytical gradient functions (only available for SAR and SEM specification). The optimisation is performed by means of the optim function with method = "BFGS".

In both cases a "conditional" estimation is performed. If method="conditional", then ProbitSpatialFit returns the results of this first estimation. In case method="full-lik", the function tries to improve the log-likelihood by means of a further exploration around the value of the parameters found by the conditional step. The conditional step is usually very accurate and particularly fast. The second step is more time consuming and does not always improve the results of the first step. We dissuade the user from using the full-likelihood method for sample sizes bigger than ten thousands, since the computation of the gradients is quite slow. Simulation studies reported in Martinetti and Geniaux (2017) prove that the conditional estimation is highly reliable, even if compared to the full-likelihood ones.

In order to reduce the computation time of the function ProbitSpatialFit, we propose a variant of the likelihood-function estimation that uses the inverse of the variance-covariance matrix (a.k.a. precision matrix). This variant applies to both the "conditional" and the "full-lik" methods and can be invoked by setting varcov="precision". Simulation studies reported in Martinetti and Geniaux (2017) suggest that the accuracy of the results with the precision matrix are sometimes worst than the one with the true variance-covariance matrix, but the estimation time is considerably reduced.

The control argument is a list that can supply any of the following components:

iW_CL: the order of approximation of (I_n-\rho W)^{-1} used in the "conditional" method. Default is 6, while 0 means no approximation (it uses exact inversion of matrixes, not suitable for big sample sizes). See Martinetti and Geniaux (2017) for further references.
iW_FL: the order of approximation of (I_n-\rho W)^{-1} used in the computation of the likelihood function for the "full-lik" method. Default is 0, meaning no approximation.
iW_FG: the order of approximation of (I_n-\rho W)^{-1} used in the computation of the gradient functions for the "full-lik" method. Default is 0, meaning no approximation.
reltol: relative convergence tolerance. It represents tol in optimize function for method="conditional" and reltol in optim function for method="full-lik". Default is 1e-5.
prune: the pruning value used in the gradients. Default is 0, meaning no pruning. Typacl values are around 1e-3 and 1e-6. They help reducing the estimation time of the gradient functions.
silent: Default is TRUE.

Value

Return an object of class ProbitSpatial.

References

Mendell and Elston (1974): N. Mendell and R. Elston. Multifactorial qualitative traits: genetic analysis and prediction of recurrence risks. Biometrics 30, 41–57, 1974.
Martinetti and Geniaux (2017): D. Martinetti and G. Geniaux. Approximate likelihood estimation of spatial probit models. Regional Science and Urban Economics 64, 30-45, 2017.

Examples


n <- 1000
nneigh <- 3
rho <- 0.5
beta <- c(4,-2,1)
W <- generate_W(n,nneigh,seed=123)
X <- cbind(1,rnorm(n,2,2),rnorm(n,0,1))
colnames(X) <- c("intercept","X1","X2")
y <- sim_binomial_probit(W=W,X=X,beta=beta,rho=rho,model="SAR")
d <- as.data.frame(cbind(y,X))
mod <- ProbitSpatialFit(y~X1+X2,d,W,
       DGP='SAR',method="conditional",varcov="varcov")

[Package ProbitSpatial version 1.1 Index]