Generation of a random dataset with a spatial SUR structure.


The purpose of the function dgp_spsur is to generate a random dataset with the dimensions and spatial structure decided by the user. This function may be useful in pure simulation experiments or with the aim of showing specific properties and characteristics of a spatial SUR dataset and inferential procedures related to them.

The user of dgp_spsur should think in terms of a Monte Carlo experiment. The arguments of the function specify the dimensions of the dataset to be generated, the spatial mechanism underlying the data, the intensity of the SUR structure among the equations and the values of the parameters used to obtain the simulated data, which include the error terms, the regressors and the explained variables.


dgp_spsur(Sigma, Tm = 1, G, N, Betas, Thetas = NULL, 
                 rho = NULL, lambda = NULL, p = NULL, listw = NULL, 
                 X = NULL, type = "matrix", pdfU = "nvrnorm", 
                 pdfX = "nvrnorm")



Sigma: Covariance matrix between the G equations of the SUR model. This matrix should be positive definite, and it is the user's responsibility to check that it is.
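Because the check is left to the user, the following is a minimal sketch of one way to test a candidate matrix in base R before passing it as Sigma. The helper name is_pos_def and the tolerance tol are illustrative assumptions and are not part of spsur.

```r
# Hypothetical helper (not part of spsur): returns TRUE when Sigma is a
# square, symmetric matrix whose eigenvalues are all strictly positive.
is_pos_def <- function(Sigma, tol = 1e-8) {
  if (!is.matrix(Sigma) || nrow(Sigma) != ncol(Sigma)) return(FALSE)
  if (!isSymmetric(Sigma)) return(FALSE)
  ev <- eigen(Sigma, symmetric = TRUE, only.values = TRUE)$values
  all(ev > tol)
}

G <- 3
Sigma <- matrix(0.3, ncol = G, nrow = G)
diag(Sigma) <- 1
is_pos_def(Sigma)                          # equicorrelated matrix used in Example 1
is_pos_def(matrix(1, nrow = G, ncol = G))  # all-ones matrix is singular
```

An eigenvalue test is used here rather than chol() because it reports failure as a logical value instead of stopping with an error.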


Tm: Number of time periods. Default = 1.


G: Number of equations.


N: Number of cross-section or spatial units.


Betas: A row vector of order (1xP) with the values of the beta coefficients. The first P_1 terms correspond to the first equation (where the first element is the intercept), the next P_2 terms to the coefficients of the second equation, and so on.
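The stacking order of the coefficient vector can be illustrated with a short base R sketch. It only shows how a vector laid out this way maps to equations; the names eq_index and betas_by_eq are illustrative, and nothing here is claimed about how dgp_spsur handles the vector internally.

```r
# Illustrative values: G = 3 equations with 3 regressors each
# (intercept included), as in Example 1.
G <- 3
p <- c(3, 3, 3)
Betas <- c(1, 2, 3, 1, -1, 0.5, 1, -0.5, 2)

# The vector must contain exactly P = sum(p) coefficients.
stopifnot(length(Betas) == sum(p))

# Label each coefficient with the equation it belongs to, then split.
eq_index <- rep(seq_len(G), times = p)
betas_by_eq <- split(Betas, eq_index)
names(betas_by_eq) <- paste0("eq", seq_len(G))

betas_by_eq$eq2     # coefficients of the second equation
betas_by_eq$eq2[1]  # its intercept
```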


Thetas: Values for the θ coefficients in the G equations of the model, when the type of spatial SUR model to be simulated is "slx", "sdm" or "sdem". Thetas is a row vector of order (1xPTheta), where PTheta = P - G, because the intercept cannot appear among the spatial lags of the regressors. The first PTheta_1 terms correspond to the first equation, the next PTheta_2 terms to the second equation, and so on. Default = NULL.


rho: Values of the coefficients ρ_g, g = 1, 2, ..., G, related to the spatial lag of the explained variable of the g-th equation. If rho is a scalar and there are G equations in the model, the same value will be used for all the equations. If rho is a row vector of order (1xG), the function dgp_spsur will use these values, one for each equation. Default = NULL.


lambda: Values of the coefficients λ_g, g = 1, 2, ..., G, related to the spatial lag of the errors in the G equations. If lambda is a scalar and there are G equations in the model, the same value will be used for all the equations. If lambda is a row vector of order (1xG), the function dgp_spsur will use these values, one for the spatial errors of each equation. Default = NULL.


p: Number of regressors by equation, including the intercept. p can be a row vector of order (1xG), if the number of regressors is not the same for all the equations, or a scalar, if the G equations have the same number of regressors.


listw: A listw object created, for example, by nb2listw from the spdep package. It can also be a spatial weighting matrix of order (NxN) instead of a listw object. Default = NULL.


X: This argument tells the function dgp_spsur which X matrix should be used to generate the SUR dataset. If X is different from NULL, dgp_spsur will use the X matrix supplied in this argument. Note that X must be consistent with the dimensions of the model. If X is NULL, dgp_spsur will generate the matrix of regressors from a multivariate Normal distribution with mean zero and identity (PxP) covariance matrix. As an alternative, the user may change this probability distribution function to the uniform case, U(0,1), through the argument pdfX. Default = NULL.


type: Selection of the type of output. The alternatives are "matrix", "df", "panel" and "all". Default = "matrix".


pdfU: Multivariate probability distribution function (Mpdf) from which the values of the error terms will be drawn. The covariance matrix is the Σ matrix specified by the user in the argument Sigma. The function dgp_spsur provides two Mpdf: the multivariate Normal, which is the default, and the log-Normal, which simply exponentiates the sample drawn from a N(0,Σ) distribution. The two alternatives are "lognvrnorm" and "nvrnorm". Default = "nvrnorm".


pdfX: Multivariate probability distribution function (Mpdf) from which the values of the regressors will be drawn. The regressors are assumed to be independent. dgp_spsur provides two Mpdf: the multivariate Normal, which is the default, and the uniform in the interval U[0,1], using the dunif function from the stats package. The two alternatives are "nvrunif" and "nvrnorm". Default = "nvrnorm".


The purpose of the function dgp_spsur is to generate random datasets, of a SUR nature, with the spatial structure decided by the user. The function requires certain information to be supplied externally because, in fact, dgp_spsur constitutes a Data Generation Process, DGP. The following aspects should be addressed:

dgp_spsur provides two multivariate distribution functions for the errors, namely the Normal and the log-Normal (the second should be taken as a clear departure from the standard assumption of normality). In both cases, random matrices of order (TmNxG) are obtained from a multivariate normal distribution with mean zero and the covariance matrix specified in the argument Sigma; this matrix is then exponentiated for the log-Normal case. Roughly the same procedure applies for drawing the values of the regressors. There are two distribution functions available, the Normal and the uniform in the interval U[0,1]; the regressors are always independent.
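The error-drawing step described above can be sketched in base R as follows. This is only an illustration under stated assumptions: it uses a Cholesky factor to impose the covariance Sigma, whereas dgp_spsur may rely on a different routine internally, and the object names Z, U_norm and U_lognorm are illustrative.

```r
set.seed(123)  # for reproducibility of the illustration

Tm <- 1    # number of time periods
G  <- 3    # number of equations
N  <- 200  # number of spatial units
Sigma <- matrix(0.3, ncol = G, nrow = G)
diag(Sigma) <- 1

# chol() returns an upper triangular R with t(R) %*% R equal to Sigma.
R <- chol(Sigma)

# Independent standard Normal draws, one row per observation.
Z <- matrix(rnorm(Tm * N * G), nrow = Tm * N, ncol = G)

# Each row of Z %*% R has mean zero and covariance Sigma.
U_norm <- Z %*% R

# Log-Normal case: exponentiate the Normal draws element by element.
U_lognorm <- exp(U_norm)

dim(U_norm)            # (Tm * N) rows and G columns
round(cov(U_norm), 2)  # sample covariance, to be compared with Sigma
```

With a finite sample the empirical covariance only approximates Sigma, and the exponentiated draws no longer have mean zero or covariance Sigma.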


The default output ("matrix") is a list with a vector Y of order (TmNGx1), containing the values generated for the explained variable in the G equations of the SUR, and a matrix XX of order (TmNGxsum(p)), containing the values generated for the regressors of the SUR, including an intercept for each equation.

When Tm = 1 or G = 1, several alternative outputs can be selected through the argument type.


Fernando Lopez fernando.lopez@upct.es
Roman Minguez roman.minguez@uclm.es
Jesus Mur jmur@unizar.es


## VIP: The output of the whole set of the examples can be examined 
## by executing demo(demo_dgp_spsur, package="spsur")

### PANEL DATA (Tm = 1 or G = 1)              ##

#### Example 1: DGP SLM model. G equations
rm(list = ls()) # Clean memory
Tm <- 1 # Number of time periods
G <- 3 # Number of equations
N <- 200 # Number of spatial elements
p <- 3 # Number of independent variables
Sigma <- matrix(0.3, ncol = G, nrow = G)
diag(Sigma) <- 1
Betas <- c(1, 2, 3, 1, -1, 0.5, 1, -0.5, 2)
rho <- 0.5 # level of spatial dependence
lambda <- 0.0 # spatial autocorrelation error term = 0
##  random coordinates
co <- cbind(runif(N,0,1),runif(N,0,1))
lw <- spdep::nb2listw(spdep::knn2nb(spdep::knearneigh(co, k = 5,
                                                   longlat = FALSE)))
DGP <- dgp_spsur(Sigma = Sigma, Betas = Betas,
                 rho = rho, lambda = lambda, Tm = Tm,
                 G = G, N = N, p = p, listw = lw)

SLM <- spsurml(X = DGP$X, Y = DGP$Y, Tm = Tm, N = N, G = G, 
               p = c(3, 3, 3), listw = lw, type = "slm") 


#### DGP SLM model with Tm = 10 time periods and G = 3 equations
rm(list = ls()) # Clean memory
Tm <- 10 # Number of time periods
G <- 3 # Number of equations
N <- 100 # Number of spatial elements
p <- 3 # Number of independent variables
Sigma <- matrix(0.5, ncol = G, nrow = G)
diag(Sigma) <- 1
Betas <- rep(1:3, G)
rho <- c(0.5, 0.1, 0.8)
lambda <- 0.0 # spatial autocorrelation error term = 0
## random coordinates
co <- cbind(runif(N,0,1),runif(N,0,1))
lw <- spdep::nb2listw(spdep::knn2nb(spdep::knearneigh(co, k = 5,
                                                   longlat = FALSE)))
DGP4 <- dgp_spsur(Sigma = Sigma, Betas = Betas, rho = rho, 
                  lambda = lambda, Tm = Tm, G = G, N = N, p = p, 
                  listw = lw)
SLM4  <- spsurml(Y = DGP4$Y, X = DGP4$X, G = G, N = N, Tm = Tm,
                 p = p, listw = lw, type = "slm")

