mcmcSAR {PartialNetwork} R Documentation

Bayesian Estimator of SAR model

Description

mcmcSAR implements the Bayesian estimator of the linear-in-means SAR model when only the linking probabilities are available or can be estimated.

Usage

mcmcSAR(
  formula,
  contextual,
  start,
  G0.obs,
  G0 = NULL,
  mlinks = list(),
  hyperparms = list(),
  ctrl.mcmc = list(),
  iteration = 2000L,
  data
)

Arguments

formula

object of class formula: a symbolic description of the model. The formula should take a form such as y ~ x1 + x2 | x1 + x2, where y is the endogenous vector, the variables listed before the pipe (x1, x2) are the individual exogenous variables, and the variables listed after the pipe (x1, x2) are the contextual observable variables. Other possible formulas are y ~ x1 + x2 for the model without contextual effects, y ~ -1 + x1 + x2 | x1 + x2 for the model without an intercept, and y ~ x1 + x2 | x2 + x3 to allow the contextual variables to differ from the individual variables.
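As a rough illustration of how the two parts of such a formula are separated by the pipe, the following base R sketch pulls the variable names out of each side. It only inspects the formula object; it is not a reproduction of how mcmcSAR parses its input, and the object names are chosen for this example.

```r
# A formula whose contextual variables differ from the individual ones.
f <- y ~ x1 + x2 | x2 + x3

# The right-hand side is a call to `|` with two arguments.
rhs <- f[[3]]

# Variables before the pipe: individual exogenous variables.
individual_vars <- all.vars(rhs[[2]])

# Variables after the pipe: contextual variables.
contextual_vars <- all.vars(rhs[[3]])

individual_vars
contextual_vars
```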

contextual

(optional) logical; if TRUE, all individual variables are also used as contextual variables. Setting formula to y ~ x1 + x2 with contextual = TRUE is equivalent to setting formula to y ~ x1 + x2 | x1 + x2.

start

(optional) vector of starting values for the model parameters, ordered as (\beta' ~ \gamma' ~ \alpha ~ \sigma^2)', where \beta is the individual variables parameter, \gamma is the contextual variables parameter, \alpha is the peer effect parameter and \sigma^2 is the variance of the error term. If start is missing, a maximum likelihood estimator is used, with the network matrix given by the argument G0 (if provided) or generated from its distribution.

G0.obs

list of matrices (or simply matrix if the list contains only one matrix) indicating the part of the network data which is observed. If the (i,j)-th element of the m-th matrix is one, then the element at the same position in the network data will be considered as observed and will not be inferred in the MCMC. In contrast, if the (i,j)-th element of the m-th matrix is zero, the element at the same position in the network data will be considered as a starting value of the missing link which will be inferred. G0.obs can also take "none" when no part of the network data is observed (equivalent to the case where all the entries are zeros) and "all" when the network data is fully observed (equivalent to the case where all the entries are ones).

G0

list of sub-network matrices (or simply network matrix if there is only one sub-network). G0 is made up of starting values for the entries with missing network data and observed values for the entries with observed network data. G0 is optional when G0.obs = "none".
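A minimal base R sketch of how an observation mask and a starting network fit together for a single small group is given below. All object names and the Bernoulli(0.5) starting draws are assumptions made for illustration; only the 0/1 meaning of the mask comes from the argument descriptions above.

```r
set.seed(1)
n <- 4  # hypothetical group of four individuals

# Network data with some known links (no self-links).
A_true <- matrix(c(0, 1, 0, 1,
                   1, 0, 0, 0,
                   0, 1, 0, 1,
                   1, 0, 1, 0), nrow = n, byrow = TRUE)

# Mask: 1 = entry is observed, 0 = entry is missing and must be inferred.
# Here only rows 1 and 2 are assumed to be observed.
obs_mask <- matrix(0, n, n)
obs_mask[1:2, ] <- 1

# Starting network: observed entries keep their values, missing entries
# receive an arbitrary starting value (a Bernoulli(0.5) draw here).
start_draw <- matrix(rbinom(n * n, 1, 0.5), n, n)
G0_start <- ifelse(obs_mask == 1, A_true, start_draw)
diag(G0_start) <- 0

G0_start
```

With several groups, G0.obs and G0 would each be a list holding one such matrix per group.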

mlinks

list specifying the network formation model (see Section Network formation model in Details).

hyperparms

(optional) list of hyperparameters (see Section Hyperparameters in Details).

ctrl.mcmc

list of MCMC controls (see Section MCMC control in Details).

iteration

number of MCMC steps to be performed.

data

optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If missing, the variables are taken from environment(formula), typically the environment from which mcmcSAR is called.

Details

Outcome model

The model is given by

\mathbf{y} = \mathbf{X}\beta + \mathbf{G}\mathbf{X}\gamma + \alpha \mathbf{G}\mathbf{y} + \epsilon.

where

\epsilon \sim N(0, \sigma^2).
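Solving the model for \mathbf{y} gives the reduced form \mathbf{y} = (\mathbf{I} - \alpha\mathbf{G})^{-1}(\mathbf{X}\beta + \mathbf{G}\mathbf{X}\gamma + \epsilon), which exists whenever \mathbf{I} - \alpha\mathbf{G} is invertible. The base R sketch below simulates one small group this way. It assumes that \mathbf{G} is row-normalised, that |\alpha| < 1, and that the intercept enters \mathbf{X}\beta but not \mathbf{G}\mathbf{X}\gamma; these are assumptions of the sketch, and the parameter values are arbitrary.

```r
set.seed(123)
n     <- 6
alpha <- 0.4
beta  <- c(2, 1, 1.5)   # intercept and two individual effects
gamma <- c(5, -3)       # two contextual effects
sigma <- 1

# Assumed network: random links, no self-links, rows normalised to sum to
# one (rows without any link are left as zeros).
A <- matrix(rbinom(n * n, 1, 0.5), n, n)
diag(A) <- 0
rs <- rowSums(A)
G  <- A / ifelse(rs > 0, rs, 1)

X   <- cbind(rnorm(n, 0, 5), rpois(n, 7))
eps <- rnorm(n, 0, sigma)

# Reduced form: y = (I - alpha * G)^{-1} (X beta + G X gamma + eps).
rhs <- cbind(1, X) %*% beta + G %*% X %*% gamma + eps
y   <- solve(diag(n) - alpha * G, rhs)

# Check that y satisfies the structural equation (zero up to rounding).
resid <- y - (cbind(1, X) %*% beta + G %*% X %*% gamma +
              alpha * G %*% y + eps)
max(abs(resid))
```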

The parameters to estimate in this model are the matrix \mathbf{G}, the vectors \beta and \gamma, and the scalars \alpha and \sigma^2. Prior distributions are assumed on \mathbf{A}, the adjacency matrix in which \mathbf{A}_{ij} = 1 if i is connected to j and \mathbf{A}_{ij} = 0 otherwise, and on \beta, \gamma, \alpha and \sigma^2:

\mathbf{A}_{ij} \sim Bernoulli(\mathbf{P}_{ij})

(\beta' ~ \gamma')'|\sigma^2 \sim \mathcal{N}(\mu_{\theta}, \sigma^2\Sigma_{\theta})

\zeta = \log\left(\frac{\alpha}{1 - \alpha}\right) \sim \mathcal{N}(\mu_{\zeta}, \sigma_{\zeta}^2)

\sigma^2 \sim IG(\frac{a}{2}, \frac{b}{2})

where \mathbf{P} is the matrix of linking probabilities. The linking probabilities are hyperparameters that can be held fixed or updated using a network formation model.
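To make the prior specification concrete, the following base R sketch draws once from each prior. The hyperparameter values and object names are hypothetical, and the inverse-gamma is taken in the shape/scale parameterisation (so that \sigma^2 is the reciprocal of a Gamma draw with shape a/2 and rate b/2), which is an assumption of this sketch.

```r
set.seed(42)
n <- 5

# Hypothetical hyperparameter values, chosen only for illustration.
P           <- matrix(0.3, n, n)   # linking probabilities
mu_zeta     <- 0
sd_zeta     <- 1
a           <- 4
b           <- 4
k           <- 3                   # length of (beta', gamma')'
mu_theta    <- rep(0, k)
Sigma_theta <- diag(k)

# A_ij ~ Bernoulli(P_ij), with no self-links.
A <- matrix(rbinom(n * n, 1, P), n, n)
diag(A) <- 0

# zeta ~ N(mu_zeta, sd_zeta^2); inverting zeta = log(alpha / (1 - alpha))
# gives alpha = exp(zeta) / (1 + exp(zeta)), which lies in (0, 1).
zeta  <- rnorm(1, mu_zeta, sd_zeta)
alpha <- plogis(zeta)

# sigma^2 ~ IG(a/2, b/2): reciprocal of a Gamma(shape = a/2, rate = b/2) draw.
sigma2 <- 1 / rgamma(1, shape = a / 2, rate = b / 2)

# (beta', gamma')' | sigma^2 ~ N(mu_theta, sigma^2 * Sigma_theta).
theta <- mu_theta + sqrt(sigma2) * drop(t(chol(Sigma_theta)) %*% rnorm(k))
```

The logit transform keeps \alpha inside the unit interval, so a normal prior can be placed on the unconstrained \zeta.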

Network formation model

The linking probability can be held fixed or updated using a network formation model. Information about how \mathbf{P} should be handled in the MCMC can be set through the argument mlinks, which should be a list with named elements. Several specifications of the network formation model are possible. The list assigned to mlinks should include an element named model. The expected values of model are "none" (default value), "logit", "probit", and "latent space".

Fixed network distribution

To set \mathbf{P} fixed, mlinks could contain,

Probit and Logit models

For the probit and logit specifications of the network formation model, the following elements can be declared in mlinks.

Latent space models

The following element could be declared in mlinks.

Hyperparameters

All the hyperparameters can be defined through the argument hyperparms (a list) and should be named as follows.

Prior variances are specified through their inverses in hyperparms so that noninformative priors can be used. Setting the inverse of a prior variance to 0 is equivalent to assuming a noninformative prior.
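The effect of a zero inverse prior variance can be seen in a generic conjugate normal linear regression, where the conditional posterior mean of the coefficients is (Z'Z + \Sigma^{-1})^{-1}(Z'y + \Sigma^{-1}\mu). The base R sketch below uses that textbook formula, not the package's sampler, and the names post_mean and inv_Sigma are hypothetical. With inv_Sigma equal to zero the prior mean drops out and the result coincides with ordinary least squares.

```r
set.seed(7)
n <- 50
Z <- cbind(1, rnorm(n))
y <- Z %*% c(1, 2) + rnorm(n)

# Conditional posterior mean under a N(mu, sigma^2 * Sigma) prior,
# parameterised by the inverse prior variance inv_Sigma.
post_mean <- function(Z, y, mu, inv_Sigma) {
  solve(crossprod(Z) + inv_Sigma, crossprod(Z, y) + inv_Sigma %*% mu)
}

mu <- c(10, 10)  # deliberately far from the true coefficients

# Noninformative prior: inverse variance set to zero.
flat <- post_mean(Z, y, mu, inv_Sigma = diag(0, 2))

# Very informative prior: large inverse variance pulls the mean towards mu.
strong <- post_mean(Z, y, mu, inv_Sigma = diag(1e6, 2))

# Ordinary least squares for comparison.
ols <- solve(crossprod(Z), crossprod(Z, y))

cbind(flat, ols, strong)
```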

MCMC control

During the MCMC, the jumping scales of \alpha and \rho are updated following Atchade and Rosenthal (2005) so that the acceptance rate approaches its target value. This requires setting minimal and maximal jumping scales through the argument ctrl.mcmc, a list which can contain the following named components.
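The adaptation idea can be sketched on a toy problem. In the base R example below, the jumping scale of a random-walk Metropolis sampler for a standard normal target is nudged up after an acceptance and down after a rejection, with a step that shrinks over time, and is then clipped to a fixed interval. The function update_scale, its argument names, its default values, and the use of the acceptance indicator rather than the acceptance probability are all assumptions of this sketch and do not describe the package's internal code.

```r
# Hypothetical helper: one adaptation step for a random-walk jumping scale.
update_scale <- function(scale, accepted, t, target = 0.44,
                         jump_min = 1e-5, jump_max = 10, decay = 0.6) {
  # Move the scale in the direction that brings acceptance closer to target.
  proposal <- scale + (accepted - target) / t^decay
  # Project back onto [jump_min, jump_max].
  min(max(proposal, jump_min), jump_max)
}

# Toy target: standard normal, sampled by random-walk Metropolis.
set.seed(2024)
x      <- 0
scale  <- 5
n_iter <- 5000
n_acc  <- 0
for (t in seq_len(n_iter)) {
  cand     <- x + scale * rnorm(1)
  log_rat  <- dnorm(cand, log = TRUE) - dnorm(x, log = TRUE)
  accepted <- log(runif(1)) < log_rat
  if (accepted) {
    x     <- cand
    n_acc <- n_acc + 1
  }
  scale <- update_scale(scale, accepted, t)
}

acc_rate <- n_acc / n_iter
c(final_scale = scale, acceptance_rate = acc_rate)
```

Clipping the scale is what makes the minimal and maximal jumping scales necessary: without bounds, a long run of rejections or acceptances could drive the scale to zero or to very large values.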

If block.max > 1, several entries are randomly chosen from the same row and updated simultaneously. The number of entries is drawn at random between 1 and block.max. In addition, the entries are not chosen in order. For example, on row i, the entries (i, 5) and (i, 9) can be updated simultaneously, then the entries (i, 1), (i, 3), (i, 8), and so on.
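The block selection described above can be mimicked in a few lines of base R. The sketch below partitions the off-diagonal columns of one row into random blocks of random size; the variable names, the exclusion of the diagonal entry, and the uniform draw of the block size are assumptions made for illustration rather than a description of the package's implementation.

```r
set.seed(99)
n         <- 10  # group size (hypothetical)
i         <- 3   # row being updated
block_max <- 4   # plays the role of block.max

# Candidate columns on row i (the diagonal entry is excluded here).
remaining <- setdiff(seq_len(n), i)
blocks    <- list()

while (length(remaining) > 0) {
  # Block size drawn between 1 and block_max, capped at what is left.
  size <- min(sample.int(block_max, 1), length(remaining))
  # Pick the block's columns at random, not in order.
  pick <- remaining[sample.int(length(remaining), size)]
  blocks[[length(blocks) + 1]] <- sort(pick)
  remaining <- setdiff(remaining, pick)
}

# Each list element holds the columns of row i updated together.
blocks
```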

Value

A list consisting of:

n.group

number of groups.

N

vector of group sizes.

time

elapsed time to run the MCMC, in seconds.

iteration

number of MCMC steps performed.

posterior

matrix (or list of matrices) containing the simulations.

hyperparms

return value of hyperparms.

mlinks

return value of mlinks.

accept.rate

acceptance rates.

prop.net

proportion of observed network data.

method.net

network formation model specification.

start

starting values.

formula

input value of formula and mlinks.formula.

contextual

input value of contextual.

ctrl.mcmc

return value of ctrl.mcmc.

See Also

smmSAR, sim.IV

Examples


library(PartialNetwork)
# We assume that the network is fully observed
# See our vignette for examples where the network is partially observed
# Number of groups
M             <- 50
# size of each group
N             <- rep(30,M)
# individual effects
beta          <- c(2,1,1.5)
# contextual effects
gamma         <- c(5,-3)
# endogenous effects
alpha         <- 0.4
# std-dev errors
se            <- 1
# prior distribution
prior         <- runif(sum(N*(N-1)))
prior         <- vec.to.mat(prior, N, normalise = FALSE)
# covariates
X             <- cbind(rnorm(sum(N),0,5),rpois(sum(N),7))
# true network
G0            <- sim.network(prior)
# normalise
G0norm        <- norm.network(G0)
# simulate the dependent variable using an external package (CDatanet)
y             <- CDatanet::simsar(~ X, contextual = TRUE, Glist = G0norm,
                                  theta = c(alpha, beta, gamma, se))
y             <- y$y
# dataset
dataset       <- as.data.frame(cbind(y, X1 = X[,1], X2 = X[,2]))
out.none1     <- mcmcSAR(formula = y ~ X1 + X2, contextual = TRUE, G0.obs = "all",
                         G0 = G0, data = dataset, iteration = 1e4)
summary(out.none1)
plot(out.none1)
plot(out.none1, plot.type = "dens")


[Package PartialNetwork version 1.0.4 Index]