simcdnet {CDatanet} R Documentation

Simulate data from Count Data Model with Social Interactions

Description

simcdnet is used to simulate count data with rational expectations (see details). The model is presented in Houndetoungan (2022).

Usage

simcdnet(
  formula,
  contextual,
  Glist,
  theta,
  deltabar,
  delta = NULL,
  rho = 0,
  tol = 1e-10,
  maxit = 500,
  data
)

Arguments

formula

an object of class formula: a symbolic description of the model. The formula should take the form, for example, y ~ x1 + x2 | x1 + x2, where y is the endogenous vector, the variables listed before the pipe (x1, x2) are the individual exogenous variables, and the variables listed after the pipe (x1, x2) are the contextual observable variables. Other valid formulas include y ~ x1 + x2 for the model without contextual effects, y ~ -1 + x1 + x2 | x1 + x2 for the model without an intercept, and y ~ x1 + x2 | x2 + x3 to let the contextual variables differ from the individual variables.

contextual

(optional) logical; if TRUE, all individual variables are also used as contextual variables. Setting formula to y ~ x1 + x2 and contextual to TRUE is equivalent to setting formula to y ~ x1 + x2 | x1 + x2.

Glist

the adjacency matrix, or a list of sub-network adjacency matrices.

theta

the true value of the vector \theta = (\lambda, \beta', \gamma')'. The parameter \gamma should be removed if the model does not contain contextual effects (see details).

deltabar

the true value of \bar{\delta}.

delta

the true value of the vector \delta = (\delta_2, ..., \delta_{\bar{R}}). If NULL, then \bar{R} is set to one and delta is empty.

rho

the true value of \rho.

tol

the tolerance value used in the Fixed Point Iteration Method to compute the expectation of y. The process stops if the L_1 distance between two consecutive values of the expectation of y is less than tol.

maxit

the maximal number of iterations in the Fixed Point Iteration Method.

data

an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which simcdnet is called.
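The stopping rule described for tol and maxit can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's internal code: the helper name fixed_point, the vector psi (standing in for x_i'beta + g_i X gamma), and the truncated cut points a are all hypothetical. The update uses E(y_i) = sum_r P(y_i^* >= a_r), which follows from the latent-variable model in Details when epsilon_i ~ N(0, 1).

```r
# Hypothetical helper: iterate yb <- F(yb) until the L1 distance between
# two consecutive iterates is below tol, or maxit iterations are reached.
fixed_point <- function(G, psi, lambda, a, tol = 1e-10, maxit = 500) {
  yb <- rep(0, length(psi))                      # starting value
  for (it in seq_len(maxit)) {
    z      <- lambda * as.vector(G %*% yb) + psi # mean of the latent variable
    yb_new <- rowSums(pnorm(outer(z, a, "-")))   # sum_r P(y* >= a_r), truncated
    dist   <- sum(abs(yb_new - yb))              # L1 distance
    yb     <- yb_new
    if (dist < tol) break
  }
  list(yb = yb, iteration = it, converged = dist < tol)
}

# Toy inputs (assumptions for illustration only)
set.seed(1)
n      <- 50
G      <- matrix(rbinom(n * n, 1, 0.1), n, n); diag(G) <- 0
rs     <- rowSums(G); rs[rs == 0] <- 1
G      <- G / rs                                 # row-normalized network
psi    <- rnorm(n)
lambda <- 0.4
a      <- cumsum(c(0, rep(1, 29)))               # cut points 0, 1, ..., 29

res    <- fixed_point(G, psi, lambda, a)
res$iteration
```

With cut-point increments no smaller than lambda, as in this toy setup, the update behaves like a contraction, so the loop stops well before maxit.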

Details

Following Houndetoungan (2022), the count data \mathbf{y} is generated from a latent variable \mathbf{y}^*. The latent variable is given for all i as

y_i^* = \lambda \mathbf{g}_i \mathbf{E}(\bar{\mathbf{y}}|\mathbf{X},\mathbf{G}) + \mathbf{x}_i'\beta + \mathbf{g}_i\mathbf{X}\gamma + \epsilon_i,

where \epsilon_i \sim N(0, 1).
Then, y_i = r iff a_r \leq y_i^* \leq a_{r+1}, where a_0 = -\infty, a_1 = 0, a_r = \sum_{k = 1}^r\delta_k. The parameters are subject to the constraints \delta_r \geq \lambda if 1 \leq r \leq \bar{R}, and \delta_r = (r - \bar{R})^{\rho}\bar{\delta} + \lambda if r \geq \bar{R} + 1.
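As a worked illustration of the cut points, the following base R sketch builds a_1, ..., a_Rmax from lambda, deltabar, delta and rho, then maps latent values to counts. The helper name cut_points and the truncation point Rmax are hypothetical, and the sketch assumes \delta_1 = 0 so that a_1 = 0, with delta holding (\delta_2, ..., \delta_{\bar{R}}) as described in Arguments. It is not the package's code.

```r
# Hypothetical helper: cut points a_1, ..., a_Rmax
cut_points <- function(lambda, deltabar, delta = NULL, rho = 0, Rmax = 10) {
  Rbar   <- length(delta) + 1                    # delta = (delta_2, ..., delta_Rbar)
  r      <- seq_len(Rmax)
  r_tail <- r[r > Rbar]                          # indices beyond Rbar
  # Assumption: delta_1 = 0, so that a_1 = 0
  d_all  <- c(0, delta, (r_tail - Rbar)^rho * deltabar + lambda)
  cumsum(d_all[r])
}

a <- cut_points(lambda = 0.4, deltabar = 0.05,
                delta = c(1, 0.87, 0.75, 0.6), rho = 0, Rmax = 8)

# y_i = r when a_r <= y*_i < a_{r+1}; values below a_1 = 0 give y_i = 0
ystar <- c(-0.3, 0.5, 1.9, 4.0)
y     <- findInterval(ystar, a)
```

Because the cut points are truncated at Rmax, any latent value above a_Rmax is assigned the count Rmax in this sketch.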

Value

A list consisting of:

yst

\mathbf{y}^* (see details), the latent variable.

y

the observed count data.

yb

\bar{\mathbf{y}} (see details), the expectation of y.

Gyb

the average of the expectation of y among friends.

marg.effects

the marginal effects.

rho

the return value of rho.

Rmax

infinite sums in the marginal effects are approximated by sums up to Rmax.

iteration

number of iterations performed for each sub-network in the Fixed Point Iteration Method.

References

Houndetoungan, E. A. (2022). Count Data Models with Social Interactions under Rational Expectations. Available at SSRN 3721250, doi:10.2139/ssrn.3721250.

See Also

cdnet, simsart, simsar.

Examples


library(CDatanet)
set.seed(123) # arbitrary seed, fixed only for reproducibility

# Group sizes
M      <- 5 # Number of sub-groups
nvec   <- round(runif(M, 100, 1000))
n      <- sum(nvec)

# Parameters
lambda <- 0.4
beta   <- c(1.5, 2.2, -0.9)
gamma  <- c(1.5, -1.2)
delta  <- c(1, 0.87, 0.75, 0.6)
delbar <- 0.05
theta  <- c(lambda, beta, gamma)

# X
X      <- cbind(rnorm(n, 1, 1), rexp(n, 0.4))

# Network
Glist  <- list()

for (m in 1:M) {
  nm           <- nvec[m]
  Gm           <- matrix(0, nm, nm)
  max_d        <- 30
  for (i in 1:nm) {
    tmp        <- sample((1:nm)[-i], sample(0:max_d, 1))
    Gm[i, tmp] <- 1
  }
  rs           <- rowSums(Gm); rs[rs == 0] <- 1
  Gm           <- Gm/rs
  Glist[[m]]   <- Gm
}


# data
data    <- data.frame(x1 = X[,1], x2 =  X[,2])

rm(list = ls()[!(ls() %in% c("Glist", "data", "theta", "delta", "delbar"))])

# One-sided formula: y is simulated, so no response variable is supplied
ytmp    <- simcdnet(formula = ~ x1 + x2 | x1 + x2, Glist = Glist, theta = theta,
                    deltabar = delbar, delta = delta, rho = 0, data = data)

y       <- ytmp$y

# plot histogram
hist(y, breaks = max(y))

[Package CDatanet version 2.1.3 Index]