simsar {CDatanet} R Documentation

Simulate Data from the Linear-in-Means Model with Social Interactions


Description

simsar simulates continuous variables with social interactions (see Details). The model is presented in Lee (2004).


Usage

simsar(formula, contextual, Glist, theta, data)



Arguments

formula: an object of class formula: a symbolic description of the model. The formula should take a form such as y ~ x1 + x2 | x1 + x2, where y is the endogenous vector, the variables listed before the pipe (x1, x2) are the individual exogenous variables, and the variables listed after the pipe (x1, x2) are the contextual observable variables. Other possible formulas are y ~ x1 + x2 for the model without contextual effects, y ~ -1 + x1 + x2 | x1 + x2 for the model without an intercept, or y ~ x1 + x2 | x2 + x3 to let the contextual variables differ from the individual variables.


contextual: (optional) logical; if TRUE, all individual variables are also used as contextual variables. Setting formula to y ~ x1 + x2 with contextual = TRUE is equivalent to setting formula to y ~ x1 + x2 | x1 + x2.


Glist: the adjacency matrix, or a list of sub-adjacency matrices (one per sub-group).


theta: the parameter value as \theta = (\lambda, \beta, \gamma, \sigma). The parameter \gamma should be omitted if the model does not contain contextual effects (see Details).


data: an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which simsar is called.
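As a rough illustration of how theta is laid out, the sketch below assembles and then splits a parameter vector for a model with an intercept, two individual variables, and two contextual variables. The helper split_theta and its arguments k_beta and k_gamma are hypothetical names used only here; they are not part of CDatanet. The sketch also assumes that the intercept is the first element of \beta, as the parameter lengths in the example at the end of this page suggest.

```r
# Hypothetical helper (not part of CDatanet): split theta into its blocks,
# assuming the order (lambda, beta, gamma, sigma).
split_theta <- function(theta, k_beta, k_gamma = 0) {
  stopifnot(length(theta) == 1 + k_beta + k_gamma + 1)
  list(
    lambda = theta[1],
    beta   = theta[1 + seq_len(k_beta)],
    gamma  = theta[1 + k_beta + seq_len(k_gamma)],
    sigma  = theta[length(theta)]
  )
}

# Intercept plus two individual variables, and two contextual variables
theta    <- c(0.4, 2, -1.9, 0.8, 1.5, -1.2, 1.5)
parts    <- split_theta(theta, k_beta = 3, k_gamma = 2)

# Without contextual effects, gamma is simply left out of theta
theta_nc <- c(0.4, 2, -1.9, 0.8, 1.5)
parts_nc <- split_theta(theta_nc, k_beta = 3)
```

With contextual effects the vector has 1 + 3 + 2 + 1 = 7 elements; without them it has 5.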


Details

The variable \mathbf{y} is given for all i as

y_i = \lambda \mathbf{g}_i \mathbf{y} + \mathbf{x}_i'\beta + \mathbf{g}_i\mathbf{X}\gamma + \epsilon_i,

where \mathbf{g}_i is the i-th row of the adjacency matrix and \epsilon_i \sim N(0, \sigma^2).
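Stacking the equation over all i gives \mathbf{y} = \lambda \mathbf{G}\mathbf{y} + \mathbf{X}\beta + \mathbf{G}\mathbf{X}\gamma + \epsilon, so that, whenever \mathbf{I} - \lambda \mathbf{G} is invertible, \mathbf{y} = (\mathbf{I} - \lambda \mathbf{G})^{-1}(\mathbf{X}\beta + \mathbf{G}\mathbf{X}\gamma + \epsilon). The base R sketch below simulates one small group from this reduced form. It is an illustration under these assumptions, not the code used by simsar; the sample size, the random network, and the placement of the intercept as the first element of \beta are choices made only for this example.

```r
set.seed(123)
n      <- 50
lambda <- 0.4
beta   <- c(2, -1.9, 0.8)   # assumed order: intercept, x1, x2
gamma  <- c(1.5, -1.2)      # contextual effects of x1, x2
sigma  <- 1.5

# Exogenous variables
X <- cbind(x1 = rnorm(n, 1, 1), x2 = rexp(n, 0.4))

# Random directed network, row-normalized so that G %*% v averages over friends
G       <- matrix(rbinom(n * n, 1, 0.1), n, n)
diag(G) <- 0
rs      <- rowSums(G); rs[rs == 0] <- 1
G       <- G / rs

# Reduced form: y = (I - lambda * G)^{-1} (X beta + G X gamma + eps)
eps <- rnorm(n, 0, sigma)
rhs <- drop(cbind(1, X) %*% beta + G %*% X %*% gamma + eps)
y   <- solve(diag(n) - lambda * G, rhs)

# Average of y among friends, used to check the structural equation
Gy  <- drop(G %*% y)
```

Because G is row-normalized and |lambda| < 1, the matrix I - lambda * G is invertible, and y reproduces the structural equation: y equals lambda * Gy + rhs up to numerical error.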


Value

A list consisting of:


y: the simulated continuous variable.


Gy: the average of y among friends.
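To make "the average of y among friends" concrete, the following base R sketch computes it by hand for three individuals. It assumes a row-normalized adjacency matrix, as constructed in the example at the end of this page, and is not taken from the package's internals; all object names are illustrative.

```r
# Three individuals: 1 is friends with 2 and 3, 2 with 1 only, 3 has no friends
A <- matrix(c(0, 1, 1,
              1, 0, 0,
              0, 0, 0), nrow = 3, byrow = TRUE)

# Row-normalize, leaving rows without friends at zero
rs <- rowSums(A); rs[rs == 0] <- 1
G  <- A / rs

y     <- c(10, 4, 6)
avg_y <- drop(G %*% y)
avg_y
# 5 10 0
```

Individual 1 gets the mean of 4 and 6, individual 2 gets the value of individual 1, and individual 3, who has no friends, gets 0.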


References

Lee, L. F. (2004). Asymptotic distributions of quasi-maximum likelihood estimators for spatial autoregressive models. Econometrica, 72(6), 1899-1925, doi:10.1111/j.1468-0262.2004.00558.x.

See Also

sar, simsart, simcdnet.


Examples

# Group sizes
M      <- 5 # Number of sub-groups
nvec   <- round(runif(M, 100, 1000))
n      <- sum(nvec)

# Parameters
lambda <- 0.4
beta   <- c(2, -1.9, 0.8)
gamma  <- c(1.5, -1.2)
sigma  <- 1.5
theta  <- c(lambda, beta, gamma, sigma)

# X
X      <- cbind(rnorm(n, 1, 1), rexp(n, 0.4))

# Network
Glist  <- list()

for (m in 1:M) {
  nm           <- nvec[m]
  Gm           <- matrix(0, nm, nm)
  max_d        <- 30
  for (i in 1:nm) {
    # draw up to max_d friends for individual i
    tmp        <- sample((1:nm)[-i], sample(0:max_d, 1))
    Gm[i, tmp] <- 1
  }
  # row-normalize; rows without friends stay at zero
  rs           <- rowSums(Gm); rs[rs == 0] <- 1
  Gm           <- Gm/rs
  Glist[[m]]   <- Gm
}

# data
data    <- data.frame(x1 = X[,1], x2 =  X[,2])

rm(list = ls()[!(ls() %in% c("Glist", "data", "theta"))])

ytmp    <- simsar(formula = ~ x1 + x2 | x1 + x2, Glist = Glist,
                  theta = theta, data = data)
y       <- ytmp$y

# plot histogram
hist(y)

[Package CDatanet version 2.1.2 Index]