cdnet {CDatanet}    R Documentation

Estimate Count Data Model with Social Interactions using NPL Method


Description

cdnet estimates peer effects on count data under rational expectations (see Details). The model is presented in Houndetoungan (2022).


Usage

cdnet(
  formula,
  contextual,
  Glist,
  Rbar = NULL,
  estim.rho = FALSE,
  starting = list(theta = NULL, deltabar = NULL, delta = NULL, rho = NULL),
  yb0 = NULL,
  optimizer = "fastlbfgs",
  npl.ctr = list(),
  opt.ctr = list(),
  cov = TRUE,
  data
)



Arguments

formula: an object of class formula: a symbolic description of the model. The formula should be of the form y ~ x1 + x2 | x1 + x2, where y is the endogenous vector, the variables listed before the pipe (x1, x2) are the individual exogenous variables, and the variables listed after the pipe (x1, x2) are the contextual observable variables. Other possible formulas are y ~ x1 + x2 for the model without contextual effects, y ~ -1 + x1 + x2 | x1 + x2 for the model without intercept, and y ~ x1 + x2 | x2 + x3 when the contextual variables differ from the individual variables.


contextual: (optional) logical; if TRUE, all individual variables are also used as contextual variables. Setting formula to y ~ x1 + x2 with contextual = TRUE is equivalent to setting formula to y ~ x1 + x2 | x1 + x2.


Glist: the adjacency matrix, or a list of sub-adjacency matrices.


Rbar: the value of \bar{R} (see Details). If not provided, it is set automatically to quantile(y, 0.9).


estim.rho: logical; indicates whether the parameter \rho should be estimated (TRUE) or set to zero (FALSE).


starting: (optional) list of starting values for \theta = (\lambda, \beta', \gamma')', \bar{\delta}, \delta = (\delta_2, ..., \delta_{\bar{R}}), and \rho. The parameter \gamma should be omitted if the model does not contain contextual effects (see Details).


yb0: (optional) expectation of y.


optimizer: either fastlbfgs (L-BFGS optimization method of the package RcppNumerical), nlm (referring to the function nlm), or optim (referring to the function optim). Other arguments of these functions, such as control and method, can be set through the argument opt.ctr.


npl.ctr: list of controls for the NPL method (see Details).


opt.ctr: list of arguments passed to optim_lbfgs of the package RcppNumerical, nlm, or optim (whichever solver is set in optimizer), such as maxit, eps_f, eps_g, control, method, ...


cov: a Boolean indicating whether the covariance should be computed.


data: an optional data frame, list, or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which cdnet is called.
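
To make the role of opt.ctr concrete, the following base-R sketch forwards a list of solver arguments to optim with do.call. The objective fn and the starting point are hypothetical, and the forwarding shown is only an assumption about how such a list might be used; this page does not describe how cdnet passes opt.ctr internally.

```r
# Toy objective (hypothetical): a quadratic with its minimum at (1, -2)
fn <- function(p) sum((p - c(1, -2))^2)

# Controls in the shape described for optimizer = "optim":
# `method` and `control` are genuine arguments of stats::optim
opt.ctr <- list(method = "BFGS",
                control = list(maxit = 200, reltol = 1e-10))

# Forward the list to the solver (assumed mechanism, for illustration only)
fit <- do.call(optim, c(list(par = c(0, 0), fn = fn), opt.ctr))

print(fit$par)          # estimated minimizer
print(fit$convergence)  # 0 indicates successful convergence in optim
```

The same list shape would not suit nlm or optim_lbfgs, whose argument names differ, so the contents of opt.ctr need to match the solver chosen in optimizer.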



Details

Following Houndetoungan (2022), the count data \mathbf{y} is generated from a latent variable \mathbf{y}^*. The latent variable is given for all i as

y_i^* = \lambda \mathbf{g}_i \mathbf{E}(\bar{\mathbf{y}}|\mathbf{X},\mathbf{G}) + \mathbf{x}_i'\beta + \mathbf{g}_i\mathbf{X}\gamma + \epsilon_i,

where \epsilon_i \sim N(0, 1).
Then, y_i = r iff a_r \leq y_i^* \leq a_{r+1}, where a_0 = -\infty, a_1 = 0, a_r = \sum_{k = 1}^r\delta_k. The parameters are subject to the constraints \delta_r \geq \lambda if 1 \leq r \leq \bar{R}, and \delta_r = (r - \bar{R})^{\rho}\bar{\delta} + \lambda if r \geq \bar{R} + 1. The unknown parameters to be estimated are \lambda, \beta, \gamma, \delta_2, ..., \delta_{\bar{R}}, \bar{\delta}, and \rho.
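
The mapping from the latent variable to counts can be sketched in base R as follows. The parameter values are those used in the Examples section; the helper cut_points is hypothetical and not part of the package. Two assumptions are made: a_1 = 0 is read as \delta_1 = 0, and \bar{R} is taken to be one more than the length of delta.

```r
# Parameter values taken from the Examples section of this page
lambda   <- 0.4
delta    <- c(1, 0.87, 0.75, 0.6)   # delta_2, ..., delta_Rbar
deltabar <- 0.05
rho      <- 0.5
Rbar     <- length(delta) + 1       # assumption: here Rbar = 5

# Hypothetical helper: cut points a_1, ..., a_rmax (a_0 = -Inf is implicit)
cut_points <- function(rmax) {
  r_tail <- seq_len(max(rmax - Rbar, 0)) + Rbar      # r = Rbar + 1, ..., rmax
  d_tail <- (r_tail - Rbar)^rho * deltabar + lambda  # delta_r beyond Rbar
  d_all  <- c(0, delta, d_tail)                      # assumes delta_1 = 0
  cumsum(d_all)[seq_len(rmax)]
}

a     <- cut_points(8)
ystar <- c(-1, 0.5, 1.9, 3.5)       # made-up latent values

# y = r when a_r <= ystar < a_{r+1}; values below a_1 = 0 give y = 0
y <- findInterval(ystar, a)
print(a)
print(y)
```

Because only the first rmax cut points are built, latent values above the last cut point are assigned rmax in this sketch; a larger rmax avoids that truncation.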


The model parameters are estimated using the Nested Partial Likelihood (NPL) method. This approach starts with a guess of \theta and \bar{y} and iteratively constructs a sequence of \theta and \bar{y}. The iteration is considered to have converged when the L_1 distance between two consecutive values of \theta and \bar{y} is less than a tolerance.
The argument npl.ctr is an optional list of controls for this procedure.
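
The stopping rule described above can be illustrated with a toy fixed-point iteration in base R. This is not the package's estimator: npl_step is a hypothetical stand-in for "re-estimate \theta, then recompute \bar{y}", chosen as a simple contraction, and the names tol and maxit are assumptions used only in this sketch.

```r
# Hypothetical one-step update (a contraction with fixed point
# theta = 2, ybar = 1), standing in for the real estimation step
npl_step <- function(theta, ybar) {
  theta_new <- 0.5 * theta + 1
  ybar_new  <- 0.5 * ybar + 0.25 * theta_new
  list(theta = theta_new, ybar = ybar_new)
}

# Iterate until the L1 distance between consecutive (theta, ybar)
# falls below tol, or until maxit iterations have been performed
npl_iterate <- function(theta, ybar, tol = 1e-8, maxit = 500) {
  for (it in seq_len(maxit)) {
    new   <- npl_step(theta, ybar)
    dist  <- sum(abs(new$theta - theta)) + sum(abs(new$ybar - ybar))
    theta <- new$theta
    ybar  <- new$ybar
    if (dist < tol) break
  }
  list(theta = theta, ybar = ybar, iterations = it, distance = dist)
}

res <- npl_iterate(theta = 0, ybar = 0)
print(res)
```

Summing the two absolute differences is one reading of "L_1 distance between two consecutive \theta and \bar{y}"; the package may scale or combine them differently.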


Value

A list consisting of:


list of general information about the model.


NPL estimator.


ybar (see details), expectation of y.


average of the expectation of y among friends.


list of covariance matrices.


step-by-step output as returned by the optimizer.


References

Houndetoungan, E. A. (2022). Count Data Models with Social Interactions under Rational Expectations. Available at SSRN 3721250, doi:10.2139/ssrn.3721250.

See Also

sart, sar, simcdnet.


Examples

# Group sizes
nvec   <- rep(100, 2)
M      <- length(nvec)
n      <- sum(nvec)

# Parameters
lambda <- 0.4
beta   <- c(1.5, 2.2, -0.9)
gamma  <- c(1.5, -1.2)
delta  <- c(1, 0.87, 0.75, 0.6)
delbar <- 0.05
rho    <- 0.5
theta  <- c(lambda, beta, gamma)

# X
X      <- cbind(rnorm(n, 1, 1), rexp(n, 0.4))

# Network
Glist  <- list()

for (m in 1:M) {
  nm           <- nvec[m]
  Gm           <- matrix(0, nm, nm)
  max_d        <- 30
  for (i in 1:nm) {
    tmp        <- sample((1:nm)[-i], sample(0:max_d, 1))
    Gm[i, tmp] <- 1
  }
  # row-normalize; rows without links keep a divisor of 1
  rs           <- rowSums(Gm); rs[rs == 0] <- 1
  Gm           <- Gm/rs
  Glist[[m]]   <- Gm
}

# data
data    <- data.frame(x1 = X[,1], x2 =  X[,2])

ytmp    <- simcdnet(formula = ~ x1 + x2 | x1 + x2, Glist = Glist, theta = theta,
                    deltabar = delbar, delta = delta, rho = rho, data = data)

y       <- ytmp$y

# plot histogram
hist(y, breaks = max(y))

data    <- data.frame(yt = y, x1 = data$x1, x2 = data$x2)
rm(list = ls()[!(ls() %in% c("Glist", "data"))])

out   <- cdnet(formula = yt ~ x1 + x2, contextual = TRUE, Glist = Glist, 
               data = data, Rbar = 5, estim.rho = TRUE, optimizer = "nlm")

[Package CDatanet version 2.1.2 Index]