cdnet {CDatanet} | R Documentation |
Estimating count data models with social interactions under rational expectations using the NPL method
Description
cdnet
estimates count data models with social interactions under rational expectations using the NPL algorithm (see Houndetoungan, 2024).
Usage
cdnet(
formula,
Glist,
group,
Rmax,
Rbar,
starting = list(lambda = NULL, Gamma = NULL, delta = NULL),
Ey0 = NULL,
ubslambda = 1L,
optimizer = "fastlbfgs",
npl.ctr = list(),
opt.ctr = list(),
cov = TRUE,
data
)
Arguments
formula |
a class object formula: a symbolic description of the model. |
Glist |
adjacency matrix. For networks consisting of multiple subnets, |
group |
the vector indicating the individual groups. The default assumes a common group. For 2 groups; that is, |
Rmax |
an integer indicating the theoretical upper bound of |
Rbar |
an |
starting |
(optional) a starting value for |
Ey0 |
(optional) a starting value for |
ubslambda |
a positive value indicating the upper bound of |
optimizer |
is either |
npl.ctr |
a list of controls for the NPL method (see details). |
opt.ctr |
a list of arguments to be passed in |
cov |
a Boolean indicating if the covariance should be computed. |
data |
an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables
in the model. If not found in data, the variables are taken from |
Details
Model
The count variable y_i
take the value r
with probability.
P_{ir} = F(\sum_{s = 1}^S \lambda_s \bar{y}_i^{e,s} + \mathbf{z}_i'\Gamma - a_{h(i),r}) - F(\sum_{s = 1}^S \lambda_s \bar{y}_i^{e,s} + \mathbf{z}_i'\Gamma - a_{h(i),r + 1}).
In this equation, \mathbf{z}_i
is a vector of control variables; F
is the distribution function of the standard normal distribution;
\bar{y}_i^{e,s}
is the average of E(y)
among peers using the s
-th network definition;
a_{h(i),r}
is the r
-th cut-point in the cost group h(i)
.
The following identification conditions have been introduced: \sum_{s = 1}^S \lambda_s > 0
, a_{h(i),0} = -\infty
, a_{h(i),1} = 0
, and
a_{h(i),r} = \infty
for any r \geq R_{\text{max}} + 1
. The last condition implies that P_{ir} = 0
for any r \geq R_{\text{max}} + 1
.
For any r \geq 1
, the distance between two cut-points is a_{h(i),r+1} - a_{h(i),r} = \delta_{h(i),r} + \sum_{s = 1}^S \lambda_s
As the number of cut-point can be large, a quadratic cost function is considered for r \geq \bar{R}_{h(i)}
, where \bar{R} = (\bar{R}_{1}, ..., \bar{R}_{L})
.
With the semi-parametric cost-function,
a_{h(i),r + 1} - a_{h(i),r}= \bar{\delta}_{h(i)} + \sum_{s = 1}^S \lambda_s
.
The model parameters are: \lambda = (\lambda_1, ..., \lambda_S)'
, \Gamma
, and \delta = (\delta_1', ..., \delta_L')'
,
where \delta_l = (\delta_{l,2}, ..., \delta_{l,\bar{R}_l}, \bar{\delta}_l)'
for l = 1, ..., L
.
The number of single parameters in \delta_l
depends on R_{\text{max}}
and \bar{R}_{l}
. The components \delta_{l,2}, ..., \delta_{l,\bar{R}_l}
or/and
\bar{\delta}_l
must be removed in certain cases.
If R_{\text{max}} = \bar{R}_{l} \geq 2
, then \delta_l = (\delta_{l,2}, ..., \delta_{l,\bar{R}_l})'
.
If R_{\text{max}} = \bar{R}_{l} = 1
(binary models), then \delta_l
must be empty.
If R_{\text{max}} > \bar{R}_{l} = 1
, then \delta_l = \bar{\delta}_l
.
npl.ctr
The model parameters are estimated using the Nested Partial Likelihood (NPL) method. This approach
starts with a guess of \theta
and E(y)
and constructs iteratively a sequence
of \theta
and E(y)
. The solution converges when the \ell_1
-distance
between two consecutive \theta
and E(y)
is less than a tolerance.
The argument npl.ctr
must include
- tol
the tolerance of the NPL algorithm (default 1e-4),
- maxit
the maximal number of iterations allowed (default 500),
a boolean indicating if the estimate should be printed at each step.
- S
the number of simulations performed use to compute integral in the covariance by important sampling.
Value
A list consisting of:
info |
a list of general information about the model. |
estimate |
the NPL estimator. |
Ey |
|
GEy |
the average of |
cov |
a list including (if |
details |
step-by-step output as returned by the optimizer. |
References
Houndetoungan, E. A. (2024). Count Data Models with Social Interactions under Rational Expectations. Available at SSRN 3721250, doi:10.2139/ssrn.3721250.
See Also
Examples
set.seed(123)
M <- 5 # Number of sub-groups
nvec <- round(runif(M, 100, 200))
n <- sum(nvec)
# Adjacency matrix
A <- list()
for (m in 1:M) {
nm <- nvec[m]
Am <- matrix(0, nm, nm)
max_d <- 30 #maximum number of friends
for (i in 1:nm) {
tmp <- sample((1:nm)[-i], sample(0:max_d, 1))
Am[i, tmp] <- 1
}
A[[m]] <- Am
}
Anorm <- norm.network(A) #Row-normalization
# X
X <- cbind(rnorm(n, 1, 3), rexp(n, 0.4))
# Two group:
group <- 1*(X[,1] > 0.95)
# Networks
# length(group) = 2 and unique(sort(group)) = c(0, 1)
# The networks must be defined as to capture:
# peer effects of `0` on `0`, peer effects of `1` on `0`
# peer effects of `0` on `1`, and peer effects of `1` on `1`
G <- list()
cums <- c(0, cumsum(nvec))
for (m in 1:M) {
tp <- group[(cums[m] + 1):(cums[m + 1])]
Am <- A[[m]]
G[[m]] <- norm.network(list(Am * ((1 - tp) %*% t(1 - tp)),
Am * ((1 - tp) %*% t(tp)),
Am * (tp %*% t(1 - tp)),
Am * (tp %*% t(tp))))
}
# Parameters
lambda <- c(0.2, 0.3, -0.15, 0.25)
Gamma <- c(4.5, 2.2, -0.9, 1.5, -1.2)
delta <- rep(c(2.6, 1.47, 0.85, 0.7, 0.5), 2)
# Data
data <- data.frame(X, peer.avg(Anorm, cbind(x1 = X[,1], x2 = X[,2])))
colnames(data) = c("x1", "x2", "gx1", "gx2")
ytmp <- simcdnet(formula = ~ x1 + x2 + gx1 + gx2, Glist = G, Rbar = rep(5, 2),
lambda = lambda, Gamma = Gamma, delta = delta, group = group,
data = data)
y <- ytmp$y
hist(y, breaks = max(y) + 1)
table(y)
# Estimation
est <- cdnet(formula = y ~ x1 + x2 + gx1 + gx2, Glist = G, Rbar = rep(5, 2), group = group,
optimizer = "fastlbfgs", data = data,
opt.ctr = list(maxit = 5e3, eps_f = 1e-11, eps_g = 1e-11))
summary(est)