
homophily.fe {CDatanet}

R Documentation

Estimating network formation models with degree heterogeneity: the fixed effect approach

Description

homophily.fe implements a logit estimator for network formation models with homophily. The model accounts for degree heterogeneity using fixed effects (see details).

Usage

homophily.fe(
  network,
  formula,
  data,
  symmetry = FALSE,
  fe.way = 1,
  init = NULL,
  opt.ctr = list(maxit = 10000, eps_f = 1e-09, eps_g = 1e-09),
  print = TRUE
)

Arguments

`network`	a matrix or list of sub-matrices of social interactions containing 0 and 1, where links are represented by 1.
`formula`	an object of class formula: a symbolic description of the model. The `formula` should be of the form `~ x1 + x2`, for example, where `x1` and `x2` are explanatory variables of link formation. If missing, the model is estimated with fixed effects only.
`data`	an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from `environment(formula)`, typically the environment from which `homophily.fe` is called.
`symmetry`	indicates whether the network model is symmetric (see details).
`fe.way`	indicates whether it is a one-way or two-way fixed effect model. The expected value is 1 or 2 (see details).
`init`	(optional) either a list of starting values containing `beta`, a K-dimensional vector of the explanatory variable parameters, `mu`, an n-dimensional vector, and `nu`, an n-dimensional vector, where K is the number of explanatory variables and n is the number of individuals; or a vector of starting values for `c(beta, mu, nu)`.
`opt.ctr`	(optional) a list of `maxit`, `eps_f`, and `eps_g`, which are control parameters used by the solver `optim_lbfgs` of the package RcppNumerical.
`print`	Boolean indicating if the estimation progression should be printed.
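
As an illustration of the two accepted shapes of `init` and of the `opt.ctr` list, the following is a minimal base R sketch. The sizes `K` and `n` and the tolerance values are hypothetical; taking `n` as the total number of individuals across all sub-networks is an assumption, and the call to `homophily.fe` is left as a comment because the objects `G`, `x1` and `x2` are not defined here.

```r
# Hypothetical sizes, for illustration only
K <- 2    # number of explanatory variables
n <- 10   # number of individuals (assumed: total over all sub-networks)

# Option 1: a named list with beta, mu and nu
init_list <- list(beta = rep(0, K), mu = rep(0, n), nu = rep(0, n))

# Option 2: a single vector stacked in the order c(beta, mu, nu)
init_vec <- c(init_list$beta, init_list$mu, init_list$nu)

# Solver controls, using the names shown in the Usage section
# (the values here are arbitrary, not recommendations)
ctrl <- list(maxit = 5000, eps_f = 1e-8, eps_g = 1e-8)

# These objects would then be passed along the lines of:
# homophily.fe(network = G, formula = ~ x1 + x2, fe.way = 2,
#              init = init_list, opt.ctr = ctrl)
```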

Details

Let p_{ij} be the probability that a link goes from individual i to individual j. For two-way fixed effect models (fe.way = 2), this probability is specified as

p_{ij} = F(\mathbf{x}_{ij}'\beta + \mu_i + \nu_j)

where F is the cumulative distribution function of the standard logistic distribution. Unobserved degree heterogeneity is captured by \mu_i and \nu_j, which are treated as fixed effects (see homophily.re for random effect models). As shown by Yan et al. (2019), the estimator of the parameter \beta is biased. A bias correction is therefore necessary, but it is not implemented in this version. However, the estimators of \mu_i and \nu_j are consistent.
For one-way fixed effect models (fe.way = 1), \nu_j = \mu_j. For symmetric models, the network is undirected and the fixed effects must be one-way.
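
The specification above can be sketched in base R to show how the link probabilities are built from dyadic covariates and fixed effects. All values below (`x1`, `x2`, `beta`, `mu`, `nu`) are hypothetical toy inputs, and the code illustrates the formula only; it is not the package's internal implementation.

```r
# Hypothetical toy values, chosen only for illustration
beta <- c(0.1, -0.1)
x1   <- c(0.5, -1.2, 0.3, 2.0)      # continuous characteristic
x2   <- c(1, 0, 1, 0)               # binary characteristic
mu   <- c(-0.5, 0.0, 0.2, -0.1)     # sender effects mu_i
nu   <- c(0.1, -0.3, 0.0, 0.4)      # receiver effects nu_j

# Dyadic covariates x_ij: absolute difference and same-group indicator
Z1 <- abs(outer(x1, x1, "-"))
Z2 <- 1 * outer(x2, x2, "==")

# Two-way model: p_ij = F(x_ij' beta + mu_i + nu_j)
# outer(mu, nu, "+")[i, j] equals mu[i] + nu[j]
p2 <- plogis(beta[1] * Z1 + beta[2] * Z2 + outer(mu, nu, "+"))
diag(p2) <- NA                      # self-links are not modelled

# One-way model: nu_j = mu_j, so the index is symmetric in (i, j)
p1 <- plogis(beta[1] * Z1 + beta[2] * Z2 + outer(mu, mu, "+"))
diag(p1) <- NA
```

With symmetric covariates such as these, the one-way probabilities satisfy p_{ij} = p_{ji}, whereas the two-way probabilities generally do not.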

Value

A list consisting of:

`model.info`	list of model information, such as the type of fixed effects, whether the model is symmetric, number of observations, etc.
`estimate`	maximizer of the log-likelihood.
`loglike`	maximized log-likelihood.
`optim`	returned value of the optimization solver, which contains details of the optimization. The solver used is `optim_lbfgs` of the package RcppNumerical.
`init`	list of the starting values used.
`loglike(init)`	log-likelihood at the starting value.
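
To make concrete what a logit log-likelihood for a directed network looks like, here is a minimal base R sketch. The helper `loglik_logit` is hypothetical and is not part of CDatanet; it assumes that diagonal dyads (self-links) are excluded, and the package's own computation may differ in details.

```r
# Hypothetical helper: logit log-likelihood of a directed network.
# G:     n x n adjacency matrix of 0/1 entries
# index: n x n matrix holding x_ij' beta + mu_i + nu_j
loglik_logit <- function(G, index) {
  off <- row(G) != col(G)                  # keep off-diagonal dyads only
  p   <- plogis(index[off])                # link probabilities p_ij
  sum(G[off] * log(p) + (1 - G[off]) * log(1 - p))
}

# Hypothetical toy inputs: fixed effects only, no covariates
set.seed(1)
n     <- 5
mu    <- rnorm(n, -0.5, 0.2)
nu    <- rnorm(n, -0.2, 0.2)
index <- outer(mu, nu, "+")
G     <- 1 * (index + matrix(rlogis(n^2), n, n) > 0)
diag(G) <- 0

ll <- loglik_logit(G, index)
```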

References

Yan, T., Jiang, B., Fienberg, S. E., & Leng, C. (2019). Statistical inference in a directed network model with covariates. Journal of the American Statistical Association, 114(526), 857-868, doi:10.1080/01621459.2018.1448829.

Examples


set.seed(1234)
M            <- 2 # Number of sub-groups
nvec         <- round(runif(M, 20, 50))
beta         <- c(.1, -.1)
Glist        <- list()
dX           <- matrix(0, 0, 2)
mu           <- list()
nu           <- list()
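Emunu        <- runif(M, -1.5, 0) # expectation of mu + nu
smu2         <- 0.2 # passed to rnorm() below as the standard deviation of mu
snu2         <- 0.2 # passed to rnorm() below as the standard deviation of nu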
for (m in 1:M) {
  n          <- nvec[m]
  mum        <- rnorm(n, 0.7*Emunu[m], smu2)
  num        <- rnorm(n, 0.3*Emunu[m], snu2)
  X1         <- rnorm(n, 0, 1)
  X2         <- rbinom(n, 1, 0.2)
  Z1         <- matrix(0, n, n)  
  Z2         <- matrix(0, n, n)
  
  for (i in 1:n) {
    for (j in 1:n) {
      Z1[i, j] <- abs(X1[i] - X1[j])
      Z2[i, j] <- 1*(X2[i] == X2[j])
    }
  }
  
  Gm           <- 1*((Z1*beta[1] + Z2*beta[2] +
                       kronecker(mum, t(num), "+") + rlogis(n^2)) > 0)
  diag(Gm)     <- 0
  diag(Z1)     <- NA
  diag(Z2)     <- NA
  Z1           <- Z1[!is.na(Z1)]
  Z2           <- Z2[!is.na(Z2)]
  
  dX           <- rbind(dX, cbind(Z1, Z2))
  Glist[[m]]   <- Gm
  mu[[m]]      <- mum
  nu[[m]]      <- num
}

mu  <- unlist(mu)
nu  <- unlist(nu)

out   <- homophily.fe(network =  Glist, formula = ~ -1 + dX, fe.way = 2)
muhat <- out$estimate$mu
nuhat <- out$estimate$nu
plot(mu, muhat)
plot(nu, nuhat)

[Package CDatanet version 2.2.0 Index]