gof {NetMix} | R Documentation |
Posterior predictive checks using structural network characteristics
Description
The function generates a variety of plots that serve as posterior predictive checks on the goodness of fit of a fitted mmsbm object.
Usage
gof(x, ...)
## S3 method for class 'mmsbm'
gof(
x,
gof_stat = c("Geodesics", "Degree"),
level = 0.95,
samples = 50,
new.data.dyad = NULL,
new.data.monad = NULL,
seed = NULL,
...
)
Arguments
x | An object of class
... | Currently ignored.
gof_stat | Character vector. Accepts any subset from "Geodesics", "Degree", "Indegree", "Outdegree", "3-Motifs", "Dyad Shared Partners", "Edge Shared Partners", and "Incoming K-stars". See details.
level | Double. Level of credible interval for posterior predictive distribution around structural quantities of interest.
samples | Integer. Number of sampled networks from model's posterior predictive using
new.data.dyad | See
new.data.monad | See
seed | See
Details
Goodness of fit of network models has typically been established by evaluating how the structural characteristics of predicted networks compare to those of the observed network. When the model is estimated in a Bayesian framework, this approach is equivalent to conducting posterior predictive checks on these structural quantities of interest. When the new.data.dyad and/or new.data.monad passed differ from the data used in estimation, this is equivalent to conducting out-of-sample posterior predictive checks.
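The logic of such a check can be sketched in base R. This is a minimal illustration, not the NetMix implementation: a toy Bernoulli edge model with a plug-in edge probability stands in for draws from a fitted mmsbm, and the helper name indegree_dist is hypothetical.

```r
# Illustrative only: a toy Bernoulli network model stands in for a fitted mmsbm.
set.seed(1)
n <- 20

# "Observed" directed network without self-ties, drawn here just for illustration.
# adj[i, j] == 1 means a tie from node i to node j.
observed <- matrix(rbinom(n * n, 1, 0.15), n, n)
diag(observed) <- 0

# Hypothetical helper: number of nodes with each indegree value 0..(n - 1).
indegree_dist <- function(adj) {
  tabulate(colSums(adj) + 1, nbins = nrow(adj))
}

# Draw replicated networks from the toy model (assumption: plug-in edge
# probability, used here in place of genuine posterior predictive draws).
samples <- 50
p_hat <- sum(observed) / (n * (n - 1))
simulated <- replicate(samples, {
  net <- matrix(rbinom(n * n, 1, p_hat), n, n)
  diag(net) <- 0
  indegree_dist(net)
})

# Pointwise interval at the chosen level, one per indegree value.
level <- 0.95
alpha <- (1 - level) / 2
bounds <- apply(simulated, 1, quantile, probs = c(alpha, 1 - alpha))

# Compare the observed distribution against the simulated intervals.
obs_dist <- indegree_dist(observed)
covered <- obs_dist >= bounds[1, ] & obs_dist <= bounds[2, ]
mean(covered)  # share of indegree values whose observed count lies in the interval
```

Observed counts that fall outside the interval for many values of the statistic would suggest that the model fails to reproduce that structural feature.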
The set of structural features used to determine goodness of fit is somewhat arbitrary, chosen mostly to incorporate various first-order, second-order, and (to the extent possible) third-order characteristics of the network. "Geodesics" focuses on the distribution over observed and predicted geodesic distances between nodes; "Indegree" and "Outdegree" focus on the distributions over incoming and outgoing connections per node, respectively; "3-Motifs" focuses on the distribution over possible connectivity patterns among triads (i.e., the triadic census); "Dyad Shared Partners" focuses on the distribution over the number of partners shared by the two nodes in a dyad; "Edge Shared Partners" is defined similarly, but with respect to edges rather than dyads; and finally, "Incoming K-stars" focuses on the frequency distribution over stars with k = 1, ... spokes.
Obtaining samples of the last three structural features can be very computationally expensive, and is discouraged on networks with more than 50 nodes.
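To make the simpler features concrete, the following base R sketch computes indegrees, outdegrees, and geodesic distances for a small directed network. It is an illustration under stated assumptions rather than the code gof uses internally: the helper geodesic_dist is hypothetical and applies the Floyd-Warshall algorithm to an adjacency matrix.

```r
# Small directed network: adj[i, j] == 1 means a tie from node i to node j.
adj <- matrix(0, 4, 4)
adj[1, 2] <- 1
adj[2, 3] <- 1
adj[3, 1] <- 1
adj[3, 4] <- 1

outdegree <- rowSums(adj)  # outgoing ties per node
indegree  <- colSums(adj)  # incoming ties per node

# Hypothetical helper: all-pairs geodesic distances via Floyd-Warshall.
# Unreachable pairs keep a distance of Inf.
geodesic_dist <- function(adj) {
  n <- nrow(adj)
  d <- ifelse(adj > 0, 1, Inf)
  diag(d) <- 0
  for (k in seq_len(n)) {
    # Allow paths that pass through node k.
    d <- pmin(d, outer(d[, k], d[k, ], "+"))
  }
  d
}

geo <- geodesic_dist(adj)

# Frequency of each finite geodesic distance among ordered pairs of distinct nodes.
off_diag <- geo[row(geo) != col(geo)]
geo_table <- table(off_diag[is.finite(off_diag)])
geo_table
```

Tabulating these quantities for the observed network and for each simulated network yields the distributions that a goodness-of-fit plot compares.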
Value
A ggplot object.
Author(s)
Santiago Olivella (olivella@unc.edu), Adeline Lo (aylo@wisc.edu), Tyler Pratt (tyler.pratt@yale.edu), Kosuke Imai (imai@harvard.edu)
Examples
library(NetMix)
## Load datasets
data("lazega_dyadic")
data("lazega_monadic")
## Estimate model with 2 groups
lazega_mmsbm <- mmsbm(SocializeWith ~ Coworkers,
                      senderID = "Lawyer1",
                      receiverID = "Lawyer2",
                      nodeID = "Lawyer",
                      data.dyad = lazega_dyadic,
                      data.monad = lazega_monadic,
                      n.blocks = 2,
                      mmsbm.control = list(seed = 123,
                                           conv_tol = 1e-2,
                                           hessian = FALSE))
## Plot observed (red) and simulated (gray) distributions over indegrees
## (typically a larger number of samples would be taken)
## (strictly requires ggplot2)
gof(lazega_mmsbm, gof_stat = "Indegree", samples = 2)