
lologVariational {lolog}

R Documentation

Fits a latent ordered network model using Monte Carlo variational inference

Description

Fits a latent ordered network model using Monte Carlo variational inference

Usage

lologVariational(
  formula,
  nReplicates = 5L,
  dyadInclusionRate = NULL,
  edgeInclusionRate = NULL,
  targetFrameSize = 5e+05
)

Arguments

`formula`	A lolog formula. See `lolog`.
`nReplicates`	An integer controlling how many dyad orderings to sample.
`dyadInclusionRate`	Controls what proportion of non-edges in each ordering is retained; the rest are dropped.
`edgeInclusionRate`	Controls what proportion of edges in each ordering is retained; the rest are dropped.
`targetFrameSize`	Sets dyadInclusionRate so that the model frame for the logistic regression has, on average, this number of observations.

Details

This function approximates the maximum likelihood solution via variational inference on the graph (y) over the latent edge variable inclusion order (s). Specifically, it replaces the conditional probability p(s | y) by p(s). If the LOLOG model contains only dyad-independent terms, then these two probabilities are identical, and variational inference is exactly maximum likelihood inference. The objective function is

E_{p(s)}\bigg(\log p(y | s, \theta) \bigg)

The expectation can be approximated by averaging over samples drawn from p(s). The number of samples is controlled by the nReplicates parameter. The memory required is on the order of nReplicates * (number of dyads). For large networks this can be impractical, so adjusting dyadInclusionRate and edgeInclusionRate allows one to downsample the number of dyads in each replicate. By default, these are set to balance the number of edges and non-edges as closely as possible while targeting a model frame with targetFrameSize rows.
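Written out, the sampling approximation described above takes the following form. The notation is an assumption introduced here for illustration: R stands for nReplicates and s^{(r)} for the r-th sampled ordering.

```latex
E_{p(s)}\bigg(\log p(y | s, \theta)\bigg)
  \approx \frac{1}{R} \sum_{r=1}^{R} \log p(y | s^{(r)}, \theta),
  \qquad s^{(r)} \sim p(s)
```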

If the model is dyad independent, replicates are redundant, and so nReplicates is set to 1 with a note.
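The frame-size arithmetic from the preceding paragraphs can be sketched in base R. This is only a rough illustration: `frameSizeSketch` is a hypothetical helper, not part of lolog, and it assumes that targetFrameSize is split evenly across replicates and that each replicate aims for half edges and half non-edges. The rule the package actually applies may differ.

```r
# Hypothetical helper (NOT part of lolog): rough model-frame arithmetic.
# Assumption: targetFrameSize is shared evenly by the replicates, and each
# replicate aims for an equal number of edges and non-edges.
frameSizeSketch <- function(nNodes, nEdges, nReplicates = 5L,
                            targetFrameSize = 5e5, directed = FALSE) {
  # Number of dyads: n(n-1) ordered pairs, halved for undirected networks
  nDyads <- nNodes * (nNodes - 1)
  if (!directed) nDyads <- nDyads / 2
  nNonEdges <- nDyads - nEdges

  rowsPerReplicate <- targetFrameSize / nReplicates
  # Give edges up to half of the rows; non-edges fill the remainder
  edgeRate <- min(1, (rowsPerReplicate / 2) / nEdges)
  dyadRate <- min(1, (rowsPerReplicate - edgeRate * nEdges) / nNonEdges)

  list(
    nDyads = nDyads,
    fullRows = nReplicates * nDyads,  # rows without any downsampling
    edgeInclusionRate = edgeRate,
    dyadInclusionRate = dyadRate,
    expectedRows = nReplicates * (edgeRate * nEdges + dyadRate * nNonEdges)
  )
}

# An undirected network with 10,000 nodes and 50,000 edges
sizes <- frameSizeSketch(nNodes = 10000, nEdges = 50000)
sizes$fullRows      # rows needed if every dyad were kept in every replicate
sizes$expectedRows  # expected rows after downsampling
```

In this example, keeping every dyad would need about 250 million rows, which is why downsampling non-edges matters for large sparse networks.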

The functional form of the objective function is equivalent to logistic regression, and so the glm function is used to maximize it. The asymptotic covariance of the parameter estimates is calculated using the methods of Westling and McCormick (2015).
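The logistic-regression form of the objective can be illustrated without lolog. In the sketch below, `predictors` and `outcome` are simulated stand-ins for the change-statistic matrix and edge indicators; the column names, sample size, and `trueTheta` values are all made up for illustration, and the covariance correction of Westling and McCormick (2015) is not reproduced.

```r
# Simulated stand-ins for the model frame; these are NOT produced by lolog.
set.seed(1)
nRows <- 2000
predictors <- cbind(edges = 1, nodeMatch = rbinom(nRows, 1, 0.3))
trueTheta <- c(edges = -2, nodeMatch = 1)
outcome <- rbinom(nRows, 1, plogis(drop(predictors %*% trueTheta)))

# Bernoulli log-likelihood of the outcomes given the change statistics
bernoulliLogLik <- function(theta) {
  eta <- drop(predictors %*% theta)
  sum(outcome * eta - log1p(exp(eta)))
}

# glm maximizes this log-likelihood. "- 1" drops the automatic intercept
# because the constant edges column already plays that role.
glmFit <- glm(outcome ~ predictors - 1, family = binomial())
thetaHat <- coef(glmFit)

# The fitted coefficients score at least as well as the generating values
bernoulliLogLik(thetaHat) >= bernoulliLogLik(trueTheta)
```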

Value

An object of class c('lologVariationalFit','lolog','list') consisting of the following items:

`formula`	The model formula
`method`	"variational"
`theta`	The fit parameter values
`vcov`	The asymptotic covariance matrix for the parameter values.
`nReplicates`	The number of replicates
`dyadInclusionRate`	The rate at which non-edges are included
`edgeInclusionRate`	The rate at which edges are included
`allDyadIndependent`	Logical indicating model dyad independence
`likelihoodModel`	An object of class *LatentOrderLikelihood at the fit parameters
`outcome`	The outcome vector for the logistic regression
`predictors`	The change statistic predictor matrix for the logistic regression
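As a sketch of how the `theta` and `vcov` items listed above fit together, the hypothetical helper below (`waldTable`, not part of lolog) builds Wald-style standard errors and z statistics from any list carrying those two items. The `toyFit` object uses made-up numbers purely to exercise the helper; with a real fit, pass the object returned by lologVariational instead.

```r
# Hypothetical helper (NOT part of lolog): Wald summary from theta and vcov.
waldTable <- function(fit) {
  se <- sqrt(diag(fit$vcov))  # standard errors from the covariance diagonal
  z <- fit$theta / se
  data.frame(
    estimate = fit$theta,
    se = se,
    z = z,
    pValue = 2 * pnorm(-abs(z)),  # two-sided normal p-value
    row.names = names(fit$theta)
  )
}

# Toy stand-in with made-up numbers, used only to exercise the helper.
toyFit <- list(
  theta = c(edges = -2, nodeMatch = 1),
  vcov = diag(c(0.04, 0.01))
)
toyTable <- waldTable(toyFit)
print(toyTable)
```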

References

Westling, T., & McCormick, T. H. (2015). Beyond prediction: A framework for inference with variational approximations in mixture models. arXiv preprint arXiv:1510.08151.

Examples

library(lolog)
library(network)
data(ukFaculty)

# Delete vertices missing group
delete.vertices(ukFaculty, which(is.na(ukFaculty %v% "Group")))

fit <- lologVariational(ukFaculty ~ edges() + nodeMatch("GroupC"),
                       nReplicates=1L, dyadInclusionRate=1)
summary(fit)

[Package lolog version 1.3.1 Index]