fitNSBM {noisySBM} R Documentation

VEM algorithm to fit the noisy stochastic block model to an observed dense adjacency matrix

Description

fitNSBM() estimates the parameters of the noisy stochastic block model and provides a clustering of the nodes.

Usage

fitNSBM(
  dataMatrix,
  model = "Gauss0",
  sbmSize = list(Qmin = 1, Qmax = NULL, explor = 1.5),
  filename = NULL,
  initParam = list(nbOfTau = NULL, nbOfPointsPerTau = NULL, maxNbOfPasses = NULL,
    minNbOfPasses = 1),
  nbCores = parallel::detectCores()
)

Arguments

dataMatrix

observed dense adjacency matrix

model

Implemented models:

Gauss

all Gaussian parameters of the null and the alternative distributions are unknown; this is the Gaussian model with the maximum number of unknown parameters

Gauss0

compared to Gauss, the mean of the null distribution is set to 0

Gauss01

compared to Gauss, the null distribution is set to N(0,1)

GaussEqVar

compared to Gauss, all Gaussian variances (of both the null and the alternative) are assumed to be equal but unknown

Gauss0EqVar

compared to GaussEqVar, the mean of the null distribution is set to 0

Gauss0Var1

compared to Gauss, all Gaussian variances are set to 1 and the null distribution is set to N(0,1)

Gauss2distr

the alternative distribution is a single Gaussian distribution, i.e. the block memberships of the nodes do not influence the alternative distribution

GaussAffil

compared to Gauss, the alternative uses one distribution for inter-group interactions and another for intra-group interactions

Exp

the null and the alternatives are all exponential distributions (i.e. Gamma distributions with shape parameter equal to one) with unknown scale parameters

ExpGamma

the null distribution is an exponential distribution with unknown parameter; the alternative distributions are Gamma distributions with unknown parameters

sbmSize

list of parameters determining the size of the SBM (the number of latent blocks) to be explored

Qmin

minimum number of latent blocks

Qmax

maximum number of latent blocks

explor

if Qmax is not provided, then Qmax is automatically determined as explor times the number of blocks where the ICL is maximal

filename

results are saved in a file with this name (if provided)

initParam

list of parameters that fix the number of initializations

nbOfTau

number of initial points for the node clustering (i.e. for the variational parameters tau)

nbOfPointsPerTau

number of initial points of the latent binary graph

maxNbOfPasses

maximum number of passes through the SBM models, that is, passes from Qmin to Qmax or in the reverse direction

minNbOfPasses

minimum number of passes through the SBM models

nbCores

number of cores used for parallelization
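
The rule described for explor can be sketched as follows. This is only an illustration with made-up ICL values; the rounding via ceiling() and the variable names bestQ and QmaxAuto are assumptions, not taken from the package source.

```r
# Toy ICL values for Q = 1, ..., 4 (made-up numbers, for illustration only)
icl <- c(-520.3, -481.7, -490.2, -502.9)
explor <- 1.5

# Number of blocks at which the ICL is currently maximal
bestQ <- which.max(icl)

# Assumed rule: explore up to explor times bestQ, rounded up
# (the exact rounding used by fitNSBM() is not stated in this page)
QmaxAuto <- ceiling(explor * bestQ)
QmaxAuto
```

With a larger explor the search continues further beyond the current ICL maximum, at the price of fitting more models.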

Details

fitNSBM() supports several probability distributions for the edges (see the model argument) and can estimate the number of node blocks.

Value

Returns a list of estimation results for all numbers of latent blocks considered by the algorithm. Every element is a list composed of:

theta

estimated parameters of the noisy stochastic block model; a list with the following elements:

pi

parameter estimate of pi

w

parameter estimate of w

nu0

parameter estimate of nu0

nu

parameter estimate of nu

clustering

node clustering obtained by the noisy stochastic block model; more precisely, a hard clustering given by the maximum a posteriori estimate based on the variational parameters sbmParam$clusterProba

sbmParam

further results concerning the latent binary stochastic block model. A list with the following elements:

Q

number of latent blocks in the noisy stochastic block model

clusterProba

soft clustering given by the conditional probabilities that a node belongs to a given latent block. In other words, these are the variational parameters tau; (Q x n)-matrix

edgeProba

conditional probabilities rho of an edge given the block memberships of the interacting nodes; (N_Q x N)-matrix

ICL

value of the ICL criterion at the end of the algorithm

convergence

a list of convergence indicators:

J

value of the lower bound of the log-likelihood function at the end of the algorithm

complLogLik

value of the complete log-likelihood function at the end of the algorithm

converged

indicates whether the algorithm has converged

nbIter

number of iterations performed
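
A minimal sketch of how the returned list might be post-processed, assuming the nesting suggested by the listing above (ICL and clusterProba stored inside sbmParam). The object below is a hand-built mock with made-up values, and makeFit is a hypothetical helper used only to build it; nothing here calls the package.

```r
# Hypothetical helper that builds a mock element with the documented fields
makeFit <- function(Q, icl, tau) {
  list(
    clustering = apply(tau, 2, which.max),
    sbmParam = list(Q = Q, clusterProba = tau, ICL = icl)
  )
}

# Made-up soft clusterings: one column per node, one row per block
tau1 <- matrix(1, nrow = 1, ncol = 4)
tau2 <- matrix(c(0.9, 0.1,
                 0.2, 0.8,
                 0.7, 0.3,
                 0.1, 0.9), nrow = 2)

# Mock result list: one element per number of latent blocks
res <- list(makeFit(1, -60.2, tau1), makeFit(2, -48.5, tau2))

# Pick the element with the largest ICL
icl <- sapply(res, function(el) el$sbmParam$ICL)
best <- res[[which.max(icl)]]
best$sbmParam$Q

# Hard clustering: most probable block for each node (column-wise argmax)
hard <- apply(best$sbmParam$clusterProba, 2, which.max)
hard
```

The same sapply() pattern can be used to extract other fields, for example the convergence indicators, across all fitted models.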

Examples

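library(noisySBM)

# simulate a small noisy SBM with 2 latent blocks and Gaussian edge distributions
n <- 10
theta <- list(pi = c(0.5, 0.5), nu0 = c(0, 0.1),
              nu = matrix(c(-2, 10, -2, 1, 1, 1), 3, 2), w = c(0.5, 0.9, 0.3))
obs <- rnsbm(n, theta, modelFamily = "Gauss")

# fit models with up to 3 latent blocks, using a single initialization and one core
res <- fitNSBM(obs$dataMatrix, sbmSize = list(Qmax = 3),
               initParam = list(nbOfTau = 1, nbOfPointsPerTau = 1), nbCores = 1)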
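
The dimensions of theta in the example above can be checked with a short sketch. It assumes an undirected graph, so that Q latent blocks give N_Q = Q(Q+1)/2 block pairs; this matches the example (Q = 2, three entries in w, three rows in nu) but is an inference from it, not a statement from the package. The names Q and NQ are introduced here for illustration.

```r
# Number of latent blocks used in the example
Q <- 2

# Assumed number of unordered block pairs for an undirected graph
NQ <- Q * (Q + 1) / 2

# Same parameter values as in the example
theta <- list(pi = c(0.5, 0.5), nu0 = c(0, 0.1),
              nu = matrix(c(-2, 10, -2, 1, 1, 1), NQ, 2),
              w = c(0.5, 0.9, 0.3))

# Consistency checks: one pi entry per block, one w entry and one nu row per block pair
stopifnot(length(theta$pi) == Q,
          abs(sum(theta$pi) - 1) < 1e-12,
          length(theta$w) == NQ,
          nrow(theta$nu) == NQ)
```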

[Package noisySBM version 0.1.4 Index]