directed_dcsbm {fastRG} | R Documentation |
Create a directed degree corrected stochastic blockmodel object
Description
To specify a degree-corrected stochastic blockmodel, you must specify
the degree-heterogeneity parameters (via n or theta_out and theta_in),
the mixing matrix (via k_out and k_in, or B), and the relative block
probabilities (optional, via pi_out and pi_in).
We provide defaults for most of these options to enable rapid
exploration, or you can invest the effort for more control over the
model parameters. We strongly recommend setting the
expected_out_degree, expected_in_degree, or expected_density argument
to avoid large memory allocations associated with sampling large,
dense graphs.
Usage
directed_dcsbm(
n = NULL,
theta_out = NULL,
theta_in = NULL,
k_out = NULL,
k_in = NULL,
B = NULL,
...,
pi_out = rep(1/k_out, k_out),
pi_in = rep(1/k_in, k_in),
sort_nodes = TRUE,
force_identifiability = TRUE,
poisson_edges = TRUE,
allow_self_loops = TRUE
)
Arguments
n |
(degree heterogeneity) The number of nodes in the blockmodel.
Use when you don't want to specify the degree-heterogeneity
parameters |
theta_out |
(degree heterogeneity) A numeric vector
explicitly specifying the degree heterogeneity
parameters. This implicitly determines the number of nodes
in the resulting graph, i.e. it will have |
theta_in |
(degree heterogeneity) A numeric vector
explicitly specifying the degree heterogeneity
parameters. This implicitly determines the number of nodes
in the resulting graph, i.e. it will have |
k_out |
(mixing matrix) The number of outgoing blocks in the blockmodel.
Use when you don't want to specify the mixing-matrix by hand.
When |
k_in |
(mixing matrix) The number of incoming blocks in the blockmodel.
Use when you don't want to specify the mixing-matrix by hand.
When |
B |
(mixing matrix) A |
... |
Arguments passed on to
|
pi_out |
(relative block probabilities) Relative block
probabilities. Must be positive, but do not need to sum
to one, as they will be normalized internally.
Must match the rows of |
pi_in |
(relative block probabilities) Relative block
probabilities. Must be positive, but do not need to sum
to one, as they will be normalized internally.
Must match the columns of |
sort_nodes |
Logical indicating whether or not to sort the nodes
so that they are grouped by block. Useful for plotting.
Defaults to |
force_identifiability |
Logical indicating whether or not to
normalize |
poisson_edges |
Logical indicating whether or not
multiple edges are allowed to form between a pair of
nodes. Defaults to |
allow_self_loops |
Logical indicating whether or not
nodes should be allowed to form edges with themselves.
Defaults to |
Value
A directed_dcsbm S3 object, a subclass of directed_factor_model(), with
the following additional fields:
- theta_out: A numeric vector of outgoing community degree-heterogeneity parameters.
- theta_in: A numeric vector of incoming community degree-heterogeneity parameters.
- z_out: The outgoing community memberships of each node, as a factor(). The factor will have k_out levels, where k_out is the number of outgoing communities in the stochastic blockmodel. Some communities may contain no observed nodes.
- z_in: The incoming community memberships of each node, as a factor(). The factor will have k_in levels, where k_in is the number of incoming communities in the stochastic blockmodel. Some communities may contain no observed nodes.
- pi_out: Sampling probabilities for each outgoing community.
- pi_in: Sampling probabilities for each incoming community.
- sorted: Logical indicating whether nodes are arranged by block (and additionally by degree heterogeneity parameter within each block).
Generative Model
There are two levels of randomness in a directed degree-corrected
stochastic blockmodel. First, we randomly choose an incoming
block membership and an outgoing block membership
for each node in the blockmodel. This is
handled by directed_dcsbm(). Then, given these block memberships,
we randomly sample edges between nodes. This second
operation is handled by sample_edgelist(), sample_sparse(),
sample_igraph() and sample_tidygraph(), depending on your desired
graph representation.
Block memberships
Let x represent the outgoing block membership of a node
and y represent the incoming block membership of a node.
To generate x we sample from a categorical
distribution with parameter \pi_{out}.
To generate y we sample from a categorical
distribution with parameter \pi_{in}.
Block memberships are independent across nodes. Incoming and outgoing
block memberships of the same node are also independent.
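As a minimal sketch of this step in base R (not the package's internal code; the values of n, pi_out and pi_in below are illustrative assumptions), the two membership vectors can be drawn independently with sample():

```r
set.seed(1)

n <- 10                       # number of nodes (illustrative)
pi_out <- c(0.5, 0.3, 0.2)    # assumed block probabilities, so k_out = 3
pi_in <- c(1, 1)              # unnormalized weights, so k_in = 2

# sample() rescales `prob` internally, so the weights need not sum to one.
# x and y are drawn separately, so they are independent of each other.
x <- sample(length(pi_out), n, replace = TRUE, prob = pi_out)
y <- sample(length(pi_in), n, replace = TRUE, prob = pi_in)

# Store memberships as factors with one level per block, so a block
# that happens to receive no nodes still appears as a level.
z_out <- factor(x, levels = seq_along(pi_out))
z_in <- factor(y, levels = seq_along(pi_in))

table(z_out)
table(z_in)
```

Fixing the factor levels up front is what allows a block to exist in the model even when no node was assigned to it.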
Degree heterogeneity
In addition to block membership, the DCSBM also allows
nodes to have different propensities for incoming and
outgoing edge formation.
We represent the propensity to form outgoing edges for a
given node by a positive number \theta_{out}.
We represent the propensity to form incoming edges for a
given node by a positive number \theta_{in}.
Typically the \theta_{out} (and \theta_{in}) across all nodes are
constrained to sum to one for identifiability purposes,
but this constraint is not required for sampling.
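One way to see why the sum-to-one constraint concerns identifiability rather than sampling is the following base R sketch. It assumes a single block on each side with a scalar baseline intensity b, and all numbers are illustrative. Rescaling the theta vectors while moving the scale into b leaves every edge intensity unchanged:

```r
theta_out <- c(2, 1, 1)    # illustrative propensities for 3 nodes
theta_in <- c(1, 3, 1)
b <- 0.5                   # assumed scalar baseline intensity (one block)

# Edge intensities: lambda[i, j] = theta_out[i] * b * theta_in[j]
lambda <- outer(theta_out, theta_in) * b

# Normalize each theta vector to sum to one and absorb the scale into b
s_out <- sum(theta_out)
s_in <- sum(theta_in)
lambda_rescaled <- outer(theta_out / s_out, theta_in / s_in) * (b * s_out * s_in)

# The two parameterizations describe the same intensities
all.equal(lambda, lambda_rescaled)
```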
Edge formation
Once we know the block memberships x and y and the degree
heterogeneity parameters \theta_{out} and \theta_{in}, we need one more
ingredient: the baseline intensity of connections
between nodes in block x and nodes in block y, given by the
entry B_{x, y} of the mixing matrix. Then each edge forms
independently according to a Poisson distribution with
parameter
\lambda = \theta_{out} * B_{x, y} * \theta_{in}.
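The formula can be sketched for a tiny graph in base R. This is a dense, illustrative version only, not the sampler the package uses. The sizes, the uniform theta draws, and the convention that row i of the result indexes the node whose \theta_{out} and x are used are all assumptions of this sketch:

```r
set.seed(2)

n <- 6
# Illustrative 3 x 2 mixing matrix: rows are indexed by x, columns by y
B <- matrix(c(0.9, 0.2,
              0.2, 0.9,
              0.2, 0.2), nrow = 3, byrow = TRUE)

x <- sample(nrow(B), n, replace = TRUE)   # block memberships indexing rows of B
y <- sample(ncol(B), n, replace = TRUE)   # block memberships indexing columns of B
theta_out <- runif(n, 0.5, 1.5)           # assumed positive propensities
theta_in <- runif(n, 0.5, 1.5)

# lambda[i, j] = theta_out[i] * B[x[i], y[j]] * theta_in[j]
lambda <- outer(theta_out, theta_in) * B[x, y]

# Independent Poisson edge counts for every ordered pair of nodes
A <- matrix(rpois(n * n, lambda), nrow = n)
A
```

Because this builds the full n by n intensity matrix, it only makes sense for small n.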
See Also
Other stochastic block models:
dcsbm()
,
mmsbm()
,
overlapping_sbm()
,
planted_partition()
,
sbm()
Other directed graphs:
directed_erdos_renyi()
Examples
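library(fastRG)

set.seed(27)

# 5 x 8 mixing matrix, matching k_out = 5 and k_in = 8
B <- matrix(0.2, nrow = 5, ncol = 8)
diag(B) <- 0.9

ddcsbm <- directed_dcsbm(
  n = 1000,
  B = B,
  k_out = 5,
  k_in = 8,
  expected_density = 0.01
)

ddcsbm

# singular value decomposition of the population model
population_svd <- svds(ddcsbm)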