sim_dTBM {dTBM}    R Documentation

Simulation of degree-corrected tensor block models

Description

Generate order-3 tensor/matrix observations with degree heterogeneity under degree-corrected tensor block models.

Usage

sim_dTBM(
  seed = NA,
  imat = FALSE,
  asymm = FALSE,
  p,
  r,
  core_control = c("random", "control"),
  delta = NULL,
  s_min = NULL,
  s_max = NULL,
  dist = c("normal", "binary"),
  sigma = 1,
  theta_dist = c("abs_normal", "pareto", "non"),
  alpha = NULL,
  beta = NULL
)

Arguments

seed

number, random seed for generating data

imat

logical, if TRUE, generate matrix data; if FALSE, generate order-3 tensor data

asymm

logical, if TRUE, the clustering assignment differs across modes; if FALSE, all modes share the same clustering assignment

p

vector, dimension of the tensor/matrix observation

r

vector, number of clusters on each mode

core_control

character, the way to control the generation of core tensor/matrix; see "details"

delta

number, Frobenius norm of the slices in core tensor if core_control = "control"

s_min

number, value of off-diagonal elements in original core tensor/matrix if core_control = "control"

s_max

number, value of diagonal elements in original core tensor/matrix if core_control = "control"

dist

character, distribution of tensor/matrix observation; see "details"

sigma

number, standard deviation of Gaussian noise if dist = "normal"

theta_dist

character, distribution of degree heterogeneity; see "details"

alpha

number, shape parameter of the Pareto distribution if theta_dist = "pareto"

beta

number, scale parameter of the Pareto distribution if theta_dist = "pareto"

Details

The general tensor observation is generated as

Y = S x1 Theta1 M1 x2 Theta2 M2 x3 Theta3 M3 + E,

where S is the core tensor, Thetak is a diagonal matrix with elements in the k-th vector of theta, Mk is the membership matrix based on the clustering assignment in the k-th vector of z with r[k] clusters, E is the mean-zero noise tensor, and xk refers to the matrix-by-tensor product on the k-th mode, for k = 1,2,3.

If imat = TRUE, Y, S, and E reduce to matrices and Y = Theta1 M1 S M2^T Theta2^T + E.

If asymm = FALSE, Thetak = Theta and Mk = M for all k = 1,2,3.
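The model above can be sketched in base R for the symmetric order-3 case. This is only an illustration of the formula, not the package's internal code; the helper mode_product and all variable names are hypothetical, and theta is left unnormalised here.

```r
# Hypothetical helper: mode-k product of a 3-way array with a matrix.
mode_product <- function(tensor, mat, k) {
  dims <- dim(tensor)
  # Bring mode k to the front and flatten the other two modes into columns.
  perm <- c(k, setdiff(1:3, k))
  unfolded <- matrix(aperm(tensor, perm), nrow = dims[k])
  # Multiply, fold back into an array, and restore the original mode order.
  new_dims <- dims
  new_dims[k] <- nrow(mat)
  folded <- array(mat %*% unfolded, dim = new_dims[perm])
  aperm(folded, order(perm))
}

set.seed(1)
p <- 6
r <- 2
z <- rep(1:r, length.out = p)            # clustering assignment (assumed)
theta <- abs(rnorm(p))                   # degree heterogeneity (assumed)
M <- diag(r)[z, ]                        # p x r membership matrix
ThetaM <- diag(theta) %*% M              # Theta M, shared by all modes
S <- array(runif(r^3), dim = c(r, r, r)) # core tensor

# X = S x1 Theta M x2 Theta M x3 Theta M
X <- S
for (k in 1:3) X <- mode_product(X, ThetaM, k)

# Y = X + E with Gaussian noise
Y <- X + array(rnorm(p^3, sd = 1), dim = c(p, p, p))

# Entry-wise form of the same mean tensor, for comparison
expected <- S[z, z, z] * outer(outer(theta, theta), theta)
```

Entry-wise, the mean is X[i, j, l] = S[z[i], z[j], z[l]] * theta[i] * theta[j] * theta[l], which is what expected computes directly.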

core_control specifies the way to generate S:

If core_control = "control", first generate S as a diagonal tensor/matrix with diagonal elements s_max and off-diagonal elements s_min; then scale this core so that the Frobenius norm of its slices equals delta, i.e., delta = sqrt(sum(S[1,,]^2)) for a tensor or delta = sqrt(sum(S[1,]^2)) for a matrix. The scaling is skipped if delta = NULL. The option "control" is only applicable in the symmetric case asymm = FALSE.

If core_control = "random", generate S with random entries following uniform distribution U(0,1).
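The two core_control options can be sketched in base R as follows. This follows the description above rather than the package source, so the exact order of operations inside sim_dTBM is an assumption; the values of r, s_min, s_max and delta are arbitrary.

```r
r <- 3
s_min <- 0.05
s_max <- 1
delta <- 0.5

# "control", tensor case: s_max on the superdiagonal, s_min elsewhere
S_control <- array(s_min, dim = c(r, r, r))
for (a in 1:r) S_control[a, a, a] <- s_max
# Rescale so that each slice S[a, , ] has Frobenius norm delta
S_control <- S_control * delta / sqrt(sum(S_control[1, , ]^2))

# "control", matrix case: the slices are the rows S[a, ]
S_mat <- matrix(s_min, nrow = r, ncol = r)
diag(S_mat) <- s_max
S_mat <- S_mat * delta / sqrt(sum(S_mat[1, ]^2))

# "random": independent U(0, 1) entries
set.seed(1)
S_random <- array(runif(r^3), dim = c(r, r, r))
```

Because every slice of the unscaled core contains one s_max and otherwise s_min, a single scaling factor brings all slices to norm delta at once.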

dist specifies the distribution of E: "normal" for Gaussian and "binary" for Bernoulli; sigma specifies the standard deviation if dist = "normal".
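A minimal base R sketch of the two options, given a mean tensor X. The binary branch assumes that each entry of Y is drawn as Bernoulli with success probability equal to the corresponding entry of X, so that E = Y - X has mean zero; this reading is an assumption, and it requires the entries of X to lie in [0, 1]. The mean tensor used here is a hypothetical stand-in.

```r
set.seed(1)
# Hypothetical mean tensor with entries inside (0, 1)
X <- array(runif(4 * 4 * 4, min = 0.1, max = 0.9), dim = c(4, 4, 4))
sigma <- 0.5

# dist = "normal": add independent N(0, sigma^2) noise to every entry
Y_normal <- X + array(rnorm(length(X), sd = sigma), dim = dim(X))

# dist = "binary" (assumed form): Bernoulli draws with mean X
Y_binary <- array(rbinom(length(X), size = 1, prob = as.vector(X)),
                  dim = dim(X))
```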

theta_dist specifies the distribution of theta: "non" for constant 1, "abs_normal" for the absolute normal distribution, and "pareto" for the Pareto distribution, with alpha and beta as the shape and scale parameters. After sampling, theta is scaled to have mean equal to one within each cluster.
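The sampling and per-cluster scaling of theta can be sketched in base R as below. The Pareto draw uses inverse-transform sampling with scale beta and shape alpha; whether this matches the parameterisation used inside the package is an assumption, and the helper normalise_by_cluster is hypothetical.

```r
set.seed(1)
p <- 12
r <- 3
z <- sample(rep(1:r, length.out = p))  # clustering assignment (assumed)
alpha <- 4
beta <- 3 / 4

# Raw draws for each option
theta_non <- rep(1, p)                      # "non": constant 1
theta_abs <- abs(rnorm(p))                  # "abs_normal"
theta_par <- beta / runif(p)^(1 / alpha)    # "pareto" (assumed form)

# Hypothetical helper: divide each entry by the mean of its cluster
normalise_by_cluster <- function(theta, z) theta / ave(theta, z)

theta_abs <- normalise_by_cluster(theta_abs, z)
theta_par <- normalise_by_cluster(theta_par, z)

# Cluster-wise means after scaling
tapply(theta_par, z, mean)
```

Dividing by the cluster mean keeps the relative heterogeneity within a cluster while fixing its average at one.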

Value

a list containing the following:

Y array (if imat = FALSE) / matrix (if imat = TRUE), simulated tensor/matrix observations with dimension p

X array (if imat = FALSE) / matrix (if imat = TRUE), mean tensor/matrix of the observation, i.e., the expectation of Y

S array (if imat = FALSE) / matrix (if imat = TRUE), core tensor/matrix recording the block effects with dimension r

theta a list of vectors, degree heterogeneity on each mode

z a list of vectors, clustering assignment on each mode

Examples


test_data = sim_dTBM(seed = 1, imat = FALSE, asymm = FALSE, p = c(50,50,50), r = c(3,3,3),
                    core_control = "control", s_min = 0.05, s_max = 1,
                    dist = "normal", sigma = 0.5,
                    theta_dist = "pareto", alpha = 4, beta = 3/4)

[Package dTBM version 3.0 Index]