R: Covariate Assisted Spectral Clustering.

CAclustering {NAC}

R Documentation

Covariate Assisted Spectral Clustering.

Description

CAclustering clusters graph nodes by applying spectral clustering with the assistance from node covariates.

Usage

CAclustering(Adj, Covariate, K, alphan = 5, itermax = 100, startn = 10)

Arguments

`Adj`	An `n \times n` symmetric adjacency matrix with diagonals being `0` and positive entries being `1`.
`Covariate`	An `n \times p` covariate matrix. The rows correspond to nodes and the columns correspond to covariates.
`K`	A positive integer which is no larger than `n`. This is the predefined number of communities.
`alphan`	The number of candidate `\alpha`'s to try within the range `(\alpha_{min}, \alpha_{max})` given in Binkiewicz et al. (2017). An optimal `\alpha` is expected to achieve a balance between `L_{\tau}` and `X`.
`itermax`	`k-means` parameter, indicating the maximum number of iterations allowed. The default value is 100.
`startn`	`k-means` parameter. The number of times the algorithm should be run with different initial centroids. The default value is 10.

Details

CAclustering is an algorithm designed for community detection in networks with node covariates, as introduced in the paper Covariate-assisted spectral clustering of Binkiewicz et al. (2017). CAclustering applies k-means on the first K leading eigenvectors of L_{\tau}+\alpha XX^{\prime}, where L_{\tau} is the regularized graph Laplacian, X is the covariates matrix, and \alpha is a tuning parameter.

Value

estall

A factor indicating nodes' labels. Items sharing the same label are in the same community.

References

Binkiewicz, N., Vogelstein, J. T., & Rohe, K. (2017). Covariate-assisted spectral clustering. Biometrika, 104(2), 361-377.
doi:10.1093/biomet/asx008

Examples


# Simulate the Network
n = 10; K = 2; p =5; prob1 = 0.9;
theta = 0.4 + (0.45-0.05)*(seq(1:n)/n)^2; Theta = diag(theta);
P  = matrix(c(0.8, 0.2, 0.2, 0.8), byrow = TRUE, nrow = K)
set.seed(2022)
l = sample(1:K, n, replace=TRUE); # node labels
Pi = matrix(0, n, K) # label matrix
for (k in 1:K){
  Pi[l == k, k] = 1
}
Omega = Theta %*% Pi %*% P %*% t(Pi) %*% Theta;
Adj = matrix(runif(n*n, 0, 1), nrow = n);
Adj = Omega - Adj;
Adj = 1*(Adj >= 0)
diag(Adj) = 0
Adj[lower.tri(Adj)] = t(Adj)[lower.tri(Adj)]
Q = 0.1*matrix(sign(runif(p*K) - 0.5), nrow = p);
for(i in 1:K){
  Q[(i-1)*(p/K)+(1:(p/K)), i] = 0.3; #remark. has a change here
}
W = matrix(0, nrow = n, ncol = K);
for(jj in 1:n) {
  pp = rep(1/(K-1), K); pp[l[jj]] = 0;
  if(runif(1) <= prob1) {W[jj, 1:K] = Pi[jj, ];}
  else
  W[jj, sample(K, 1, prob = pp)] = 1;
  }
W = t(W)
D0 = Q %*% W
D = matrix(0, n, p)
for (i in 1:n){
  D[i,] = rnorm(p, mean = D0[,i], sd = 1);
}
CAclustering(Adj, D, 2)

[Package NAC version 0.1.0 Index]