NAC {NAC} | R Documentation |
Spectral Clustering on Network-Adjusted Covariates.
Description
Using network-adjusted covariates to detect underlying communities.
Usage
NAC(Adj, Covariate, K, alpha = NULL, beta = 0, itermax = 100, startn = 10)
Arguments
Adj |
An |
Covariate |
An |
K |
A positive integer which is no larger than |
alpha |
An optional numeric vector to tune the weight of covariate matrix. The default value is
|
beta |
An optional parameter used when the covariate matrix |
itermax |
|
startn |
|
Details
Spectral Clustering Network-Adjusted Covariates (NAC) is fully established in
Network-Adjusted Covariates for Community Detection
of Hu & Wang (2023). This method is particularly effective in the analysis of multiscale networks with
covariates, addressing the challenge of misspecification between networks and covariates. NAC
relies on the construction of network-adjusted
covariate vectors y_i = \alpha_ix_i + \sum_{j: A_{ij}=1}x_j, i \in 1, \cdots, n
, where the first part has
the nodal covariate information and the second part conveys network information.
By constructing Y = (y_1, \cdots, y_n)^{\prime} = AX + D_{\alpha}X
where A
is the adjacency matrix,
X
is the covariate matrix, and D_{\alpha}
is the diagonal matrix with diagonals
as \alpha_1, \cdots, \alpha_n
, NAC
applies K-means
on the first K
normalized left
singular vectors, treating each row as a data point.
A notable feature of NAC
is its tuning-free nature, where node-specific coefficient \alpha_i
is
computed given the i
-th node's degree. NAC
allows for
user-specified \alpha_i
as well. A generalization with uninformative covariates is considered by adjusting
parameter \beta
. As long as the covariates do provide information, the specification of \beta
can be
ignored.
Value
estall |
A factor indicating nodes' labels. Items sharing the same label are in the same community. |
References
Hu, Y., & Wang, W. (2023). Network-Adjusted Covariates for Community Detection.
arXiv preprint arXiv:2306.15616.
https://arxiv.org/abs/2306.15616
Examples
# Simulate the Network
n = 10; K = 2; p =5; prob1 = 0.9;
theta = 0.4 + (0.45-0.05)*(seq(1:n)/n)^2; Theta = diag(theta);
P = matrix(c(0.8, 0.2, 0.2, 0.8), byrow = TRUE, nrow = K)
set.seed(2022)
l = sample(1:K, n, replace=TRUE); # node labels
Pi = matrix(0, n, K) # label matrix
for (k in 1:K){
Pi[l == k, k] = 1
}
Omega = Theta %*% Pi %*% P %*% t(Pi) %*% Theta;
Adj = matrix(runif(n*n, 0, 1), nrow = n);
Adj = Omega - Adj;
Adj = 1*(Adj >= 0)
diag(Adj) = 0
Adj[lower.tri(Adj)] = t(Adj)[lower.tri(Adj)]
Q = 0.1*matrix(sign(runif(p*K) - 0.5), nrow = p);
for(i in 1:K){
Q[(i-1)*(p/K)+(1:(p/K)), i] = 0.3; #remark. has a change here
}
W = matrix(0, nrow = n, ncol = K);
for(jj in 1:n) {
pp = rep(1/(K-1), K); pp[l[jj]] = 0;
if(runif(1) <= prob1) {W[jj, 1:K] = Pi[jj, ];}
else
W[jj, sample(K, 1, prob = pp)] = 1;
}
W = t(W)
D0 = Q %*% W
D = matrix(0, n, p)
for (i in 1:n){
D[i,] = rnorm(p, mean = D0[,i], sd = 1);
}
NAC(Adj, D, 2)