bipartite_rank {birankr} | R Documentation |
Bipartite Ranks
Description
Estimate bipartite ranks (centrality scores) of nodes from an edge list or adjacency matrix. Functions as a wrapper for estimating rank based on a number of normalizers (algorithms) including HITS, CoHITS, BGRM, and BiRank. Returns a vector of ranks or (optionally) a list containing a vector for each mode. If the provided data is an edge list, this function returns ranks ordered by the unique values in the supplied edge list.
Usage
bipartite_rank(
  data,
  sender_name = NULL,
  receiver_name = NULL,
  weight_name = NULL,
  rm_weights = FALSE,
  duplicates = c("add", "remove"),
  normalizer = c("HITS", "CoHITS", "BGRM", "BiRank"),
  return_mode = c("rows", "columns", "both"),
  return_data_frame = TRUE,
  alpha = 0.85,
  beta = 0.85,
  max_iter = 200,
  tol = 1e-04,
  verbose = FALSE
)
Arguments
data |
Data to use for estimating rank. Must contain bipartite graph data, either formatted as an edge list (class data.frame, data.table, or tibble (tbl_df)) or as an adjacency matrix (class matrix or dgCMatrix). |
sender_name |
Name of sender column. Parameter ignored if data is an adjacency matrix. Defaults to first column of edge list. |
receiver_name |
Name of receiver column. Parameter ignored if data is an adjacency matrix. Defaults to the second column of edge list. |
weight_name |
Name of edge weight column. Parameter ignored if data is an adjacency matrix. If left as NULL, every edge is given a weight of 1. |
rm_weights |
Removes edge weights from graph object before estimating rank. Parameter ignored if data is an edge list. Defaults to FALSE. |
duplicates |
How to treat duplicate edges if any in data. Parameter ignored if data is an adjacency matrix. If option "add" is selected, duplicated edges and corresponding edge weights are collapsed via addition. Otherwise, duplicated edges are removed and only the first instance of a duplicated edge is used. Defaults to "add". |
normalizer |
Normalizer (algorithm) used for estimating node ranks (centrality scores). Options include HITS, CoHITS, BGRM, and BiRank. Defaults to HITS. |
return_mode |
Mode for which to return ranks. Options are "rows", "columns", and "both". Defaults to "rows" (the first column of an edge list). |
return_data_frame |
Return results as a data frame with node names in the first column and ranks in the second column. If set to FALSE, the function just returns a named vector of ranks. Defaults to TRUE. |
alpha |
Damping factor for first mode of data. Defaults to 0.85. |
beta |
Damping factor for second mode of data. Defaults to 0.85. |
max_iter |
Maximum number of iterations to run. If the model has not converged by then, estimation stops. Defaults to 200. |
tol |
Tolerance used to judge model convergence. Defaults to 1.0e-4. |
verbose |
Show the progress of this function. Defaults to FALSE. |
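To make the two duplicates options above concrete, the following base R sketch mimics them on a toy edge list. It is only an illustration of the described behaviour, not the package's internal code; the column names sender, receiver, and weight are made up for the example.

```r
# Toy edge list in which the edge a -> x appears twice.
edges <- data.frame(
  sender   = c("a", "a", "b", "a"),
  receiver = c("x", "y", "x", "x"),
  weight   = c(1, 2, 1, 3),
  stringsAsFactors = FALSE
)

# duplicates = "add": collapse repeated sender/receiver pairs by summing weights.
added <- aggregate(weight ~ sender + receiver, data = edges, FUN = sum)

# duplicates = "remove": keep only the first occurrence of each pair.
removed <- edges[!duplicated(edges[, c("sender", "receiver")]), ]
```

Under "add" the a -> x edge ends up with the summed weight of both occurrences, whereas under "remove" it keeps the weight of its first occurrence.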
Details
For information about the different normalizers available in this function, see the descriptions for the HITS, CoHITS, BGRM, and BiRank functions. The table below outlines the key differences between the normalizers, where K_d and K_p are diagonal matrices with generalized degrees (sums of the edge weights) on the diagonal, i.e. (K_d)_{ii} = \sum_j w_{ij} and (K_p)_{jj} = \sum_i w_{ij}.
Transition matrix | S_p | S_d |
--------------------- | --------------------- | --------------------- |
HITS | W^T | W |
CoHITS | W^T K_d^{-1} | W K_p^{-1} |
BGRM | K_p^{-1} W^T K_d^{-1} | K_d^{-1} W K_p^{-1} |
BiRank | K_p^{-1/2} W^T K_d^{-1/2} | K_d^{-1/2} W K_p^{-1/2} |
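As a rough illustration of the table above, the following base R sketch builds S_p and S_d for each normalizer from a small weight matrix W and then runs a damped iteration on them. The helper names deg_pow, transition_matrices, and iterate_ranks are hypothetical and not part of birankr. The update rule and the uniform starting vectors are assumptions based on the general damped-ranking form, not a description of the package internals.

```r
# Toy weighted bipartite graph: rows are the first mode (d), columns the second (p).
W <- matrix(c(1, 0,
              2, 1,
              0, 3), nrow = 3, byrow = TRUE)

# Hypothetical helper: diagonal matrix holding degrees raised to a power.
deg_pow <- function(k, power) diag(k^power, nrow = length(k))

# Hypothetical helper: build S_p and S_d for one normalizer, following the table.
transition_matrices <- function(W, normalizer) {
  k_d <- rowSums(W)  # (K_d)_{ii} = sum_j w_{ij}
  k_p <- colSums(W)  # (K_p)_{jj} = sum_i w_{ij}
  switch(normalizer,
    HITS   = list(S_p = t(W), S_d = W),
    CoHITS = list(S_p = t(W) %*% deg_pow(k_d, -1),
                  S_d = W %*% deg_pow(k_p, -1)),
    BGRM   = list(S_p = deg_pow(k_p, -1) %*% t(W) %*% deg_pow(k_d, -1),
                  S_d = deg_pow(k_d, -1) %*% W %*% deg_pow(k_p, -1)),
    BiRank = list(S_p = deg_pow(k_p, -0.5) %*% t(W) %*% deg_pow(k_d, -0.5),
                  S_d = deg_pow(k_d, -0.5) %*% W %*% deg_pow(k_p, -0.5)),
    stop("unknown normalizer: ", normalizer)
  )
}

# Assumed damped update (not taken from this page), with uniform query vectors:
#   p <- alpha * S_p %*% d + (1 - alpha) * p0
#   d <- beta  * S_d %*% p + (1 - beta)  * d0
# No rescaling is done, so this sketch only suits normalized transition matrices.
iterate_ranks <- function(S_p, S_d, alpha = 0.85, beta = 0.85,
                          max_iter = 200, tol = 1e-4) {
  p0 <- rep(1 / nrow(S_p), nrow(S_p))
  d0 <- rep(1 / nrow(S_d), nrow(S_d))
  p <- p0
  d <- d0
  iterations <- 0
  converged <- FALSE
  for (i in seq_len(max_iter)) {
    p_new <- as.vector(alpha * S_p %*% d + (1 - alpha) * p0)
    d_new <- as.vector(beta * S_d %*% p_new + (1 - beta) * d0)
    # Stop once the largest change in either mode falls below tol.
    converged <- max(abs(p_new - p), abs(d_new - d)) < tol
    p <- p_new
    d <- d_new
    iterations <- i
    if (converged) break
  }
  list(rows = d, columns = p, iterations = iterations, converged = converged)
}

birank <- transition_matrices(W, "BiRank")
res <- iterate_ranks(birank$S_p, birank$S_d)
```

Note that W has no zero-degree rows or columns here; isolates would produce infinite entries in the inverse degree matrices and would need to be dropped first.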
Value
A data frame containing each node name and node rank. If return_data_frame is set to FALSE or the input data is an adjacency matrix, returns a vector of node ranks. Does not return node ranks for isolates.
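Because isolates receive no rank, the output can have fewer entries than the input has nodes. One way to see in advance which nodes would be affected is to look for all-zero rows and columns of the adjacency matrix. This is a minimal base R sketch with made-up node names, not a birankr function.

```r
# Toy adjacency matrix: row r2 and column c2 have no edges.
A <- matrix(c(1, 0, 0,
              0, 0, 0,
              2, 0, 1), nrow = 3, byrow = TRUE,
            dimnames = list(c("r1", "r2", "r3"), c("c1", "c2", "c3")))

# A node is an isolate if it has no non-zero entries in its row (or column).
row_isolates <- rownames(A)[rowSums(A != 0) == 0]
col_isolates <- colnames(A)[colSums(A != 0) == 0]
```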
Examples
#load required packages
library(data.table)
library(birankr)

#create edge list between patients and providers
df <- data.table(
  patient_id = sample(x = 1:10000, size = 10000, replace = TRUE),
  provider_id = sample(x = 1:5000, size = 10000, replace = TRUE)
)

#estimate CoHITS ranks
CoHITS <- bipartite_rank(data = df, normalizer = "CoHITS")