tgs_graph_cover_resample {tgstat} | R Documentation |
Clusters directed graph multiple times with randomized sample subset
Description
Clusters directed graph multiple times with randomized sample subset.
Usage
tgs_graph_cover_resample(
graph,
knn,
min_cluster_size,
cooling = 1.05,
burn_in = 10,
p_resamp = 0.75,
n_resamp = 500,
method = "hash"
)
Arguments
graph |
directed graph in the format returned by tgs_graph |
knn |
maximal number of edges used per node for each sample subset |
min_cluster_size |
used to determine the candidates for seeding (= min weight) |
cooling |
factor that is used to gradually increase the chance of a node to stay in the cluster |
burn_in |
number of node reassignments after which cooling is applied |
p_resamp |
fraction of total number of nodes used in each sample subset |
n_resamp |
number iterations the clustering is run on different sample subsets |
method |
method for calculating co_cluster and co_sample; valid values: "hash", "full", "edges" |
Details
The algorithm is explained in a "MetaCell: analysis of single-cell RNA-seq data using K-nn graph partitions" paper, published in "Genome Biology" #20: https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1812-2
Value
If method == "hash", a list with two members. The first member is a data frame with 3 columns: "node1", "node2" and "cnt". "cnt" indicates the number of times "node1" and "node2" appeared in the same cluster. The second member of the list is a vector of number of nodes length reflecting how many times each node was used in the subset.
If method == "full", a list containing two matrices: co_cluster and co_sample.
If method == "edges", a list containing two data frames: co_cluster and co_sample.
See Also
Examples
# Note: all the available CPU cores might be used
set.seed(seed = 0)
rows <- 100
cols <- 200
vals <- sample(1:(rows * cols / 2), rows * cols, replace = TRUE)
m <- matrix(vals, nrow = rows, ncol = cols)
r1 <- tgs_cor(m, pairwise.complete.obs = FALSE, spearman = TRUE)
r2 <- tgs_knn(r1, knn = 20, diag = FALSE)
r3 <- tgs_graph(r2, knn = 3, k_expand = 10)
r4 <- tgs_graph_cover_resample(r3, 10, 1)