seqCluster {immunarch} | R Documentation |
Function for assigning clusters based on sequence similarity
Description
Graph clustering based on distances between sequences
Usage
seqCluster(.data, .dist, .perc_similarity, .nt_similarity, .fixed_threshold)
Arguments
.data |
The data that was used to calculate the .dist object. Can be a data.frame, a data.table, or a list of these objects. Every object must have columns in the immunarch-compatible format (see immunarch_data_format). |
.dist |
List of distance objects produced with seqDist function. |
.perc_similarity |
Numeric value between 0 and 1 specifying the maximum acceptable weight of an edge in a graph. This threshold depends on the length of sequences. |
.nt_similarity |
Numeric value between 0 and the sequence length specifying the threshold that allows a 1 in n nucleotides mismatch between sequences. |
.fixed_threshold |
Numeric specifying the threshold on the maximum weight of an edge in a graph. |
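The threshold arguments above all bound the weight of an edge in a graph built from pairwise distances. The following is a minimal sketch of that idea in base R, not the package's implementation: it assumes Hamming distance, treats clusters as connected components of the thresholded graph, and uses a hypothetical helper named threshold_clusters. The algorithm seqCluster actually uses may differ.

```r
# Illustrative only: base R, no immunarch calls.
seqs <- c("CASSL", "CASSF", "CASTF", "GGGGG")

# Hamming distance between two equal-length strings.
hamming <- function(a, b) sum(strsplit(a, "")[[1]] != strsplit(b, "")[[1]])

# Pairwise distance matrix over all sequences.
d <- outer(seq_along(seqs), seq_along(seqs),
           Vectorize(function(i, j) hamming(seqs[i], seqs[j])))

# Hypothetical helper (an assumption, not part of immunarch): label the
# connected components of the graph whose edges are the pairs with
# distance <= threshold.
threshold_clusters <- function(d, threshold) {
  n <- nrow(d)
  cluster <- rep(NA_integer_, n)
  current <- 0L
  for (start in seq_len(n)) {
    if (!is.na(cluster[start])) next
    current <- current + 1L
    cluster[start] <- current
    queue <- start
    # Breadth-first walk over edges that pass the threshold.
    while (length(queue) > 0) {
      node <- queue[1]
      queue <- queue[-1]
      neighbours <- which(d[node, ] <= threshold & is.na(cluster))
      cluster[neighbours] <- current
      queue <- c(queue, neighbours)
    }
  }
  cluster
}

clusters <- threshold_clusters(d, threshold = 1)
data.frame(Sequence = seqs, Cluster = clusters)
```

With a threshold of 1, the first three sequences are linked through single mismatches and share a cluster, while the fourth stays on its own. Lowering the threshold to 0 would place every distinct sequence in a separate cluster.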
Value
Object in the immdata data format. Same as .data, but with an extra 'Cluster' column containing the assigned clusters.
Examples
data(immdata)
# To save time, this example uses only 2 samples with 500 clonotypes each
input_data <- lapply(immdata$data[1:2], head, 500)
dist_result <- seqDist(input_data)
cluster_result <- seqCluster(input_data, dist_result, .fixed_threshold = 1)
[Package immunarch version 0.9.1 Index]