get_sig_similarity {sigminer} | R Documentation |
Calculate Similarity between Identified Signatures and Reference Signatures
Description
The reference signatures can be either a Signature object specified by the Ref argument
or known COSMIC signatures specified by the sig_db argument.
Two COSMIC databases are used for comparisons: "legacy", which includes 30 signatures,
and "SBS", which includes 65 updated/refined signatures. This function is modified
from compareSignatures() in the maftools package.
NOTE: all reference signatures are generated by the gold-standard tool SigProfiler.
Usage
get_sig_similarity(
  Signature,
  Ref = NULL,
  sig_db = c("SBS", "legacy", "DBS", "ID", "TSB", "SBS_Nik_lab", "RS_Nik_lab",
    "RS_BRCA560", "RS_USARC", "CNS_USARC", "CNS_TCGA", "CNS_TCGA176", "CNS_PCAWG176",
    "SBS_hg19", "SBS_hg38", "SBS_mm9", "SBS_mm10", "DBS_hg19", "DBS_hg38", "DBS_mm9",
    "DBS_mm10", "SBS_Nik_lab_Organ", "RS_Nik_lab_Organ", "latest_SBS_GRCh37",
    "latest_DBS_GRCh37", "latest_ID_GRCh37", "latest_SBS_GRCh38", "latest_DBS_GRCh38",
    "latest_SBS_mm9", "latest_DBS_mm9", "latest_SBS_mm10", "latest_DBS_mm10",
    "latest_SBS_rn6", "latest_DBS_rn6", "latest_CN_GRCh37",
    "latest_RNA-SBS_GRCh37", "latest_SV_GRCh38"),
  db_type = c("", "human-exome", "human-genome"),
  method = "cosine",
  normalize = c("row", "feature"),
  feature_setting = sigminer::CN.features,
  set_order = TRUE,
  pattern_to_rm = NULL,
  verbose = TRUE
)
Arguments
Signature |
a |
Ref |
default is |
sig_db |
default 'legacy', it can be 'legacy' (for COSMIC v2 'SBS'),
'SBS', 'DBS', 'ID' and 'TSB' (for COSMIC v3.1 signatures)
for small scale mutations.
For more specific details, it can also be 'SBS_hg19', 'SBS_hg38',
'SBS_mm9', 'SBS_mm10', 'DBS_hg19', 'DBS_hg38', 'DBS_mm9', 'DBS_mm10' to use
COSMIC v3 reference signatures from Alexandrov, Ludmil B., et al. (2020) (reference #1).
In addition, it can be one of "SBS_Nik_lab_Organ", "RS_Nik_lab_Organ",
"SBS_Nik_lab", "RS_Nik_lab" to use reference signatures from
Degasperi, Andrea, et al. (2020) (reference #2);
"RS_BRCA560", "RS_USARC" to use reference signatures from the BRCA560 and USARC cohorts;
"CNS_USARC" (40 categories), "CNS_TCGA" (48 categories) to use reference copy number signatures from the USARC cohort and TCGA;
"CNS_TCGA176" (176 categories) and "CNS_PCAWG176" (176 categories) to use reference copy number signatures from TCGA and PCAWG, respectively.
UPDATE: the latest version of the reference signatures can be automatically
downloaded and loaded from https://cancer.sanger.ac.uk/signatures/downloads/
when an option with |
db_type |
only used when |
method |
default is 'cosine' for cosine similarity. |
normalize |
one of "row" and "feature". "row" is typically used for common mutational signatures. "feature" was designed by the author for use when the inputs are copy number signatures. |
feature_setting |
a |
set_order |
if |
pattern_to_rm |
patterns for removing some features/components from the similarity
calculation. A vector of component names is also accepted.
The removal is performed after normalization. Default is |
verbose |
if |
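To make the method and pattern_to_rm arguments more concrete, the following base R sketch computes cosine similarity between the columns of two toy signature matrices, optionally dropping some components first. It is not sigminer's implementation: the helper name cosine_sim_matrix, its drop_components argument, the toy values, and the choice to match component names exactly (rather than as patterns) are all assumptions made for illustration.

```r
# Hypothetical helper (not part of sigminer): cosine similarity between every
# column of `sig` and every column of `ref`.
# Both matrices have rows = components and columns = signatures.
cosine_sim_matrix <- function(sig, ref, drop_components = NULL) {
  stopifnot(identical(rownames(sig), rownames(ref)))
  if (!is.null(drop_components)) {
    # Assumption: components are dropped by exact name
    keep <- !rownames(sig) %in% drop_components
    sig <- sig[keep, , drop = FALSE]
    ref <- ref[keep, , drop = FALSE]
  }
  # Dot products of all column pairs, divided by the product of column norms
  dot <- crossprod(sig, ref)
  norms <- outer(sqrt(colSums(sig^2)), sqrt(colSums(ref^2)))
  dot / norms
}

# Toy data with made-up values
components <- c("A[C>A]A", "A[C>A]C", "T[T>G]C", "T[T>G]G")
sig <- matrix(c(0.4, 0.3, 0.2, 0.1,
                0.1, 0.1, 0.4, 0.4),
              nrow = 4, dimnames = list(components, c("Sig1", "Sig2")))
ref <- matrix(c(0.4, 0.3, 0.2, 0.1,
                0.25, 0.25, 0.25, 0.25),
              nrow = 4, dimnames = list(components, c("RefA", "RefB")))

# Similarity over all components, then with two components removed
sim_all <- cosine_sim_matrix(sig, ref)
sim_sub <- cosine_sim_matrix(sig, ref,
                             drop_components = c("T[T>G]C", "T[T>G]G"))
```

Because cosine similarity ignores the overall scale of each signature, removing components is what changes the result in this sketch: Sig2 and RefB become identical in direction once the two T[T>G] components are dropped.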
Value
a list containing similarities, aetiologies if available, best match and RSS.
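As a rough illustration of how a best match and an RSS value can be derived from a similarity matrix, consider the base R sketch below. It does not reproduce the structure or element names of the list returned by get_sig_similarity(); the toy matrices, the variable names, and the RSS definition (sum of squared differences between a signature and its best-matching reference) are assumptions.

```r
# Toy inputs (made-up values): rows = components, columns = signatures
comp <- c("c1", "c2", "c3")
sig <- matrix(c(0.7, 0.2, 0.1,
                0.1, 0.3, 0.6),
              nrow = 3, dimnames = list(comp, c("Sig1", "Sig2")))
ref <- matrix(c(0.6, 0.3, 0.1,
                0.2, 0.2, 0.6,
                1 / 3, 1 / 3, 1 / 3),
              nrow = 3, dimnames = list(comp, c("RefA", "RefB", "RefC")))

# Cosine similarity for every (signature, reference) pair
sim <- crossprod(sig, ref) /
  outer(sqrt(colSums(sig^2)), sqrt(colSums(ref^2)))

# Best match: the reference with the highest similarity for each signature
best_idx <- apply(sim, 1, which.max)
best_match <- data.frame(
  signature = rownames(sim),
  best_ref = colnames(sim)[best_idx],
  similarity = sim[cbind(seq_len(nrow(sim)), best_idx)]
)

# Assumed RSS definition: squared differences summed over components,
# computed between each signature and its best-matching reference
rss <- vapply(seq_along(best_idx), function(i) {
  sum((sig[, i] - ref[, best_idx[i]])^2)
}, numeric(1))
names(rss) <- rownames(sim)
```

A high similarity together with a small RSS suggests that the identified signature closely resembles its best-matching reference in both shape and magnitude.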
Author(s)
Shixiang Wang w_shixiang@163.com
References
Alexandrov, Ludmil B., et al. "The repertoire of mutational signatures in human cancer." Nature 578.7793 (2020): 94-101.
Degasperi, Andrea, et al. "A practical framework and online tool for mutational signature analyses show intertissue variation and driver dependencies." Nature Cancer 1.2 (2020): 249-263.
Steele, Christopher D., et al. "Undifferentiated sarcomas develop through distinct evolutionary pathways." Cancer Cell 35.3 (2019): 441-456.
Nik-Zainal, Serena, et al. "Landscape of somatic mutations in 560 breast cancer whole-genome sequences." Nature 534.7605 (2016): 47-54.
Steele, Christopher D., et al. "Signatures of copy number alterations in human cancer." Nature 606.7916 (2022): 984-991.
Examples
# Load mutational signature
load(system.file("extdata", "toy_mutational_signature.RData",
  package = "sigminer", mustWork = TRUE
))
s1 <- get_sig_similarity(sig2, Ref = sig2)
s1
s2 <- get_sig_similarity(sig2)
s2
s3 <- get_sig_similarity(sig2, sig_db = "SBS")
s3
# Set order for result similarity matrix
s4 <- get_sig_similarity(sig2, sig_db = "SBS", set_order = TRUE)
s4
## Remove some components from the similarity calculation
s5 <- get_sig_similarity(sig2,
  Ref = sig2,
  pattern_to_rm = c("T[T>G]C", "T[T>G]G", "T[T>G]T")
)
s5
## The same applies to DBS and ID signatures
x1 <- get_sig_db("DBS_hg19")
x2 <- get_sig_db("DBS_hg38")
s6 <- get_sig_similarity(x1$db, x2$db)
s6