SeedMatchR {SeedMatchR} | R Documentation |
Find seed matches in genomic features
Description
Find seed matches in a DNAStringSet object of sequences. This function will use get.seed to extract the seed sequence from the guide sequence. The seed is then searched across all rows of the DNAStringSet object using vpatterncount.
This function returns the input DESeq2 results data.frame with an additional column that contains the counts for the input seed.name.
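The two steps described above (extract the seed, then count its occurrences in each sequence) can be sketched directly with Biostrings. This is a minimal illustration rather than the package's implementation. It assumes that the mer7m8 seed corresponds to guide positions 2 to 8, that the searched site is the reverse complement of the seed written as DNA, and that the counting step behaves like Biostrings' vcountPattern. The transcript sequences are made up.

```r
library(Biostrings)

# Guide sequence taken from the Examples section
guide <- RNAString("UUAUAGAGCAAGAACACUGUUUU")

# Assumption: the mer7m8 seed is guide positions 2-8
seed <- subseq(guide, start = 2, end = 8)

# Assumption: the site searched in the transcripts is the reverse
# complement of the seed, expressed as DNA
target <- reverseComplement(DNAString(seed))

# Hypothetical transcript sequences, for illustration only
seqs <- DNAStringSet(c(
  tx1 = "AAACTCTATAGGGCTCTATA",
  tx2 = "GGGGGGGGGGGG",
  tx3 = "ACTCTATAC"
))

# Count exact occurrences of the target site in every sequence
counts <- vcountPattern(target, seqs, max.mismatch = 0, with.indels = FALSE)
names(counts) <- names(seqs)
counts
```

With these toy inputs the target site is CTCTATA, and the counts are 2, 0 and 1 for tx1, tx2 and tx3.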
Usage
SeedMatchR(
res,
gtf,
seqs,
sequence,
seed.name = "mer7m8",
col.name = NULL,
mismatches = 0,
indels = FALSE,
tx.id.col = TRUE
)
Arguments
res |
A DESeq2 results data.frame |
gtf |
GTF file used to map features to genes. The object must have columns transcript_id and gene_id |
seqs |
The |
sequence |
The |
seed.name |
The name of the specific seed to extract. Options are: mer8, mer7A1, mer7m8, mer6 |
col.name |
The string to use for the column name. Defaults to the seed name |
mismatches |
The number of mismatches to allow in the search |
indels |
Whether to allow indels in the search |
tx.id.col |
Use the transcript_id column instead of gene_id |
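The effect of the mismatches argument can be illustrated with Biostrings' vcountPattern. This sketch assumes that mismatches is passed through as max.mismatch, which the documentation above does not state explicitly. The site and sequences are hypothetical, and indels are not shown.

```r
library(Biostrings)

# Hypothetical 7-nt seed match site
site <- DNAString("CTCTATA")

# Toy sequences: one contains the exact site, the other differs
# from it by a single substitution (A -> T at site position 5)
seqs <- DNAStringSet(c(
  perfect = "AAACTCTATAGGG",
  one_sub = "AAACTCTTTAGGG"
))

# Exact search: only the perfect site is counted
exact <- vcountPattern(site, seqs, max.mismatch = 0)

# Allowing one mismatch also counts the substituted site
relaxed <- vcountPattern(site, seqs, max.mismatch = 1)

data.frame(sequence = names(seqs), exact = exact, relaxed = relaxed)
```

Allowing mismatches raises the number of reported sites, so counts obtained with different mismatches settings are not directly comparable.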
Value
A modified DESeq2 results data.frame with an additional column, named after the chosen seed (or after col.name if supplied), that contains the number of seed matches for each row.
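The shape of the returned object can be mimicked in base R. The sketch below is not the package's code: the data.frame, the match counts and the joining logic are hypothetical stand-ins, used only to show how the column name falls back to seed.name when col.name is NULL.

```r
# Toy stand-in for a DESeq2 results data.frame (values are made up)
res <- data.frame(
  gene_id = c("geneA", "geneB", "geneC"),
  log2FoldChange = c(-1.2, 0.1, -0.4)
)

# Hypothetical per-gene seed match counts from a counting step
match_counts <- c(geneA = 2L, geneB = 0L, geneC = 1L)

seed.name <- "mer7m8"
col.name <- NULL

# Fall back to the seed name when no column name is given
if (is.null(col.name)) col.name <- seed.name

# Attach the counts as a new column, aligned on gene_id
res[[col.name]] <- unname(match_counts[res$gene_id])

# Rows with at least one seed match
res_with_match <- res[res[[col.name]] > 0, ]
res_with_match
```

Here the new column is called mer7m8, and geneA and geneC are the rows with at least one match.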
Examples
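library(dplyr)
library(SeedMatchR)
# Guide sequence from which the seed is extracted
seq <- "UUAUAGAGCAAGAACACUGUUUU"
# Load the annotations and extract the feature sequences to search
anno.db <- load_species_anno_db("human")
features <- get_feature_seqs(anno.db$tx.db, anno.db$dna)
# Load test data
res <- Schlegel_2022_Ttr_D1_30mkg
# Filter DESeq2 results for SeedMatchR
res <- filter_deseq(res, fdr.cutoff = 1, fc.cutoff = 0, rm.na.log2fc = TRUE)
# Add a column with the mer7m8 seed match counts
res <- SeedMatchR(res, anno.db$gtf, features$seqs, seq, seed.name = "mer7m8")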