expectedMutations {shazam} | R Documentation |
Calculate expected mutation frequencies
Description
expectedMutations calculates the expected mutation frequencies for each sequence in the input data.frame.
Usage
expectedMutations(
  db,
  sequenceColumn = "sequence_alignment",
  germlineColumn = "germline_alignment",
  targetingModel = HH_S5F,
  regionDefinition = NULL,
  mutationDefinition = NULL,
  nproc = 1,
  cloneColumn = "clone_id",
  juncLengthColumn = "junction_length"
)
Arguments
db |
|
sequenceColumn |
|
germlineColumn |
|
targetingModel |
TargetingModel object. Default is HH_S5F. |
regionDefinition |
RegionDefinition object defining the regions
and boundaries of the Ig sequences. To use regions definitions,
sequences in |
mutationDefinition |
MutationDefinition object defining replacement
and silent mutation criteria. If |
nproc |
|
cloneColumn |
clone id column name in db. |
juncLengthColumn |
junction length column name in db. |
Details
Only the part of each sequence defined in regionDefinition is analyzed. For example, when using the IMGT_V definition, mutations in positions beyond 312 will be ignored.
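The effect of the position cutoff can be illustrated with a small base R sketch. This is only an illustration of the behaviour described above, not the way shazam applies a RegionDefinition internally; the helper trim_to_region and the toy sequences are hypothetical.

```r
# Hypothetical helper: keep only positions 1..last_pos of a sequence.
# 312 is the cutoff quoted above for the IMGT_V definition.
trim_to_region <- function(seq, last_pos = 312L) {
  substr(seq, 1L, last_pos)
}

# Toy 400-nucleotide germline and an "observed" copy that differs
# only at position 350, i.e. beyond the cutoff.
germline <- paste(rep("ACGT", 100), collapse = "")
observed <- germline
substr(observed, 350, 350) <- "A"

# The full sequences differ, but the analyzed parts are identical,
# so the change at position 350 plays no role.
full_differs <- observed != germline
trimmed_same <- identical(trim_to_region(observed), trim_to_region(germline))
trimmed_len  <- nchar(trim_to_region(observed))
```

A sequence shorter than the cutoff is returned unchanged by substr, so the helper is safe to apply to truncated alignments.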
Value
A modified db data.frame with expected mutation frequencies for each region defined in regionDefinition.
The column names are dynamically created based on the regions in regionDefinition. For example, when using the IMGT_V definition, which defines positions for CDR and FWR, the following columns are added:
- mu_expected_cdr_r: expected frequency of replacement mutations in CDR1 and CDR2 of the V-segment.
- mu_expected_cdr_s: expected frequency of silent mutations in CDR1 and CDR2 of the V-segment.
- mu_expected_fwr_r: expected frequency of replacement mutations in FWR1, FWR2 and FWR3 of the V-segment.
- mu_expected_fwr_s: expected frequency of silent mutations in FWR1, FWR2 and FWR3 of the V-segment.
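The naming pattern of these columns can be reproduced with a few lines of base R. This is a sketch of the pattern visible in the list above (prefix, region, mutation type), assuming the region labels "cdr" and "fwr" and the type labels "r" and "s"; it is not taken from the shazam source code.

```r
# Region and mutation-type labels as they appear in the column names above.
regions <- c("cdr", "fwr")
types   <- c("r", "s")   # r = replacement, s = silent

# expand.grid varies its first argument fastest, so listing `type` first
# yields cdr_r, cdr_s, fwr_r, fwr_s in that order.
grid <- expand.grid(type = types, region = regions, stringsAsFactors = FALSE)
col_names <- paste0("mu_expected_", grid$region, "_", grid$type)
```

With a different RegionDefinition the set of region labels changes, and the set of added columns changes with it.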
See Also
calcExpectedMutations is called by this function to calculate the expected mutation frequencies. See observedMutations for getting observed mutation counts. See IMGT_SCHEMES for a set of predefined RegionDefinition objects.
Examples
# Subset example data
data(ExampleDb, package="alakazam")
db <- subset(ExampleDb, c_call %in% c("IGHA", "IGHG") & sample_id == "+7d")
set.seed(112)
db <- dplyr::slice_sample(db, n=100)
# Calculate expected mutations over V region
db_exp <- expectedMutations(db,
                            sequenceColumn="sequence_alignment",
                            germlineColumn="germline_alignment_d_mask",
                            regionDefinition=IMGT_V,
                            nproc=1)
# Calculate expected mutations over V region using hydropathy-based mutation definitions
db_exp <- expectedMutations(db,
                            sequenceColumn="sequence_alignment",
                            germlineColumn="germline_alignment_d_mask",
                            regionDefinition=IMGT_V,
                            mutationDefinition=HYDROPATHY_MUTATIONS,
                            nproc=1)
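Once the expected columns have been added, they can be picked out by their shared prefix. The sketch below uses a toy stand-in data.frame (db_toy, hypothetical) so that it runs without shazam installed; the numeric values are arbitrary placeholders, not real output.

```r
# Toy stand-in for a result of expectedMutations() with the IMGT_V definition.
# Column names follow the Value section; values are arbitrary placeholders.
db_toy <- data.frame(
  sequence_id       = c("seq1", "seq2"),
  mu_expected_cdr_r = c(0.20, 0.25),
  mu_expected_cdr_s = c(0.05, 0.05),
  mu_expected_fwr_r = c(0.55, 0.50),
  mu_expected_fwr_s = c(0.20, 0.20)
)

# Select every column whose name starts with "mu_expected_".
expected_cols <- grep("^mu_expected_", names(db_toy), value = TRUE)
expected_view <- db_toy[, c("sequence_id", expected_cols)]
```

The same grep call applied to names(db_exp) should work on a real result, since it relies only on the column prefix.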