merge_replicates {vivaldi}    R Documentation

merge_replicates

Description

Merges the variants from two replicate VCF files into a single data frame, keeping only the variants detected in both replicates

Usage

merge_replicates(vardf, repdata, nameofrep1, nameofrep2, commoncols)

Arguments

vardf

Data frame of variants

repdata

Data frame of replicate information

nameofrep1

Name of the variable representing the first replicate; must be written in quotes (e.g. "rep1")

nameofrep2

Name of the variable representing the second replicate; must be written in quotes (e.g. "rep2")

commoncols

Character vector of column names to merge the replicates by

Value

A data frame of the merged replicates, containing only the variants detected in both replicates

Examples

df <- data.frame(sample = c("m1", "m2", "m1", "m2", "m1"),
                 CHROM = c("PB1", "PB1", "PB2", "PB2", "NP"),
                 POS = c(234, 234, 240, 240, 254),
                 REF = c("G", "G", "A", "A", "C"),
                 ALT = c("A", "A", "G", "G", "T"),
                 minorfreq = c(0.010, 0.022, 0.043, 0.055, 0.011),
                 majorfreq = c(0.990, 0.978, 0.957, 0.945, 0.989),
                 minorcount = c(7, 15, 26, 32, 7),
                 majorcount = c(709, 661, 574, 547, 610),
                 gt_DP = c(716, 676, 600, 579, 617)
)

# Dataframe shows a pair of replicates and their variants at 3 positions.
df

replicates <- data.frame(filename = c("m1","m2"),
                         replicate = c("rep1", "rep2"),
                         sample = c("a_2_iv", "a_2_iv")
)

# Dataframe showing relationship between filename, replicate, and sample name
replicates

# Merge by the following columns
cols <- c("sample","CHROM","POS","REF","ALT")

merge_replicates(df, replicates, "rep1", "rep2", cols)
# The dataframe now contains the 2 variants at positions 234 & 240 that were
# detected in both sequencing replicates whereas the variant at position 254
# was only in a single replicate so it was removed during the merge.
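
The filtering described above can be approximated with base R alone. The following is a minimal sketch, not vivaldi's actual implementation: sketch_merge() is a hypothetical helper, and it assumes that the sample column of the variant data frame holds the file names listed in the filename column of the replicate data frame. The columns returned by merge_replicates may differ from those produced here.

```r
# Illustrative sketch only; sketch_merge() is a hypothetical helper,
# not a vivaldi function, and vivaldi's internals may differ.
sketch_merge <- function(vardf, repdata, rep1, rep2, commoncols) {
  # Assumption: vardf$sample holds the file names listed in repdata$filename.
  names(vardf)[names(vardf) == "sample"] <- "filename"
  annotated <- merge(repdata, vardf, by = "filename")

  # Split the annotated variants by replicate label.
  r1 <- annotated[annotated$replicate == rep1, ]
  r2 <- annotated[annotated$replicate == rep2, ]

  # merge() performs an inner join by default, so variants seen in only
  # one replicate are dropped.
  merge(r1, r2, by = commoncols, suffixes = c(".rep1", ".rep2"))
}

vars <- data.frame(sample = c("m1", "m2", "m1", "m2", "m1"),
                   CHROM = c("PB1", "PB1", "PB2", "PB2", "NP"),
                   POS = c(234, 234, 240, 240, 254),
                   REF = c("G", "G", "A", "A", "C"),
                   ALT = c("A", "A", "G", "G", "T"),
                   minorfreq = c(0.010, 0.022, 0.043, 0.055, 0.011))
reps <- data.frame(filename = c("m1", "m2"),
                   replicate = c("rep1", "rep2"),
                   sample = c("a_2_iv", "a_2_iv"))

shared <- sketch_merge(vars, reps, "rep1", "rep2",
                       c("sample", "CHROM", "POS", "REF", "ALT"))

# Show the per-replicate minor allele frequencies side by side.
shared[, c("CHROM", "POS", "minorfreq.rep1", "minorfreq.rep2")]
```

Because the join is an inner join on the common columns, a variant must match on sample, CHROM, POS, REF, and ALT in both replicates to be retained.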


[Package vivaldi version 1.0.1 Index]