SampleID_vs_NumSequences {GaMaBioMD}    R Documentation

Plots the number of sequences and the probability of sequences per SampleID.

Description

This function takes a data frame with 'SampleID' and 'SequenceID' columns and creates two bar plots: the first shows the number of sequences per SampleID, and the second shows the probability of sequences per SampleID.

Usage

SampleID_vs_NumSequences(final_data)

Arguments

final_data

A data frame with 'SampleID' and 'SequenceID' columns, such as the 'final_data' element returned by preprocess_for_alignment.

Value

A list containing two ggplot2 plots: 'plot_num_sequences' (number of sequences per SampleID) and 'plot_prob' (probability of sequences per SampleID).

Examples


# Define the accession ranges for each sample
accession_ranges <- list(
  SRU1 = "AJ240966 to AJ240970",
  STU2 = "AB015240 to AB015245",
  WPU13 = "L11934 to L11939",
  INU20 = c("AF277467 to AF277470", "AF333080 to AF333085")
)

# Use the function to expand accession ranges
sam_acc <- expand_accession_ranges(accession_ranges)
print(sam_acc)

# Retrieve sequence information for the expanded accessions
accessions_to_query <- sam_acc$accession
seq_info <- get_sequence_information(accessions_to_query, remove_dot_1 = TRUE)
print(seq_info)

# Preprocess the sample and sequence information for alignment
result <- preprocess_for_alignment(sam_acc, seq_info)

# Access the resulting data frames
merged_data <- result$merged_data
main_data <- result$main_data
final_data <- result$final_data

# Create the two bar plots from the preprocessed data
plots <- SampleID_vs_NumSequences(final_data)

# Write the images to a temporary directory
output_directory <- tempdir()
# Set the file names for the TIFF images
tiff_file_num_sequences <- file.path(output_directory, "0. SampleID_vs_NumSequences.tiff")
tiff_file_prob <- file.path(output_directory, "0. SampleID_vs_Probability.tiff")

# Set the width, height, and DPI parameters
width_inch <- 8
height_inch <- 8
dpi <- 300

# Open the TIFF devices and save the plots
tiff(tiff_file_num_sequences, width = width_inch, height = height_inch, units = "in", res = dpi)
print(plots$plot_num_sequences)
dev.off()

tiff(tiff_file_prob, width = width_inch, height = height_inch, units = "in", res = dpi)
print(plots$plot_prob)
dev.off()
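
# Illustrative sketch only, not the package's implementation: the two
# quantities the plots display can be approximated by hand with base R.
# Assumptions: 'toy_data' is a hypothetical stand-in for final_data, and
# "probability" is taken to mean each SampleID's share of all sequences;
# the package may define it differently.
toy_data <- data.frame(
  SampleID = c("SRU1", "SRU1", "STU2", "WPU13", "WPU13", "WPU13"),
  SequenceID = c("AJ240966", "AJ240967", "AB015240",
                 "L11934", "L11935", "L11936"),
  stringsAsFactors = FALSE
)

# Number of sequences per SampleID
num_sequences <- table(toy_data$SampleID)

# Share of all sequences contributed by each SampleID (the shares sum to 1)
prob_sequences <- prop.table(num_sequences)

print(num_sequences)
print(prob_sequences)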


[Package GaMaBioMD version 0.2.0 Index]