preprocess_for_alignment {GaMaBioMD} | R Documentation
Preprocesses data for sequence alignment.
Description
This function merges sample and accession information with sequence information, filters out rows with missing sequences, and extracts relevant columns for the final data.
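The three steps described above (merge, filter, select) can be sketched in base R. This is a minimal illustration on toy data, not the package's actual implementation: the column names `sample` and `sequence`, the presence of `accession` in `seq_info`, and the choice of a left join are all assumptions made for the example.

```r
# Toy inputs. Only the `accession` column of `sam_acc` is taken from the
# documented example; every other column name here is hypothetical.
sam_acc <- data.frame(
  sample    = c("SRU1", "SRU1", "STU2"),
  accession = c("AJ240966", "AJ240967", "AB015240"),
  stringsAsFactors = FALSE
)
seq_info <- data.frame(
  accession = c("AJ240966", "AJ240967", "AB015240"),
  sequence  = c("ATGC", NA, "GGCA"),
  stringsAsFactors = FALSE
)

# Step 1: merge on the shared accession column.
# all.x = TRUE is an assumed left join that keeps every sample row.
merged_data <- merge(sam_acc, seq_info, by = "accession", all.x = TRUE)

# Step 2: drop rows whose sequence is missing.
main_data <- merged_data[!is.na(merged_data$sequence), ]

# Step 3: keep only the columns of interest (assumed selection).
final_data <- main_data[, c("sample", "accession", "sequence")]

# Bundle the intermediate and final results in a named list,
# mirroring the names listed under Value.
sketch <- list(
  merged_data = merged_data,
  main_data   = main_data,
  final_data  = final_data
)
```

Returning the intermediate data frames alongside the final one makes it easy to check how many rows were lost at the filtering step, for example by comparing `nrow(sketch$merged_data)` with `nrow(sketch$main_data)`.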
Usage
preprocess_for_alignment(sam_acc, seq_info)
Arguments
sam_acc | A data frame containing sample and accession information.
seq_info | A data frame containing sequence information.
Value
A list of three data frames named 'merged_data', 'main_data', and 'final_data'.
Examples
library(GaMaBioMD)

# 1. Define the accession ranges for each sample and expand them
accession_ranges <- list(
  SRU1 = "AJ240966 to AJ240970",
  STU2 = "AB015240 to AB015245",
  WPU13 = "L11934 to L11939",
  INU20 = c("AF277467 to AF277470", "AF333080 to AF333085")
)
sam_acc <- expand_accession_ranges(accession_ranges)
print(sam_acc)

# 2. Retrieve sequence information for the expanded accessions
accessions_to_query <- sam_acc$accession
seq_info <- get_sequence_information(accessions_to_query, remove_dot_1 = TRUE)
print(seq_info)

# 3. Merge, filter, and select columns for alignment
result <- preprocess_for_alignment(sam_acc, seq_info)

# Access the resulting data frames
merged_data <- result$merged_data
main_data <- result$main_data
final_data <- result$final_data
[Package GaMaBioMD version 0.2.0 Index]