msc.length {rKOMICS} | R Documentation |
Length of minicircles
Description
The msc.length function reads a single FASTA file of minicircle sequences and reports the length of each sequence, together with a histogram of the size distribution.
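To make "length of a minicircle sequence" concrete, the following base R sketch counts the characters of each record in a FASTA file. It is an illustration only and not the implementation used by msc.length; the helper name `fasta_lengths` and the toy records `mc1` and `mc2` are hypothetical.

```r
# Hypothetical helper, NOT part of rKOMICS: per-record sequence lengths
# from a FASTA file, using base R only.
fasta_lengths <- function(path) {
  lines <- trimws(readLines(path))
  lines <- lines[nzchar(lines)]                 # drop empty lines
  is_header <- startsWith(lines, ">")           # FASTA headers start with ">"
  # Assign every line to the record opened by the most recent header.
  record <- factor(cumsum(is_header), levels = seq_len(sum(is_header)))
  # Sum the characters of all sequence lines belonging to each record.
  lens <- tapply(nchar(lines[!is_header]), record[!is_header], sum)
  lens[is.na(lens)] <- 0                        # headers without sequence lines
  setNames(as.integer(lens), sub("^>", "", lines[is_header]))
}

# Toy input with made-up records; mc1 is wrapped over two lines.
tmp <- tempfile(fileext = ".fasta")
writeLines(c(">mc1", "ACGTAC", "GT", ">mc2", "ACG"), tmp)
lens <- fasta_lengths(tmp)
print(lens)
```

Summing over all lines of a record matters because FASTA sequences are often wrapped across several lines.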
Usage
msc.length(file, samples, groups)
Arguments
file |
the name of the FASTA file that contains all the minicircle sequences, e.g. "all.minicircles.circ.fasta". |
samples |
a character vector containing the sample names. |
groups |
a vector of the same length as samples, giving the group (e.g., subspecies) to which each sample belongs. |
Value
length |
a numeric vector containing the lengths of the minicircle sequences, one element per sequence. |
plot |
a histogram showing the frequency distribution of minicircle sequence lengths. |
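As a sketch of how the length component might be inspected, the snippet below counts sequences outside a size window. The object `res` is a stand-in with made-up values, not output from msc.length; the thresholds 800 and 1400 are the ones used in the Examples.

```r
# Stand-in for a result list; the lengths are invented for illustration.
res <- list(length = c(650, 812, 870, 905, 1010, 1450))

# sum() over a logical vector counts the TRUE values, so this gives the
# same count as length(which(res$length < 800)) when there are no NAs.
n_short <- sum(res$length < 800)
n_long  <- sum(res$length > 1400)

print(c(short = n_short, long = n_long))
print(summary(res$length))
```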
Examples
require(ggplot2)
require(ggpubr)
### run function
# bf: all minicircle sequences before filtering
bf <- msc.length(file = system.file("extdata", "all.minicircles.fasta", package = "rKOMICS"),
                 samples = exData$samples, groups = exData$subspecies)
# af: minicircle sequences after filtering
af <- msc.length(file = system.file("extdata", "all.minicircles.circ.fasta", package = "rKOMICS"),
                 samples = exData$samples, groups = exData$subspecies)
# number of unfiltered sequences shorter than 800 or longer than 1400
length(which(bf$length < 800))
length(which(bf$length > 1400))
### visualize results
hist(af$length, breaks = 50)
### alter plot: add captions and stack both histograms
ggarrange(bf$plot + labs(caption = "Before filtering"),
          af$plot + labs(caption = "After filtering"), nrow = 2)