oligoProfile {spgs} | R Documentation |
Oligo Profiles and Oligo Profile Correlation Plots of Nucleotide Sequences
Description
Construct a k-mer oligo profile of a nucleotide sequence and print such a profile or its reverse complement. There is also a plot function for producing plots of the profile or its reverse complement and for comparing primary and complementary strand profiles.
Usage
oligoProfile(x, k, content=c("dna", "rna"),
case=c("lower", "upper", "as is"), circular=TRUE, disambiguate=TRUE,
plot=TRUE, ...)
## S3 method for class 'OligoProfile'
plot(x, which=1L, units=c("percentage", "count", "proportion"),
main=NULL, xlab=NULL, ylab=NULL, ...)
## S3 method for class 'OligoProfile'
print(x, which=1L, units=c("percentage", "count", "proportion"),
digits=switch(units, percentage=3L, count=NULL, proportion=3L), ...)
Arguments
x |
a character vector or an object that can be coerced to a character vector. |
k |
the word length k of the k-mer profile to produce. |
content |
The content type (“dna” or “rna”) of the sequence. The default is “dna”. |
case |
determines how labels for the array should be generated: in lowercase, in uppercase or left as is, in which case labels such as “b” and “B” will be seen as distinct symbols and counted separately. |
circular |
Determines if the vector should be treated as circular or not. The default is TRUE. |
disambiguate |
if set to the default of TRUE, the sequence is made unambiguous before the profile is produced. |
plot |
should a plot of the profile be produced? The default is TRUE. |
which |
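For the plot method, selects what to plot: 1 for the oligo profile of x, 2 for the oligo profile of its reverse complement, or 3 for the interstrand k-mer correlation plot (see Details). For the print method, selects which profile to print: 1 for x or 2 for its reverse complement. The default is 1. |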
units |
The oligo profiles can be scaled according to three different units for presentation on plots: “percentage”, “count” and “proportion”. The default is “percentage”. |
main |
The title of the plot. See |
xlab |
a label for the x-axis of the plot. See |
ylab |
a label for the y-axis of the plot. See |
digits |
The number of significant digits to print. The default is 3 for percentages and proportions and NULL for counts. |
... |
arguments to be passed from or to other functions |
Details
This function returns the oligo profile for a sequence in an OligoProfile object, which is printed on screen if the plot parameter is FALSE.
An oligo profile is simply the counts of all k-mers in a sequence for some specified value of k.
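To make this definition concrete, here is a minimal base-R sketch of k-mer counting with optional circular wrap-around. It is not the spgs implementation: the helper name kmer_counts is hypothetical, and the sketch assumes an unambiguous DNA sequence over the symbols a, c, g and t.

```r
# Hypothetical helper, not part of spgs: count all k-mers in a DNA sequence.
kmer_counts <- function(s, k, circular = TRUE) {
  s <- tolower(unlist(strsplit(paste(s, collapse = ""), "")))
  stopifnot(k >= 1, k <= length(s))
  # For a circular sequence, append the first k-1 symbols to the end so that
  # k-mers spanning the end/start junction are counted too.
  if (circular && k > 1) s <- c(s, s[seq_len(k - 1)])
  starts <- seq_len(length(s) - k + 1)
  words <- vapply(starts,
                  function(i) paste(s[i:(i + k - 1)], collapse = ""),
                  character(1))
  # Enumerate all 4^k possible k-mers so that absent words get a zero count.
  grid <- expand.grid(rep(list(c("a", "c", "g", "t")), k),
                      stringsAsFactors = FALSE)
  all_words <- sort(apply(grid, 1, paste, collapse = ""))
  counts <- table(factor(words, levels = all_words))
  setNames(as.integer(counts), all_words)
}

counts <- kmer_counts("acgtac", 2)
counts[counts > 0]            # the 2-mers that actually occur
100 * counts / sum(counts)    # the same profile expressed as percentages
```

Dividing the counts by their total gives the "proportion" scaling, and multiplying by 100 gives the "percentage" scaling mentioned under the units argument.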
By default, oligoProfile produces a plot of the oligo profile expressed in terms of percentages. The plot argument determines if the plot should be generated or not and plotting parameters such as main, sub, etc., may be passed as arguments to the function when plot is TRUE.
The plot method, either called directly or indirectly via the oligoProfile function, can produce either the oligo profile of x (which = 1), the oligo profile of its reverse complement (which = 2), or an interstrand k-mer correlation plot comparing the k-oligo profile of x with that of its reverse complement (which = 3). Such correlation plots effectively show the relationship between k-mers on the primary and complementary strands in a DNA duplex and can be used to assess compliance with Chargaff's second parity rule (CSPR). More precisely, one would conclude that a genomic sequence complies with CSPR if all the plotted points lie on a diagonal line running from the bottom-left corner to the top-right corner of the graph.
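The idea behind the interstrand correlation plot can be sketched in base R as follows. This is an illustration only, not the code used by the plot method: the helpers rev_comp and kmer_words are hypothetical, the sequence is treated as circular, and the toy sequence is randomly generated rather than genomic.

```r
# Hypothetical helpers, not part of spgs.
rev_comp <- function(s) {
  # Complement each base, then reverse the order of the symbols.
  comp <- chartr("acgt", "tgca", tolower(s))
  paste(rev(unlist(strsplit(comp, ""))), collapse = "")
}

kmer_words <- function(s, k) {
  # Circular k-mers: wrap the first k-1 symbols around to the end.
  s <- paste0(s, substr(s, 1, k - 1))
  starts <- seq_len(nchar(s) - k + 1)
  substring(s, starts, starts + k - 1)
}

set.seed(1)  # arbitrary seed so the toy sequence is reproducible
x <- paste(sample(c("a", "c", "g", "t"), 2000, replace = TRUE), collapse = "")

pw <- kmer_words(x, 2)            # 2-mers on the primary strand
cw <- kmer_words(rev_comp(x), 2)  # 2-mers on the complementary strand
lev <- sort(union(pw, cw))        # common set of words for both profiles

p <- as.integer(table(factor(pw, levels = lev)))
q <- as.integer(table(factor(cw, levels = lev)))

# One point per k-mer: primary-strand count against complementary-strand count.
plot(p, q, xlab = "primary strand count", ylab = "complementary strand count")
abline(0, 1, lty = 2)  # points close to this diagonal suggest CSPR compliance
cor(p, q)
```

For a circular sequence, the count of a word on the complementary strand equals the count of its reverse complement on the primary strand, so the plot in effect compares each k-mer with its reverse complement.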
Value
A list with class “OligoProfile” containing the following components:
name |
a name to identify the source of the profile. |
wordLength |
the value of k used to derive the k-mer profile. |
content |
indicates if the profile pertains to a DNA or RNA sequence. |
case |
indicates how the case of letters was processed before producing the profile. |
circular |
indicates whether or not the sequence was considered circular for the purpose of producing the profile. |
disambiguate |
indicates if the sequence was made unambiguous before producing the profile. |
profile |
a vector containing the raw counts (frequencies) of all k-mers. |
Author(s)
Andrew Hart and Servet Martínez
References
Albrecht-Buehler, G. (2006) Asymptotically increasing compliance of genomes with Chargaff's second parity rules through inversions and inverted transpositions. PNAS 103(47), 17828–17833.
See Also
pair.counts, triple.counts, quadruple.counts, cylinder.counts, array2vector, table2vector, disambiguate
Examples
data(nanoarchaeum)
#Get the 3-oligo profile of Nanoarchaeum without plotting it
nano.prof <- oligoProfile(nanoarchaeum, 3, plot=FALSE)
nano.prof #print oligo profile as percentages
print(nano.prof, units="count") #print oligo profile as counts
plot(nano.prof) #oligo profile plotted as percentages
plot(nano.prof, units="count") #plot it as counts
#plot the 2-oligo profile of Nanoarchaeum as proportions
oligoProfile(nanoarchaeum, k=2, units="proportion")