plotHit {hoardeR} | R Documentation |
Visualization of a cross-species hit
Description
For each cross-species hit the function plots the similarity within that area together with an optional annotation and coverage track.
Usage
plotHit(hits, flanking=1, window=NULL, annot=TRUE, coverage=FALSE,
smoothPara=NULL, diagonal=0.25, verbose=TRUE, output=FALSE,
hitSpecies=NULL, hitSpeciesAssembly=NULL, origSpecies=NULL,
origSpeciesAssembly=NULL, fastaFolder=NULL, origAnnot=NULL,
hitAnnot=NULL, nTick=5, which=NULL, figureFolder=NULL,
figurePrefix=NULL, indexOffset=0, bamFolder=NULL, bamFiles=NULL,
groupIndex=NULL, groupColor=NULL, countWindow=NULL)
Arguments
hits |
The hit object to be plotted. |
flanking |
Allowed flanking site in Mb. |
window |
Moving window size of similarity measure. |
annot |
Logical, add annotation track |
coverage |
Logical, add coverage track |
smoothPara |
Smoothing parameter for coverage |
diagonal |
Threshold for allowed diagonal similarity |
verbose |
Logical, shall the function give status updates |
output |
Logical, shall numerical results be given |
hitSpecies |
Scientific identifier of the hit species. |
hitSpeciesAssembly |
Version of the hit species assembly |
origSpecies |
Scientific name of the original species |
origSpeciesAssembly |
Version of the original species |
fastaFolder |
Location of the fasta files |
origAnnot |
Annotation object of the original species |
hitAnnot |
Annotation object of the hit species |
nTick |
Number of ticks on the annotation track |
which |
Which hits should be plotted |
figureFolder |
Folder where Figures should be stored |
figurePrefix |
Prefix of the figure filenames |
indexOffset |
Offset of the running index of the filenames |
bamFolder |
Folder with the bam-files |
bamFiles |
Filenames of the bam-files |
groupIndex |
Index of subgroups in the bamfiles |
groupColor |
Vector with colors, one for each subgroup |
countWindow |
Window size to count the reads from bam-files. |
Details
This function is the workhorse of hoardeR and visualizes the findings of the blast and intersection runs. It is really flexibel to handle the hits and
hence there are many different options. The required options are hits
, hitSpecies
, origSpecies
and fastaFolder
.
The hit object is an object as provided by intersectXMLAnnot
and contains all intersections of interest (=intersections that are in close
proximity of a gene in the hit species). Naturally the hit and the original species have to be specified as well as the folder, where the required fasta
files are stored, or to where they should be downloaded. If the species are the default species from Ensembl (as can be seen in the data.frame
species
), the annotation and assembly will be automatically downloaded to the specified location on the harddrive. Changes from that
version can be adjusted with the the hitSpeciesAssembly
and origSpeciesAssembly
options, but the filenames have still to match the convention, as they
are provided by NCBI.
If in addition to the similarity also a coverage track should be added, the option coverage
has to be set to TRUE
. The option
smoothPara
sets then the level of smoothing of the coverage. By default no smoothing will be applied.
In case an annotation track is requested (annot=TRUE
), the annotation objects need to be provided to the origAnnot
and hitAnnot
options.
The option diagonal
defines the minimum level of similarity so that a (diagonal) match will be plotted. The colors are then towards green for
total similarity and towards red for total disagree, based on a nucleotide mismatch matrix.
If the option verbose=TRUE
is set, the function gives a verbose output while running. Further, if output=TRUE
then, in addition to the
figure also a data.frame with the numerical results is provided.
In case that hits
contains more than one hit, the plotHit
function plots for each hit a figure. In that case a folder should be
provided to where the figures should be stored, this can be done with the figureFolder
and figurePrefix
options. In case only
asserted hits of hits
shall be plotted, they can be selected with the which
option.
The function can also plot a coverage track over the similarity. For that, the option coverage=TRUE
has to be set and a folder that
contains the necessary bam-files has to be specified in bamFolder
. By default all bam files in that folder are used, if only a subset
is requested, the filenames can be specified in bamFiles
. In case several bam-files are given, the average coverage at each loci is used.
Further, if the data contains subgroups (e.g. case/control), the vector groupIndex
gives the group labels. Naturally its length should be
similar to bamFiles
(or similar to the total amount of files in the bam-folder). In case that more than one group is plotted in the
coverage track, their colors can be defined in groupColor
. Of course, this vector has to be as long as the number of groups are defined.
The option countWindow
controls the moving window length in which the number of counts is calculated. The default is the same length as the
hit.
Value
Optional, a table with intersection loci.
Author(s)
Daniel Fischer
Examples
## Not run:
pigInter.flank <- list()
for(i in 1:nrow(pigHits)){
pigInter.flank[[i]] <- intersectXMLAnnot(pigHits[i,], ssannot, flanking=100)
}
# Basic usage:
plotHit(hits=pigInter.flank,
flanking=100,
hitSpecies = "Sus scrofa",
origSpecies = "Bos taurus",
fastaFolder = "/home/user/fasta/",
figureFolder = "/home/user/figures/")
# Annotation tracks added:
plotHit(hits=pigInter.flank,
flanking=100,
hitSpecies = "Sus scrofa",
origSpecies = "Bos taurus",
fastaFolder = "/home/user/fasta/",
figureFolder = "/home/user/figures/",
origAnnot=btannot,
hitAnnot=ssannot)
# Annotation and coverage added:
plotHit(hits=pigInter.flank,
flanking=100,
hitSpecies = "Sus scrofa",
origSpecies = "Bos taurus",
fastaFolder = "/home/daniel/fasta/",
figureFolder = "/home/user/figures/",
origAnnot=btannot,
hitAnnot=ssannot
coverage=TRUE,
bamFolder = "/home/users/bams/")
## End(Not run)