R: Variants annotation based on regions and subregions positions

set.genomic.region.subregion {Ravages}

R Documentation

Variants annotation based on regions and subregions positions

Description

Assigns a genomic region and a subregion to each variant of a bed matrix, based on the positions given in `regions` and `subregions`.

Usage

set.genomic.region.subregion(x, regions, subregions, split = TRUE)

Arguments

`x`	A bed.matrix
`regions`	A data frame in BED format (start is 0-based, end is 1-based) describing the regions, with the fields `Chr` (the chromosome of the gene), `Start` (the start position of the gene, 0-based), `End` (the end position of the gene, 1-based), and `Name` (the name of the gene, a factor)
`subregions`	A data frame containing the subregions, in the same format as `regions`
`split`	Whether to split variants attributed to multiple regions by duplicating these variants; TRUE by default
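As an illustration of the expected input, a `regions` table could be built by hand as in the sketch below. The gene names and coordinates are hypothetical and only show the layout of the four fields.

```r
# Hypothetical regions in BED format: Start is 0-based, End is 1-based.
# Names and coordinates are made up for illustration only.
regions <- data.frame(
  Chr   = c(2L, 2L, 2L),
  Start = c(1000L, 5000L, 9000L),
  End   = c(3000L, 8000L, 12000L),
  Name  = c("GENE_A", "GENE_B", "GENE_C"),
  stringsAsFactors = FALSE
)

# Name must be a factor; its levels follow the row order,
# which is already the genome order in this toy table.
regions$Name <- factor(regions$Name, levels = regions$Name)

# With a 0-based start and a 1-based end, End - Start is the region length.
region_length <- regions$End - regions$Start
```

A `subregions` table would have the same four columns.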

Details

Warning: regions$Name and subregions$Name should be factors containing UNIQUE names of the regions and subregions, ORDERED in genome order.
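One possible way to satisfy these requirements is sketched below on a hypothetical, unsorted table: check that the names are unique, sort the rows by chromosome and start position, then set the factor levels in that row order. Note that calling factor() without `levels` would sort the levels alphabetically, which is generally not the genome order.

```r
# Hypothetical, unsorted regions table (made-up names and coordinates)
regions <- data.frame(
  Chr   = c(2L, 1L, 2L),
  Start = c(5000L, 200L, 1000L),
  End   = c(8000L, 900L, 3000L),
  Name  = c("GENE_B", "GENE_Z", "GENE_A"),
  stringsAsFactors = FALSE
)

# Names must be unique before they can be used as factor levels
stopifnot(!anyDuplicated(regions$Name))

# Sort rows in genome order: by chromosome, then by start position
regions <- regions[order(regions$Chr, regions$Start), ]
rownames(regions) <- NULL

# Set the levels explicitly so that they follow the sorted row order
regions$Name <- factor(regions$Name, levels = regions$Name)
levels(regions$Name)
```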

If x@snps$chr is not a vector of integers, it should be a factor with same levels as regions$Chr.
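A minimal sketch of aligning the chromosome coding, using plain vectors as hypothetical stand-ins for x@snps$chr and regions$Chr:

```r
# Hypothetical stand-ins: variant chromosomes stored as character strings,
# and the Chr column of a regions table stored as a factor
snp_chr     <- c("chr2", "chr2", "chr1")
regions_chr <- factor(c("chr1", "chr2", "chr2"))

# Recode the variant chromosomes with the levels used in the regions table
snp_chr <- factor(snp_chr, levels = levels(regions_chr))

# Both factors now share the same levels; an NA here would flag a
# chromosome label that is absent from the regions table
same_levels <- identical(levels(snp_chr), levels(regions_chr))
has_unknown <- anyNA(snp_chr)
```

With a real bed matrix, the same recoding would presumably be applied to x@snps$chr directly.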

If split = TRUE, a variant attributed to multiple genomic regions is duplicated in the bed matrix, with one row per genomic region.
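The duplication can be pictured with the base R sketch below. It is not the package's implementation: the tables are hypothetical, and variants falling outside every region are simply dropped here, which may differ from what the function does. A variant at 1-based position pos is taken to lie in a region when Start < pos <= End, following the BED convention.

```r
# Hypothetical overlapping regions (BED format: Start 0-based, End 1-based)
regions <- data.frame(
  Chr   = c(2L, 2L),
  Start = c(1000L, 2500L),
  End   = c(3000L, 4000L),
  Name  = factor(c("GENE_A", "GENE_B"), levels = c("GENE_A", "GENE_B"))
)

# Hypothetical variants with 1-based positions
snps <- data.frame(
  id  = c("v1", "v2", "v3"),
  chr = 2L,
  pos = c(1500L, 2800L, 5000L)
)

# Pair every variant with every region on the same chromosome
pairs <- merge(snps, regions, by.x = "chr", by.y = "Chr")

# Keep the pairs where the variant falls inside the region
inside <- pairs$pos > pairs$Start & pairs$pos <= pairs$End
hits <- pairs[inside, c("id", "chr", "pos", "Name")]
hits <- hits[order(hits$pos, hits$Name), ]
hits
```

In this toy example, v2 lies in both regions and therefore appears on two rows, one per region.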

This function can be applied before using burden.subscores to perform functionally-informed burden tests with sub-scores for each SubRegion within each genomic.region.

Value

The same bed matrix as x with two additional columns: x@snps$genomic.region containing the annotation of the regions and x@snps$SubRegion containing the annotation of the subregions.

Examples

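#Load the package
library(Ravages)

#Import 1000 Genomes data from the region around the LCT gene
x <- as.bed.matrix(LCT.gen, LCT.fam, LCT.bim)

#Group variants within known genes and
#within coding and regulatory regions
x <- set.genomic.region.subregion(x,
 regions = genes.b37, subregions = subregions.LCT)

#Inspect the two annotation columns added to x@snps
head(x@snps[, c("chr", "pos", "genomic.region", "SubRegion")])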

[Package Ravages version 1.1.3 Index]