chargaff.gibbs.test {spgs} | R Documentation |
Test of CSPR for Dinucleotides Under Gibbs Distribution
Description
Performs a test of Chargaff's second parity rule (CSPR) for dinucleotides under a Gibbsian assumption on the DNA sequence, which was proposed in Hart and Martínez (2012).
Usage
chargaff.gibbs.test(x, maxLag=200)
Arguments
x |
either a character vector representing a DNA sequence in which each element contains a single nucleotide, or a DNA sequence stored using the SeqFastadna class from the seqinr package. |
maxLag |
The maximum number of lags (cylinder lengths) to use in computing variances. The default value is 200. |
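As an illustration of the first input form for x, a DNA sequence held as a single string can be split into a character vector with one nucleotide per element. This is a minimal sketch in base R; the string dna.string is made up for illustration, and lower-case letters are used only because the Examples section does so.

```r
# Hypothetical DNA sequence stored as one string (illustration only)
dna.string <- "acgtacgtaacc"

# Split into a character vector with one nucleotide per element
x <- strsplit(dna.string, "")[[1]]

# Every element should now hold exactly one character
all(nchar(x) == 1)
```

A vector built this way has the shape described for x; whether upper-case letters are also accepted is not stated here.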
Details
This function performs a test of Chargaff's second parity rule for dinucleotides,
assuming the DNA sequence was generated by a Gibbs distribution. Under the null
hypothesis, the test statistic \eta is asymptotically \chi^2 on 5 degrees of freedom.
The test is set up as follows:
H_0: the sequence complies with CSPR for dinucleotides
H_1: the sequence does not comply with CSPR for dinucleotides
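Since the statistic is asymptotically \chi^2 on 5 degrees of freedom under H_0, a p-value can be read from the upper tail of that distribution. The following is a minimal sketch in base R, assuming an upper-tail test; the value of eta is made up for illustration and is not output from chargaff.gibbs.test.

```r
# Hypothetical value of the test statistic (illustration only)
eta <- 12.3

# Upper-tail probability of a chi-squared distribution on 5 degrees of freedom
p.value <- pchisq(eta, df = 5, lower.tail = FALSE)

# Critical value at the 5% significance level
crit <- qchisq(0.95, df = 5)

# Reject H_0 at the 5% level when the statistic exceeds the critical value
reject <- eta > crit
```

Comparing eta with crit and comparing p.value with 0.05 lead to the same decision.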
Value
A list with class "htest" containing the following components:
statistic |
the value of the test statistic. |
p.value |
the p-value of the test. |
method |
a character string indicating what type of test was performed. |
data.name |
a character string giving the name of the data. |
FHat |
the 5-element vector |
pairs |
the stochastic matrix of dinucleotide counts used to derive |
v |
The asymptotic covariance matrix of |
n |
the length of the DNA sequence. |
cutoff |
the actual number of lags used by the algorithm to calculate covariances. |
maxCutoff |
the value specified for the maxLag parameter when the test was performed. |
Author(s)
Andrew Hart and Servet Martínez
References
Hart, A.G. and Martínez, S. (2012) A Gibbs approach to Chargaff's second parity rule. J. Stat. Phys. 146(2), 408-422.
See Also
chargaff0.test, chargaff1.test, chargaff2.test, agct.test, ag.test
Examples
# Demonstration on a real archaeal genome sequence
data(nanoarchaeum)
chargaff.gibbs.test(nanoarchaeum)
# Simulate a synthetic DNA sequence that does not satisfy Chargaff's second parity rule
trans.mat <- matrix(c(.4, .1, .4, .1,
                      .2, .1, .6, .1,
                      .4, .1, .3, .2,
                      .1, .2, .4, .3),
                    ncol=4, byrow=TRUE)
seq <- simulateMarkovChain(500000, trans.mat, states=c("a", "c", "g", "t"))
chargaff.gibbs.test(seq)