shape.statistic {apTreeshape} | R Documentation |
shape.statistic
computes the logarithm of the ratio of the likelihoods under the Yule model and the PDA model of the given tree.
shape.statistic(tree, norm=NULL)
tree |
An object of class |
norm |
A character string equals to |
The log of the likelihood ratio is proportional to
sum(log(N(v)-1)),
for all internal node v (where N(v) is the number of internal nodes descending from the node v ). The ratio of the likelihoods enables to build the most powerful test of the Yule model against the PDA one. (Neyman-Pearson lemma).
Under the PDA model, the log ratio has approximate Gaussian distribution of mean ~ 2.03*n-3.545*sqrt(n-1) and variance ~ 2.45*(n-1)*log(n-1), where n is the number of tips of the tree. The Gaussian approximation is accurate for very large n (n greater than 10000(?)). The normalization of the ratio uses tabulated empirical values of variances estimated from Monte Carlo simulations. The normalization uses the formula:
variance ~ 1.570*n*log(n)-5.674*n+3.602*sqrt(n)+14.915
Under the Yule model, the log ratio has approximate Gaussian distribution of mean = 1.204*n-log(n-1)-2 and variance = 0.168*n-0.710, where n is the number of tips of the tree. The Gaussian approximation is accurate for n greater than 20.
An object of class numeric
containing the shape statistic of the tree.
Michael Blum <michael.blum@imag.fr>
Nicolas Bortolussi <nicolas.bortolussi@imag.fr>
Eric Durand <eric.durand@imag.fr>
Olivier Fran?ois <olivier.francois@imag.fr>
Fill, J. A. (1996), On the Distribution of Binary Search Trees under the Random Permutation Model. Random Structures and Algorithms, 8, 1 – 25, for more details about the normalization and proofs.
data(universal.treeshape) tree <- universal.treeshape plot(tree) summary(tree) likelihood.test(tree, model = "yule", alternative = "two.sided") likelihood.test(tree, model = "pda", alternative = "two.sided") ## Histogram of shape statistics for a list of Yule trees ## (may take some time to compute) main="Histogram of shape statistics"; xlab="shape statistic" hist(sapply(rtreeshape(100,tip.number=50,model="yule"),FUN=shape.statistic, norm="yule"), freq=FALSE, main=main, xlab=xlab) #Increase the numner of trees >100 for a better fit ## Does it fit the Gaussian distribution with mean=0 and sd=1 ? x<-seq(-3,3,by=0.001) lines(x,dnorm(x))