sbfit {scaleboot} | R Documentation |
Fitting Models to Bootstrap Probabilities
Description
sbfit
is used to fit parametric models to multiscale bootstrap
probabilities by the maximum likelihood method.
Usage
sbfit(x, ...)
## Default S3 method:
sbfit(x,nb,sa,models=NULL,nofit=FALSE,bpm=NULL,sam=NULL,...)
## S3 method for class 'matrix'
sbfit(x,nb,sa,models=NULL,names.hp=rownames(x),
bpms=NULL,sam=NULL,nofit=FALSE,cluster=NULL,...)
## S3 method for class 'data.frame'
sbfit(x,...)
## S3 method for class 'scaleboot'
sbfit(x,models=names(x$fi),...)
## S3 method for class 'scalebootv'
sbfit(x,models=attr(x,"models"),...)
## S3 method for class 'scaleboot'
print(x,sort.by=c("aic","none"),...)
## S3 method for class 'scalebootv'
print(x,...)
Arguments
x |
an object used to select a method. For |
nb |
vector of numbers of bootstrap replicates. A short vector
(or scalar) is cyclically extended to match the size of |
sa |
vector of scales in sigma squared ( |
models |
character vector of model names. Valid model names are
|
nofit |
logical. If TRUE, fitting is not performed. |
bpm |
(experimental: bootstrap probabilities for 2-step bootstrap) |
sam |
(experimental: scales for 2-step bootstrap) |
bpms |
(experimental: bootstrap probabilities for 2-step bootstrap) |
names.hp |
character vector of hypotheses names. |
cluster |
parallel cluster object which may be generated by
function |
sort.by |
sort key. |
... |
further arguments passed to and from other methods. |
Details
sbfit.default
fits parametric models to bp
by maximizing the log-likelihood value of a binomial model.
A set of multiscale bootstrap resampling
should be performed before a call to sbfit
for preparing
bp
, where bp[i]
is a bootstrap probability of a
hypothesis calculated with a number of bootstrap
replicates nb[i]
and a scale \sigma^2
=sa[i]
.
The scale is defined as \sigma^2=n/n'
,
where n
is the sample size of data, and n'
is the sample
size of replicated data for bootstrap resampling.
Each model specifies a psi(beta,s)
=\psi(\sigma^2 | \beta)
function with a parameter vector \beta
. The model
may describe how the bootstrap probability changes along the scale.
Let cnt[i]=bp[i]*nb[i]
be the frequency indicating how many
times the hypothesis of interest is observed in bootstrap replicates
at scale sa[i]
. Then we assume that cnt[i]
is binomially
distributed with number of trials nb[i]
and success
probability 1-pnorm(psi(beta,s=sa[i])/sqrt(sa[i]))
. Currently,
sbpsi.poly
and sbpsi.sing
are available as \psi
functions. The estimated model parameters are accessed by the
coef.scaleboot
method.
The model fitting is performed in the order
specified in models
, and the initial values for numerical
optimization of the likelihood function are prepared by using
previously estimated model parameters. Thus, "poly.(m-1)"
should be specified before "poly.m", and "poly.(m-1)" and "sing.(m-1)"
should be specified before "sing.m".
sbfit.matrix
calls sbfit.default
repeatedly, once for each row
vector bp
of the matrix bps
. Parallel
computing is performed when cluster
is non NULL.
sbfit.scaleboot
calls sbfit.default
with bp
,
nb
, and sa
components in x
object for refitting by
giving another models
argument. It discards the previous result
of fitting, and recomputes the model parameters.
sbfit.scalebootv
calls sbfit.matrix
with the bps
,
nb
, and sa
components in the attributes of x
.
Value
sbfit.default
and sbfit.scaleboot
return an object of
class "scaleboot"
, and sbfit.matrix
and
sbfit.scalebootv
return an object of
class "scalebootv"
.
An object of class "scaleboot"
is a list containing at least the
following components:
bp |
the vector of bootstrap probabilities used. |
nb |
the |
sa |
the |
fi |
list vector of fitted results for |
An object of class "scalebootv"
is a vector of
"scaleboot"
objects, and in addition, it has attributes
"models"
, "bps"
, "nb"
, and "sa"
.
Author(s)
Hidetoshi Shimodaira <shimo@i.kyoto-u.ac.jp>
References
Shimodaira, H. (2002). An approximately unbiased test of phylogenetic tree selection, Systematic Biology, 51, 492-508.
Shimodaira, H. (2004). Approximately unbiased tests of regions using multistep-multiscale bootstrap resampling, Annals of Statistics, 32, 2616-2641.
Shimodaira, H. (2008). Testing Regions with Nonsmooth Boundaries via Multiscale Bootstrap, Journal of Statistical Planning and Inference, 138, 1227-1241. (http://dx.doi.org/10.1016/j.jspi.2007.04.001).
See Also
sbpsi
, summary.scaleboot
,
plot.scaleboot
, coef.scaleboot
,
sbaic
.
Examples
## Testing a hypothesis
## Examples of fitting models to a vector of bp's
## mam15.relltest$t4 of data(mam15), but
## using a different set of scales (sigma^2 values).
## In the below, sigma^2 ranges 0.01 to 100 in sa[i]
## This very large range is only for illustration.
## Typically, the range around 0.1 to 10
## is recommended for much better model fitting.
## In other examples, we have used
## sa = 9^seq(-1,1,length=13).
cnt <- c(0,0,0,0,6,220,1464,3565,5430,6477,6754,
6687,5961) # observed frequencies at scales
nb <- 100000 # number of replicates at each scale
bp <- cnt/nb # bootstrap probabilities (bp's)
sa <- 10^seq(-2,2,length=13) # scales (sigma squared)
## model fitting to bp's
f <- sbfit(bp,nb,sa) # model fitting ("scaleboot" object)
f # print the result of fitting
plot(f,legend="topleft") # observed bp's and fitted curves
## approximately unbiased p-values
summary(f) # calculate and print p-values
## refitting with models up to "poly.4" and "sing.4"
f <- sbfit(f,models=1:4)
f # print the result of fitting
plot(f,legend="topleft") # observed bp's and fitted curves
summary(f) # calculate and print p-values
## Not run:
## Testing multiple hypotheses (only two here)
## Examples of fitting models to vectors of bp's
## mam15.relltest[c("t1,t2")]
cnt1 <- c(85831,81087,76823,72706,67946,62685,57576,51682,
45887,41028,35538,31232,27832) # cnt for "t1"
cnt2 <- c(2,13,100,376,975,2145,3682,5337,7219,8559,
10069,10910,11455) # cnt for "t2"
cnts <- rbind(cnt1,cnt2)
nb <- 100000 # number of replicates at each scale
bps <- cnts/nb # row vectors are bp's
sa <- 9^seq(-1,1,length=13) # scales (sigma squared)
fv <- sbfit(bps,nb,sa) # returns a "scalebootv" object
fv # print the result of fitting
plot(fv) # multiple plots
summary(fv) # calculate and print p-values
## End(Not run)