scaleboot {scaleboot} | R Documentation |
Multiscale Bootstrap Resampling
Description
Performs multiscale bootstrap resampling for a specified statistic.
Usage
scaleboot(dat,nb,sa,fun,parm=NULL,count=TRUE,weight=TRUE,
cluster=NULL,onlyboot=FALSE,seed=NULL,...)
countw.assmax(x,w,ass)
countw.shtest(x,w,obs)
countw.shtestass(x,w,assobs)
Arguments
dat |
data matrix or data-frame. Row vectors are to be resampled. |
nb |
vector of the numbers of bootstrap replicates. |
sa |
vector of scales in sigma squared ( |
fun |
function for a statistic. |
parm |
parameter to be passed to |
count |
logical. Should only the accumulative counts be returned? Otherwise, raw statistic vectors are returned. |
weight |
logical. In |
cluster |
parallel cluster object which may be generated by
function |
onlyboot |
logical. Should only bootstrap resampling be
performed? Otherwise, |
seed |
If non NULL, random seed is set. Specifying a seed is
particularly important when |
... |
further arguments passed to and from other methods. |
x |
data matrix or data-frame passed from |
w |
weight vector for resampling. |
ass |
a list of association vectors. An example of |
obs |
a vector of observed test statistics. An example of |
assobs |
a list of ass and obs above. An example of |
Details
These functions are used internally by relltest
for computing raw bootstrap probabilities of phylogenetic inference.
Alternatively, we used pvclust
to get raw bootstrap probabilities
of hierarchical clustering. In other cases, users may utilize
scaleboot
function or prepare their own functions.
scaleboot
performs multiscale bootstrap resampling for a
statistic defined by fun
, which should be one of the two
possible forms fun(x,w,parm)
and fun(x,i,parm)
. The former
is used when weight=TRUE
, and the weight
vector w
is generated by a multinomial distribution. The latter
is used when weight=FALSE
, and the index
vector i
is generated by resampling n'
elements from
\{1,...,n\}
. When count=TRUE
, fun
should return
a logical, or a vector of logicals.
Examples of fun(x,w,parm)
are countw.assmax
for AU p-values,
countw.shtest
for SH-test of trees, and countw.shtestass
for SH-test of both trees and edges. The definitions are given below.
countw.assmax <- function(x,w,ass) { y <- maxdif(wsumrow(x,w)) <= 0 # countw.max if(is.null(ass)) y else { z <- vector("logical",length(ass)) for(i in seq(along=ass)) z[i] <- any(y[ass[[i]]]) z } } countw.shtest <- function(x,w,obs) maxdif(wsumrow(x,w)) >= obs countw.shtestass <- function(x,w,assobs) unlist(assmaxdif(wsumrow(x,w),assobs$ass)) >= assobs$obs ### weighted sum of row vectors ## ## x = matrix (array of row vectors) ## w = weight vector (for rows) ## wsumrow <- function(x,w) { apply(w*x,2,sum)*nrow(x)/sum(w) } ### calc max diff ## ## y[i] := max_{j neq i} x[j] - x[i] ## maxdif <- function(x) { i1 <- which.max(x) # the largest element x <- -x + x[i1] x[i1] <- -min(x[-i1]) # the second largest value x } ### calc assmaxdif ## ## y[[i]][j] := max_{k neq ass[[i]]} x[k] - x[ass[[i]][j]] ## assmaxdif <- function(x,a) { y <- vector("list",length(a)) names(y) <- names(a) for(i in seq(along=a)) y[[i]] <- max(x[-a[[i]]]) - x[a[[i]]] y }
When count=TRUE
, the summation of outputs from fun
is
calculated. This gives the frequencies for how many times the
hypotheses are supported by the bootstrap replicates.
Value
If onlyboot=TRUE
, then a list of raw results from the multiscale bootstrap
resampling is returned. The components are "stat" for list vectors of
outputs from fun
(only when count=FALSE
), "bps" for a
matrix of multiscale bootstrap probabilities (only when
count=FALSE
), "nb" for the number of bootstrap replicates used,
and "sa" for the scales used. Note that scales are redefined by
sa <- nsize/round(nsize/sa)
, where nsize
is the sample size.
If onlyboot=FALSE
, then the result of a call to
sbfit
is returned when count=TRUE
, otherwise
the result of sbconf
is returned when count=FALSE
.
Author(s)
Hidetoshi Shimodaira
See Also
Examples
## Not run:
## An example to calculate AU p-values for phylogenetic trees
## See also the Examples of "sbconf"
data(mam15) # load mam15.mt
sa <- 9^seq(-1,1,length=13) # parameter for multiscale bootstrap
nb <- 1000 # nb=10000 is better but slower
# Now compute bootstrap probabilities and fit models to them
sim <- scaleboot(mam15.mt,nb,sa,countw.assmax) # takes some time (< 1 min)
sim # show bootstrap probabilities and model fitting
summary(sim) # show AU p-vaslues
## End(Not run)
## Not run:
## The following lines are only for illustration purpose
## a line from the definition of relltest
scaleboot(dat,nb,sa,countw.assmax,ass,cluster=cluster,
names.hp=na,nofit=nofit,models=models,seed=seed)
## two lines from rell.shtest (internal function)
scaleboot(z,nb,1,countw.shtest,tobs,cluster=cluster,
onlyboot=TRUE,seed=seed)
scaleboot(z,nb,1,countw.shtestass,pa,cluster=cluster,
onlyboot=TRUE,seed=seed)
## End(Not run)