stb.default {STB}R Documentation

Simultaneous Tolerance Bands (STB).

Description

Compute And/Or Plot Simultaneous Tolerance Bands for numeric vectors.

Usage

## Default S3 method:
stb(
  obj,
  N = 10000L,
  alpha = 0.05,
  rand.func = rnorm,
  tol = 1e-04,
  max.iter = 100L,
  algo = c("rank", "C", "R"),
  Ncpu = 1,
  q.type = 2L,
  stb.col = "#0000FF40",
  col.points = "black",
  col.out = "red",
  col.pwb = "#0000FF40",
  main = NULL,
  add.pwb = FALSE,
  quiet = FALSE,
  add = FALSE,
  plot = TRUE,
  legend = FALSE,
  timer = FALSE,
  pch = 16,
  pch.out = 16,
  seed = NULL,
  ...
)

Arguments

obj

(numeric) vector, which is supposed to be N(my, sigma^2)-distributed

N

(integer) value specifying the number of random samples to be used for constructing the STB

alpha

(numeric) value specifying the simultaneous tolerance level, i.e. 100(1-alpha)% of all 'N' random samples have to be completely enclosed by the bounds of the STB

rand.func

(function) a function which generates random samples, e.g. rand.func=rnorm which corresponds to random sampling from the standard normal distribution. Another example is defining func=function(n)rchisq(n=n, df=3, ncp=2) and using rand.func=func. See examples for further examples.

tol

(numeric) value specifying the max. acceptable deviation from 'alpha' used in the bisection algorithm

max.iter

(integer) value specifying the max. number of iteration for finding the bounds of the STB

algo

(character) (string) specifying the method to be used for constructing a 100(1-alpha)% STB, choose "rank" for the rank-based, "C" for a C-implementation of the quantile-based, and "R" for an R-implentation of the quantile-based algorithm (see details). "C" uses SAS PCTLDEF5 definition of quantiles, whereas "R" can use any of the built-in R types of quantiles (see quantile.

Ncpu

(integer) specifying the number cores/CPUs to be used, for N>1 multi-processing is applied

q.type

(integer) the quantile-type used if algo="R", see ? quantile for details.

stb.col

(character) string, a valid specification of a color to be used for the STB

col.points

(character) color for the points in the QQ-plot

col.out

(character) color for points outsied of the 100(1-alpha)% STB

col.pwb

(character) color for the point-wise STB (not adjusted for multiplicity), defaults to "#0000FF40" which is "blue" with 80% transparency

main

(characer) string for a main title appearing over the plot

add.pwb

(logical) should the point-wise tolerance band be plotted for comparison?

quiet

(logical) TRUE = no additional output ist printed (progress bar etc.)

add

(logical) TRUE = the 100(1-alpha)% STB is added to an existing plot

plot

(logical) TRUE = either a QQ-plot with STB (add=FALSE) or a STB added to an existing plot (add=TRUE) is plotted. FALSE = only computations are carried out without plotting, an object of class 'STB' is returned which can be stored an plotted later on, e.g. to avoid computing an STB every time a Sweave/mWeave report is updated

legend

(logical) TRUE a legend is plotted "topleft"

timer

(logical) TRUE = the time spent for computing the STB will be printed

pch

(integer) plotting symbols for the QQ-plot

pch.out

(integer) plotting symbols for outlying points

seed

(numeric) value interpreted as integer, setting the random number generator (RNG) to a defined state

...

further graphical parameters passed on

Details

Function takes a numeric vector 'vec' and computes the 100(1-alpha)%-simultaneous tolerance band (STB) for the (DEFAULT )Null-hypothesis H0: vec~N(my, sigma^2) distributed, which is equal to checking whether the residuals of the simplest linear model y = mu + e (y~1) are normally distributed, i.e. 'e ~ N(0, sigma^2)'. By specification of argument rand.func other null-distributions can be specified. One has to specify a function with a single argument 'n', which returns a random sample with 'n' observations, randomly drawn from the desired null-distribution (see description argument rand.func below). Note that all random samples as well as vector vec will be centered to mean=0 and scaled to sd=1.

One can choose between three methods for constructing the 100(1-alpha)% STB. There are two implementations of the quantile-based algorithm ("C", "R" see 1st reference) and one of the rank-based algorithm (see 2nd reference). Methods "C" and "R" can be run in parallel. The rank-based algorithm does not benefit form parallel processing, at least not in the current implementation. It is still the default and recommended for small to medium sized vectors and 10000 <= N <= 20000 simulations, which should be sufficiently accurate reflect the null-distribution. The "C" and "R" options refer to implementations of the quantile-based algorithm. The "C" implementation benefits most from using multiple cores, i.e. the larger 'Ncpu' the better, and should be used for large problems, i.e. rather large number of elements and large number of simulations.

The table below gives an impression how these algorithms perform. Runtimes were measured under Windows 7 on a Intel Xeon E5-2687W 3.1 GHz workstation with 16 logical cores and 16 GB RAM. The runtime of the C-implementation of the quantile-based algorithm is denoted as "t_qC12" applied parallely with 12 cores. Each setting was repeated 10 times and the overall run time was then divided by 10 providing sufficiently robust simulation results. Note, that for smaller problem sizes a large proportion of the overall runtime is due to simulating, i.e. drawing from the null-distribution.

_____N_obs _____N_sim ____t_rank ____t_qC12
25 5000 0.4s 0.5s
25 10000 0.8s 1.3s
50 10000 1.0s 3.2s
100 10000 1.7s 2.9s
100 20000 3.0s 4.8s
225 20000 5.1s 8.3s
300 30000 9.6s 17.2s
300 50000 16.1s 24.9s
1000 50000 47.8s 123.5s

Value

invisibly returns a list-type object of class STB, which comprises all arguments accepted by this function.

Author(s)

Andre Schuetzenmeister andre.schuetzenmeister@roche.com

References

Schuetzenmeister, A., Jensen, U., Piepho, H.P. (2011), Checking assumptions of normality and homoscedasticity in the general linear model. Communications in Statistics - Simulation and Computation; S. 141-154

Schuetzenmeister, A. and Piepho, H.P. (2012). Residual analysis of linear mixed models using a simulation approach. Computational Statistics and Data Analysis, 56, 1405-1416

See Also

plot.STB

Examples

### log-normal vector to be checked for normality
## Not run: 
set.seed(111)
stb(exp(rnorm(30)), col.out="red", legend=TRUE)

### uniformly distributed sample checked for Chi-Squared Distribution with DF=1, degrees of freedom
set.seed(707)
stb(runif(25, -5, 5), rand.func=function(n){rchisq(n=n, df=1)}, 
    col.out="red", legend=TRUE, main="Chi-Squared with DF=1")

### check whether an Chi-Squared (DF=1) random sample better fits 
stb(rchisq(25, df=1), rand.func=function(n){rchisq(n=n, df=1)}, 
    col.out="red", legend=TRUE, main="Chi-Squared with DF=1")

### add STB to an existing plot
plot(sort(rnorm(30)), sort(rnorm(30)))
stb(rnorm(30), add=TRUE)

### compute STB for later use and prevent plotting
STB <- stb(rnorm(30), plot=FALSE)

## End(Not run)

[Package STB version 0.6.5 Index]