brokenStick {PCDimension} | R Documentation |
The Broken Stick Method
Description
The Broken Stick model is one proposed method for estimating the number of statistically significant principal components.
Usage
brokenStick(k, n)
bsDimension(lambda, FUZZ = 0.005)
Arguments
k |
An integer between 1 and |
n |
An integer; the total number of principal components. |
lambda |
The set of variances from each component from a principal
components analysis. These are assumed to be already sorted in
decreasing order. You can also supply a |
FUZZ |
A real number; anything smaller than |
Details
The Broken Stick model is one proposed method for estimating the
number of statistically significant principal components. The idea is
to model N variances by taking a stick of unit length and breaking it
into N pieces by randomly (and simultaneously) selecting break
points from a uniform distribution.
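The expected piece lengths under this model have a well-known closed form: the k-th longest of n pieces has expected length (1/n) * (1/k + 1/(k+1) + ... + 1/n). The following base R sketch illustrates that formula. The helper name bs_expected is hypothetical and is not part of the PCDimension package; that brokenStick computes the same quantity is an assumption here, not something confirmed from the package source.

```r
# Hypothetical helper (not part of PCDimension): expected length of the
# k-th longest piece when a unit stick is broken into n pieces at random.
bs_expected <- function(k, n) {
  # Tail sums of the harmonic series: tail_sums[k] = 1/k + 1/(k+1) + ... + 1/n
  tail_sums <- rev(cumsum(rev(1 / seq_len(n))))
  tail_sums[k] / n
}

pieces <- bs_expected(1:5, 5)
print(round(pieces, 4))
# The expected lengths add up to the whole stick (length one).
print(sum(pieces))
```

Because the pieces partition the stick, the expected lengths sum to one, which makes them directly comparable to the fraction of total variance explained by each component.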
Value
The brokenStick function returns, as a real number, the
expected value of the k-th longest piece when breaking a
stick of length one into n total pieces. Most commonly used
via the idiom brokenStick(1:N, N) to get the entire vector of
lengths at one time.
The bsDimension function returns an integer, the number of
significant components under this model. This is computed by finding
the last point at which the observed variance is larger than the
expected value under the broken stick model by at least FUZZ.
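The rule just described can be sketched in base R as follows. This is an illustration only, not the package source: the function name bs_dimension_sketch is hypothetical, and two details are assumptions, namely that the variances are rescaled to fractions of the total before comparison (needed because the stick has unit length) and that zero is returned when no component exceeds its expected value.

```r
# Hypothetical sketch (not the PCDimension source) of the broken stick rule.
bs_dimension_sketch <- function(lambda, fuzz = 0.005) {
  n <- length(lambda)
  # Expected broken stick lengths: (1/n) * (1/k + ... + 1/n) for k = 1..n
  expected <- rev(cumsum(rev(1 / seq_len(n)))) / n
  # Assumption: compare fractions of total variance, not raw variances
  observed <- lambda / sum(lambda)
  exceeds <- which(observed - expected > fuzz)
  # Last index where observed exceeds expected by more than fuzz.
  # Assumption: return 0 when no component qualifies.
  if (length(exceeds) == 0) 0L else max(exceeds)
}

fake_var <- c(30, 20, 8, 4, 3, 2, 1)
bs_dimension_sketch(fake_var)
```

Under these assumptions, the sketch returns 2 for fake_var: only the first two components explain a larger share of the variance than the broken stick model expects.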
Author(s)
Kevin R. Coombes <krc@silicovore.com>
References
Jackson, D. A. (1993). Stopping rules in principal components analysis: a comparison of heuristical and statistical approaches. Ecology 74, 2204–2214.
Legendre, P. and Legendre, L. (1998). Numerical Ecology. 2nd English ed. Elsevier.
See Also
Better methods to address this question are based on the Auer-Gervini
method; see AuerGervini.
Examples
brokenStick(1:10, 10)
sum( brokenStick(1:10, 10) )
fakeVar <- c(30, 20, 8, 4, 3, 2, 1)
bsDimension(fakeVar)