R: Biplot of Multivariate Data Based on Principal Components...

bpca {bpca}

R Documentation

Biplot of Multivariate Data Based on Principal Components Analysis

Description

Computes biplot reduction on data.frame, matrix or prcomp objects and returns a bpca object.

Usage

  bpca(x, ...)
  ## Default S3 method:
bpca(x,
     d=1:2,
     center=2,
     scale=TRUE,
     method=c('hj', 'sqrt', 'jk', 'gh'),
     iec=FALSE,
     var.rb=FALSE,
     var.rd=FALSE,
     limit=10, ...)
  ## S3 method for class 'prcomp'
bpca(x,
     d=1:2, ...)

Arguments

`x`	A `data.frame`, `matrix` or `prcomp` object.
`d`	A vector giving the first and last eigenvalue to be considered by the biplot reduction. It can be `d=1:3` or `d=c(1,3)` for 3d biplot. The default is `d=1:2`.
`center`	Numeric. The type of centering to be performed: ‘0’ - no centering; ‘1’ - global-centered = `sweep(x, 1, mean(x))`; ‘2’ - column-centered = `sweep(x, 2, apply(x, 2, mean))`; ‘3’ - double-centered = `sweep(sweep(x, 1, apply(x, 1, mean)), 2, apply(x, 2, mean)) + mean(x)`. The default is 2.
`scale`	Logical. A value indicating whether the variables should be scaled to have unit variance before the analysis takes place: `FALSE` - no scale; `TRUE` - scale.
`method`	A vector of character strings that indicates the method of factorization: ‘hj’ - ‘HJ’ (‘symmetric’, Galindo Villardón (1986)); ‘sqrt’ - ‘SQRT’ (‘square root - symmetric’, Gabriel (1971)); ‘jk’ - ‘JK’ (‘row metric preserving’, Gabriel (1971)); ‘gh’ - ‘GH’ (‘column metric preserving’, Gabriel (1971)).
`iec`	Logical. If `TRUE` the matrix of eigenvalues, coordinates of objects and variables will be inverted. The default is `FALSE`.
`var.rb`	A logical value. If `TRUE`, all correlation coefficients for all variables (under the biplot projection) will be computed.
`var.rd`	A logical value. If `TRUE`, the diagnostic of the representation of variables projected by the biplot will be computed. If `var.rd` is `TRUE`, the `var.rb` parameter must also be `TRUE`.
`limit`	A vector giving the percentage limit used to define poor representation of variables. The default is 10.
`...`	Additional parameters. Required for consistency with the S3 generic.
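
The centering options listed for `center` can be tried directly in base R. The following is a minimal sketch, not code from the bpca package: the small matrix `x` and the object names are made up for illustration, and the use of `scale()` to obtain unit variance is an assumption about one way to do what `scale=TRUE` describes.

```r
# Small numeric matrix used only for illustration (not from the bpca package)
x <- matrix(c(1, 2, 3,
              4, 6, 8,
              7, 5, 9,
              2, 8, 4), nrow = 4, byrow = TRUE)

# center = 1: global-centered, subtract the grand mean from every cell
x_global <- sweep(x, 1, mean(x))

# center = 2: column-centered, subtract each column mean
x_col <- sweep(x, 2, apply(x, 2, mean))

# center = 3: double-centered, remove row and column means,
# then add back the grand mean
x_double <- sweep(sweep(x, 1, apply(x, 1, mean)), 2, apply(x, 2, mean)) + mean(x)

# What each option leaves (numerically) at zero
mean(x_global)       # grand mean
colMeans(x_col)      # column means
rowMeans(x_double)   # row means ...
colMeans(x_double)   # ... and column means

# Assumption: unit-variance scaling done with scale() after column centering
x_scaled <- scale(x, center = TRUE, scale = TRUE)
apply(x_scaled, 2, sd)
```

Note that `mean(x)` needs a numeric matrix here; for a data.frame, convert with `as.matrix()` first.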

Details

The biplot is a multivariate method for graphing row and column elements using a single plot (Gabriel, 1971).

The biplot of a matrix $_{n}Y_{p}$ projects on the same plot the markers for rows (associated with n objects) and columns (associated with p variables) of data that form a two-way table (data.frame or matrix object). The markers are computed from the singular value decomposition, svd(Y), and subsequent factorization.

The prefix ‘bi’ refers to the two kinds of information (rows and columns) contained in a data set arranged in a two-way table. If the data form a three-dimensional array, the method is called a triplot (not yet supported by the bpca package).

The basic idea behind the biplot method was to add the information about the variables to the principal component graph (Johnson & Wichern, 1988).

Considering the results of svd($_{n}Y_{p}$):

d: A vector containing the singular values of Y, of length min(n, p);
u: A matrix whose columns contain the left singular vectors of Y, present if ‘nu > 0’. Dimension c(n, nu);
v: A matrix whose columns contain the right singular vectors of Y, present if ‘nv > 0’. Dimension c(p, nv).

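and also,

s^2 = diag(d), the diagonal matrix of the singular values,

n = the number of objects (rows) of Y,

it is possible to approximate Y:

_{n}Y_{p} \approx Y_{m} = g h

in various ways, where h, as defined for each method below, already contains the transpose of v. The methods of factorization computed by the bpca function are: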

HJ - symmetric, Galindo Villardón (1986):

g = u*s^2

h = s^2*v'

SQRT - square root symmetric, Gabriel (1971):

g = u*\sqrt{s^2}

h = \sqrt{s^2}*v'

JK - row metric preserving, Gabriel (1971):

g = u*s^2

h = v'

GH - column metric preserving, Gabriel (1971):

g = \sqrt{n-1}*u

h = \frac{1}{\sqrt{n-1}}*s^2*v'
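
The four factorizations above can be sketched in base R. This is a minimal illustration transcribed from the formulas, not the bpca source: the iris data, the object names (`Y`, `s2`, `hj`, `sq`, `jk`, `gh`, `Ym`) and the use of `scale()` for centering and scaling are assumptions made for the example. Because h is defined with v already transposed, the product is formed as `g %*% h`.

```r
# Illustrative data: numeric columns of the built-in iris data set,
# column-centered and scaled (in the spirit of center = 2, scale = TRUE)
Y <- scale(as.matrix(iris[, 1:4]), center = TRUE, scale = TRUE)
n <- nrow(Y)
m <- 1:2                       # components kept, like d = 1:2

sv <- svd(Y)
u  <- sv$u[, m]                # n x 2 left singular vectors
v  <- sv$v[, m]                # p x 2 right singular vectors
s2 <- diag(sv$d[m])            # the matrix written s^2 = diag(d) in the text

# The four factorizations, following the formulas as written
hj <- list(g = u %*% s2,       h = s2 %*% t(v))
sq <- list(g = u %*% sqrt(s2), h = sqrt(s2) %*% t(v))
jk <- list(g = u %*% s2,       h = t(v))
gh <- list(g = sqrt(n - 1) * u,
           h = (1 / sqrt(n - 1)) * s2 %*% t(v))

# Rank-2 approximation of Y built directly from the svd
Ym <- u %*% s2 %*% t(v)

# SQRT, JK and GH give back the same approximation through g %*% h
all.equal(sq$g %*% sq$h, Ym)
all.equal(jk$g %*% jk$h, Ym)
all.equal(gh$g %*% gh$h, Ym)

# Row markers (objects) and column markers (variables) for plotting
head(jk$g)                     # one row per object
t(jk$h)                        # one row per variable
```

With the HJ formulas as written, the product `hj$g %*% hj$h` equals u diag(d^2) v' rather than `Ym`, so it is left out of the comparison above.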

Since $_{n}Y_{p} \approx Y_{m}$, if the rank $r$ of the matrix $_{n}Y_{p}$ is greater than $m$, the biplot representation of Y is only an approximation; it is exact only when $r=m$.
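
The role of the rank can be checked numerically. The sketch below is an illustration under stated assumptions: `approx_rank()` is a hypothetical helper (not part of bpca), and the matrices `Y2` and `Y4` are made up for the example.

```r
# Hypothetical helper (not part of bpca): rank-m reconstruction from the svd
approx_rank <- function(Y, m) {
  sv <- svd(Y)
  k  <- seq_len(m)
  sv$u[, k, drop = FALSE] %*% diag(sv$d[k], nrow = m) %*%
    t(sv$v[, k, drop = FALSE])
}

# A 5 x 3 matrix of rank r = 2: the third column is the sum of the first two
A  <- cbind(c(1, 2, 3, 4, 5), c(2, 0, 1, 3, 1))
Y2 <- cbind(A, A[, 1] + A[, 2])
qr(Y2)$rank                         # numerical rank of Y2
max(abs(Y2 - approx_rank(Y2, 2)))   # essentially zero: r = m = 2

# Scaled iris measurements have rank r = 4, larger than m = 2
Y4  <- scale(as.matrix(iris[, 1:4]))
err <- max(abs(Y4 - approx_rank(Y4, 2)))   # clearly non-zero

# Share of the total variation retained by the first two components
d4       <- svd(Y4)$d
retained <- sum(d4[1:2]^2) / sum(d4^2)
```

When `retained` is close to 1, the two-dimensional biplot loses little information even though it is not exact.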

Because different methods of factorization are needed, if ‘x’ is a prcomp object the method bpca.prcomp works backwards from the result of the prcomp function. In other words, it regenerates the given data $_{n}Y_{p}$ by inverting the svd decomposition. It then calls the method bpca.default with the appropriate parameters.
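
One way to go back from a prcomp object to the data is sketched below in base R. This is an assumption about how such a reversal can be done, not a transcription of the bpca.prcomp source; the iris data and the object names (`pc`, `Y`, `X`) are chosen only for illustration.

```r
pc <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)

# Scores times the transposed rotation give back the centered, scaled data
Y <- pc$x %*% t(pc$rotation)

# Undo the scaling, then the centering, to reach the original measurements
X <- sweep(Y, 2, pc$scale, "*")
X <- sweep(X, 2, pc$center, "+")

# The regenerated matrix matches the input data
all.equal(unname(X), unname(as.matrix(iris[, 1:4])))

# Singular values of Y relate to prcomp's standard deviations
all.equal(svd(Y)$d, pc$sdev * sqrt(nrow(Y) - 1))
```

The sketch assumes the prcomp object was created with `center = TRUE` and `scale. = TRUE`; if scaling was not requested, `pc$scale` is `FALSE` and the rescaling step should be skipped.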

The biplot is used with many multivariate methods to display relationships between objects, between variables, and between objects and variables (such as prevalence and importance). There are many variations of biplots (see the references).

Value

The function bpca returns an object of class bpca.2d or bpca.3d. Both are list objects with the slots:

`call`	The call used.
`eigenvalues`	A vector of the eigenvalues.
`eigenvectors`	A vector of the eigenvectors.
`number`	A vector of the number of eigenvalues considered in the reduction.
`importance`	A matrix with the general and partial variation explained by the reduction.
`coord`	A list with the coordinates of the two components: objects and variables.
`var.rb`	A matrix of all correlation coefficients for all variables under the biplot projection.
`var.rd`	A matrix of the diagnostic of the poor projection of variable correlations by the biplot reduction.

Author(s)

Faria, J. C.
Allaman, I. B.
Demétrio, C. G. B.

References

Gabriel, K. R. (1971) The biplot graphical display of matrices with application to principal component analysis. Biometrika 58, 453-467.

Galindo Villardón, M. P. (1986) Una alternativa de representación simultánea: HJ-Biplot. Qüestiió 10(1), 13-23.

Johnson, R. A. and Wichern, D. W. (1988) Applied multivariate statistical analysis. Prentice-Hall, Inc., Upper Saddle River, NJ, USA, 6 ed.

Gower, J.C. and Hand, D. J. (1996) Biplots. Chapman & Hall.

Yan, B. W. and Kang, M. S. (2003) GGE biplot analysis: a graphical tool for breeders, geneticists, and agronomists. CRC Press, New York, 288p.

Examples

##
## Example 1
## Computing and plotting a bpca object with 'graphics' package - 2d
##

bp <- bpca(gabriel1971)

dev.new(w=6, h=6)
oask <- devAskNewPage(dev.interactive(orNone=TRUE))
plot(bp,
     var.factor=2)

# Exploring the object 'bp' created by the function 'bpca'
class(bp)
names(bp)
str(bp)

summary(bp)
bp$call
bp$eigenval
bp$eigenvec
bp$numb
bp$import
bp$coord
bp$coord$obj
bp$coord$var
bp$var.rb
bp$var.rd

## Not run: 
##
## Example 2
## Computing and plotting a bpca object with 'scatterplot3d' package - 3d
##

bp <- bpca(gabriel1971,
           d=2:4)

plot(bp,
     var.factor=3,
     xlim=c(-2,2),
     ylim=c(-2,2),
     zlim=c(-2,2))

# Exploring the object 'bp' created by the function 'bpca'
class(bp)
names(bp)
str(bp)

summary(bp)
bp$call
bp$eigenval
bp$eigenvec
bp$numb
bp$import
bp$coord
bp$coord$obj
bp$coord$var
bp$var.rb
bp$var.rd

##
## Example 3
## Computing and plotting a bpca object with 'rgl' package - 3d
##

plot(bpca(gabriel1971,
          d=1:3),
     rgl.use=TRUE,
     var.factor=2)

# Suggestion: Interact with the graphic using the mouse
# left button: press, hold and drag to rotate interactively;
# right button: press, hold and drag to zoom interactively.
# Enjoy it!

##
## Example 4
## Grouping objects with different symbols and colors - 2d and 3d
##

# 2d
plot(bpca(iris[-5]),
     var.factor=.3,
     var.cex=.7,
     obj.names=FALSE,
     obj.cex=1.5,
     obj.col=c('red', 'green3', 'blue')[unclass(iris$Species)],
     obj.pch=c('+', '*', '-')[unclass(iris$Species)])

# 3d static
plot(bpca(iris[-5],
          d=1:3),
     var.factor=.2,
     var.color=c('blue', 'red'),
     var.cex=1,
     obj.names=FALSE,
     obj.cex=1,
     obj.col=c('red', 'green3', 'blue')[unclass(iris$Species)],
     obj.pch=c('+', '*', '-')[unclass(iris$Species)])

# 3d dynamic
plot(bpca(iris[-5],
          method='hj',
          d=1:3),
     rgl.use=TRUE,
     var.col='brown',
     var.factor=.3,
     var.cex=1.2,
     obj.names=FALSE,
     obj.cex=.8,
     obj.col=c('red', 'green3', 'orange')[unclass(iris$Species)],
     simple.axes=FALSE,
     box=TRUE)

## End(Not run)

devAskNewPage(oask)

[Package bpca version 1.3-6 Index]
