TwoGroupStats-class {ClassComparison}    R Documentation

Class "TwoGroupStats"

Description

Compute row-by-row means and variances for a data matrix whose columns belong to two different groups of interest.

Usage

TwoGroupStats(data, classes, name=comparison, name1=A, name2=B)
## S4 method for signature 'TwoGroupStats'
as.data.frame(x, row.names=NULL, optional=FALSE)
## S4 method for signature 'TwoGroupStats'
summary(object, ...)
## S4 method for signature 'TwoGroupStats'
print(x, ...)
## S4 method for signature 'TwoGroupStats'
show(object)
## S4 method for signature 'TwoGroupStats,missing'
plot(x, main=x@name, useLog=FALSE, ...)

Arguments

data

Either a data frame or matrix with numeric values, or an ExpressionSet as defined in the Bioconductor tools for analyzing microarray data.

classes

If data is a data frame or matrix, then classes must be either a logical vector or a factor. If data is an ExpressionSet, then classes can be a character string that names one of the factor columns in the associated phenoData subobject.

name

A character string; the name of this object

name1

A character string; the name of the first group

name2

A character string; the name of the second group

x

A TwoGroupStats object

row.names

See the base version of as.data.frame.default

optional

See the base version of as.data.frame.default

object

A TwoGroupStats object

main

Plot title

useLog

A logical flag; should the values be log-transformed before plotting?

...

The usual extra arguments to generic functions

Details

This class was one of the earliest developments in our suite of tools to analyze microarrays. Its main purpose is to segregate out the preliminary computation of summary statistics on a row-by-row basis, along with a set of plots that could be generated automatically and used for quality control.

Creating Objects

Although objects of the class can be created by a direct call to new, the preferred method is to use the TwoGroupStats generator. The inputs to this function are the same as those used for row-by-row statistical tests throughout the ClassComparison package; a detailed description can be found in the MultiTtest class.

This class also serves as the front end to the SmoothTtest class, providing it with an interface that accepts ExpressionSet objects and is compatible with the other statistical tests in the ClassComparison package.
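As an illustration of the generator, the following sketch passes a factor instead of a logical vector and supplies explicit names. It assumes that the ClassComparison package is installed and that the first factor level is treated as the first group. The data, the level labels, and the object name are made up for the example.

```r
# A minimal sketch; runs only if ClassComparison is available.
set.seed(42)
bogus <- matrix(rnorm(20 * 100, mean = 8, sd = 3), nrow = 100, ncol = 20)

# Hypothetical group labels: 10 "control" columns, then 10 "treated" columns.
groups <- factor(rep(c("control", "treated"), each = 10))

if (requireNamespace("ClassComparison", quietly = TRUE)) {
  suppressPackageStartupMessages(library(ClassComparison))
  x_named <- TwoGroupStats(bogus, groups,
                           name  = "control vs. treated",
                           name1 = "control",
                           name2 = "treated")
  summary(x_named)
  # The documented slots can be inspected directly.
  x_named@n1
  x_named@n2
}
```

If the names are omitted, the defaults shown in the Usage section apply.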

Slots

mean1:

numeric vector of means in the first group

mean2:

numeric vector of means in the second group

overallMean:

numeric vector of overall row means

var1:

numeric vector of variances in the first group

var2:

numeric vector of variances in the second group

overallVar:

numeric vector of variances assuming the two groups have the same mean

pooledVar:

numeric vector of row-by-row pooled variances, assuming the two groups have the same variance but different means

n1:

numeric scalar specifying number of items in the first group

n2:

numeric scalar specifying number of items in the second group

name1:

character string specifying name of the first group

name2:

character string specifying name of the second group

name:

character string specifying name of the object
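To make the slot definitions concrete, here is a base R sketch of how the same quantities could be computed by hand for a small matrix. It is not the package's implementation. The variable names mirror the slot names for readability, and treating the FALSE columns as the first group is an assumption.

```r
# Base R only; a hand computation of the quantities described by the slots.
set.seed(1)
dat <- matrix(rnorm(5 * 6, mean = 8, sd = 3), nrow = 5, ncol = 6)
grp <- rep(c(FALSE, TRUE), each = 3)   # assumption: FALSE marks the first group

first  <- dat[, !grp, drop = FALSE]
second <- dat[,  grp, drop = FALSE]
n1 <- ncol(first)
n2 <- ncol(second)

# Row-by-row means
mean1       <- rowMeans(first)
mean2       <- rowMeans(second)
overallMean <- rowMeans(dat)

# Row-by-row variances
var1       <- apply(first,  1, var)
var2       <- apply(second, 1, var)
overallVar <- apply(dat,    1, var)    # both groups share one mean

# Pooled variance: common variance, separate group means
pooledVar <- ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
```

The two variance estimates differ only by the between-group term: (n1 + n2 - 1) * overallVar equals (n1 + n2 - 2) * pooledVar plus n1 * n2 / (n1 + n2) * (mean1 - mean2)^2. Rows with a large difference in means therefore have an overall variance well above the pooled variance.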

Methods

as.data.frame(x, row.names=NULL, optional=FALSE)

Collect the numeric vectors from the object into a single data frame, suitable for printing or exporting.

summary(object, ...)

Write out a summary of the object.

print(x, ...)

Print the object. (Actually, it only prints a summary, since the whole object is almost always more than you really want to see. If you insist on printing everything, use as.data.frame.)

show(object)

Print the object (same as the print method).

plot(x, main=x@name, useLog=FALSE, ...)

This function actually produces six different plots of the data, so it is usually wrapped by a graphical layout command like par(mfrow=c(2,3)). The first two plots show the relation between the mean and standard deviation for the two groups separately; the third plot does the same for the overall mean and variance. The fourth plot is a Bland-Altman plot of the difference between the means against the overall mean. (In the microarray world, this is usually called an M-vs-A plot.) A loess fit is overlaid on the scatter plot, and points outside confidence bounds based on the fit are printed in a different color to flag them as highly variable. The fifth plot shows a loess fit (with confidence bounds) of the difference as a function of the row index (which often is related to the geometric position of spots on a microarray). Thus, this plot gives a possible indication of regions of an array where unusual things happen. The final plot compares the overall variances to the pooled variances.
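The fourth plot can be approximated in base R as follows. This is a sketch of the general idea of an M-vs-A plot with a loess fit, not the package's plotting code. The sign convention for the difference and the band of three residual standard deviations used to flag points are assumptions chosen for illustration.

```r
# Base R sketch of an M-vs-A (Bland-Altman) plot with a loess fit.
set.seed(2)
dat <- matrix(rnorm(30 * 1000, mean = 8, sd = 3), nrow = 1000, ncol = 30)
grp <- rep(c(FALSE, TRUE), each = 15)

A <- rowMeans(dat)                                  # overall row mean
M <- rowMeans(dat[, grp]) - rowMeans(dat[, !grp])   # difference between group means

fit    <- loess(M ~ A)              # smooth trend of M as a function of A
center <- fitted(fit)
spread <- 3 * sd(residuals(fit))    # hypothetical band width, not the package's rule
flagged <- abs(M - center) > spread

ord <- order(A)                     # draw the curves from left to right
plot(A, M, pch = ".", col = ifelse(flagged, "red", "black"),
     xlab = "Overall mean (A)", ylab = "Difference of means (M)",
     main = "M-vs-A sketch")
lines(A[ord], center[ord], col = "blue")
lines(A[ord], center[ord] + spread, col = "blue", lty = 2)
lines(A[ord], center[ord] - spread, col = "blue", lty = 2)
```

With purely random data as above, few points should be flagged. Real arrays with intensity-dependent bias would show the fitted curve bending away from zero.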

Author(s)

Kevin R. Coombes krc@silicovore.com

References

Altman DG, Bland JM.
Measurement in Medicine: the Analysis of Method Comparison Studies.
The Statistician, 1983; 32: 307-317.

See Also

MultiTtest, SmoothTtest

Examples

showClass("TwoGroupStats")
bogus <- matrix(rnorm(30*1000, 8, 3), ncol=30, nrow=1000)
splitter <- rep(FALSE, 30)
splitter[16:30] <- TRUE

x <- TwoGroupStats(bogus, splitter)
summary(x)

opar<-par(mfrow=c(2,3), pch='.')
plot(x)
par(opar)

[Package ClassComparison version 3.1.8]