disco {energy}    R Documentation

distance components (DISCO)

Description

E-statistics (energy statistics) DIStance COmponents decomposition and tests, analogous to variance components and ANOVA.

Usage

disco(x, factors, distance, index=1.0, R, method=c("disco","discoB","discoF"))
disco.between(x, factors, distance, index=1.0, R)

Arguments

x

data matrix, distance matrix, or dist object

factors

matrix or data frame of factor labels or integers (not design matrix)

distance

logical; TRUE if x is a distance matrix

index

exponent on Euclidean distance in (0,2]

R

number of replicates for a permutation test

method

test statistic: "disco" (default) for the disco "F" ratio, or "discoB" for the between component statistic

Details

disco calculates the distance components decomposition of total dispersion. If R > 0, it also tests for significance by permutation test, using either the disco "F" ratio (default, method="disco") or the between component statistic (method="discoB").
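As a rough illustration of the one-way decomposition described above, the following base R sketch computes the total, within-sample and between-sample components from pairwise distances and forms the "F" ratio. The helper name disco_oneway is hypothetical and is not part of the energy package; the formulas follow Rizzo and Szekely (2010), and the package's own implementation may differ in detail.

```r
# Hypothetical helper (not part of energy): one-way distance components
disco_oneway <- function(y, group, index = 1) {
  group <- factor(group)                      # drops unused levels
  D <- as.matrix(dist(as.matrix(y)))^index    # pairwise Euclidean distances^index
  N <- nrow(D)
  k <- nlevels(group)
  total <- sum(D) / (2 * N)                   # total dispersion
  within <- sum(vapply(levels(group), function(g) {
    idx <- group == g
    sum(D[idx, idx]) / (2 * sum(idx))         # within component of sample g
  }, numeric(1)))
  between <- total - within                   # total = between + within
  c(between = between, within = within, total = total,
    F = (between / (k - 1)) / (within / (N - k)))
}

# With index = 2 and a univariate response, the components coincide
# with the ANOVA sums of squares
dc <- disco_oneway(warpbreaks$breaks, warpbreaks$tension, index = 2)
tab <- anova(lm(breaks ~ tension, data = warpbreaks))
all.equal(unname(dc[c("between", "within")]), tab[["Sum Sq"]])
```

The sketch takes the between component as the difference total minus within, which is sufficient for a one-way layout.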

If x is a dist object, argument distance is ignored. If x is a distance matrix, set distance=TRUE.
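The three accepted forms of x might be passed as follows. This is a minimal sketch that assumes the energy package is installed (the block is skipped otherwise); the use of the built-in iris data is an illustrative choice, not taken from the package documentation.

```r
# Assumes the energy package is installed; skipped otherwise
if (requireNamespace("energy", quietly = TRUE)) {
  y <- as.matrix(iris[, 1:4])
  d <- dist(y)                                   # a dist object

  # dist object: the distance argument is ignored
  s1 <- energy::disco(d, factors = iris$Species, R = 0)$stats

  # plain distance matrix: distance = TRUE must be set
  s2 <- energy::disco(as.matrix(d), factors = iris$Species,
                      distance = TRUE, R = 0)$stats

  # raw data matrix: distances are computed from the rows of y
  s3 <- energy::disco(y, factors = iris$Species,
                      distance = FALSE, R = 0)$stats
}
```

Since all three calls describe the same Euclidean distances, the decomposition tables s1, s2 and s3 would be expected to agree.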

In the current release disco computes the decomposition for one-way models only.
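The permutation test mentioned in these details can be sketched in base R as follows. Both perm_test and f_ratio are hypothetical helpers written for this illustration; the p-value convention used here is one common choice and is not necessarily the one the package applies.

```r
# Hypothetical helper (not part of energy): permutation test of a one-way statistic
perm_test <- function(stat, y, group, R = 199) {
  observed <- stat(y, group)
  # recompute the statistic after randomly reassigning the group labels
  permuted <- replicate(R, stat(y, sample(group)))
  # one common convention: count the observed value among the replicates
  p_value <- (1 + sum(permuted >= observed)) / (R + 1)
  list(statistic = observed, p.value = p_value, replicates = permuted)
}

# Illustrative statistic: a disco-style "F" ratio with index = 1
f_ratio <- function(y, group) {
  group <- factor(group)
  D <- as.matrix(dist(as.matrix(y)))
  N <- nrow(D)
  k <- nlevels(group)
  total <- sum(D) / (2 * N)
  within <- sum(tapply(seq_len(N), group,
                       function(i) sum(D[i, i]) / (2 * length(i))))
  ((total - within) / (k - 1)) / (within / (N - k))
}

set.seed(1)
res <- perm_test(f_ratio, warpbreaks$breaks, warpbreaks$tension, R = 199)
res$p.value
```

For clarity the sketch recomputes the distance matrix in every replicate; permuting the indices of a single precomputed matrix would avoid that cost.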

Value

When method="disco" (the default) or method="discoF", disco returns a list similar to the return value from anova.lm, and a print.disco method is provided to format the output as a similar table.

In this case disco returns an object of class disco, which is a list containing

call

call

method

method

statistic

vector of observed statistics

p.value

vector of p-values

k

number of factors

N

number of observations

between

between-sample distance components

withins

one-way within-sample distance components

within

within-sample distance component

total

total dispersion

Df.trt

degrees of freedom for treatments

Df.e

degrees of freedom for error

index

index (exponent on distance)

factor.names

factor names

factor.levels

factor levels

sample.sizes

sample sizes

stats

matrix containing decomposition
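The components listed above can be read directly from the returned list. The following is a minimal sketch that assumes the energy package is installed (it is skipped otherwise) and uses only component names documented in this section.

```r
# Assumes the energy package is installed; skipped otherwise
if (requireNamespace("energy", quietly = TRUE)) {
  fit <- energy::disco(warpbreaks$breaks, factors = warpbreaks$tension, R = 0)

  class(fit)          # documented as class disco
  fit$stats           # matrix containing the decomposition
  fit$sample.sizes    # sample sizes per factor level
  c(between = fit$between, within = fit$within, total = fit$total)
}
```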

When method="discoB", disco passes the arguments to disco.between, which returns a class htest object.

disco.between returns an object of class htest, where the test statistic is the between-sample statistic (proportional to the numerator of the F ratio of the disco test).
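To make the between-sample component concrete, the sketch below builds it from pairwise two-sample energy distances, following the formula in Rizzo and Szekely (2010). The helpers g and between_component are hypothetical and not part of the energy package. Because the statistic returned by disco.between is described only as proportional to the numerator of the F ratio, this sketch is not claimed to reproduce its returned value.

```r
# Hypothetical helper: mean of |a_i - b_m|^index over all pairs from two samples
g <- function(a, b, index = 1) {
  a <- as.matrix(a)
  b <- as.matrix(b)
  D <- as.matrix(dist(rbind(a, b)))
  rows <- seq_len(nrow(a))
  cols <- nrow(a) + seq_len(nrow(b))
  mean(D[rows, cols, drop = FALSE]^index)
}

# Hypothetical helper: between-sample component as a weighted sum over sample pairs
between_component <- function(samples, index = 1) {
  n <- vapply(samples, NROW, integer(1), USE.NAMES = FALSE)
  N <- sum(n)
  pairs <- combn(length(samples), 2)
  s <- 0
  for (p in seq_len(ncol(pairs))) {
    i <- pairs[1, p]
    j <- pairs[2, p]
    # two-sample energy distance between samples i and j
    e_ij <- n[i] * n[j] / (n[i] + n[j]) *
      (2 * g(samples[[i]], samples[[j]], index) -
         g(samples[[i]], samples[[i]], index) -
         g(samples[[j]], samples[[j]], index))
    s <- s + (n[i] + n[j]) / (2 * N) * e_ij
  }
  s
}

# With index = 2 this matches the ANOVA treatment sum of squares
by_tension <- split(warpbreaks$breaks, warpbreaks$tension)
s_between <- between_component(by_tension, index = 2)
ss_trt <- anova(lm(breaks ~ tension, data = warpbreaks))[["Sum Sq"]][1]
all.equal(s_between, ss_trt)
```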

Note

The current version does all calculations via matrix arithmetic and the boot function. Support for more general additive models and a formula interface is under development.

disco methods have been added to the cluster distance summary function edist, and energy tests for equality of distribution (see eqdist.etest).

Author(s)

Maria L. Rizzo <mrizzo@bgsu.edu> and Gabor J. Szekely

References

M. L. Rizzo and G. J. Szekely (2010). DISCO Analysis: A Nonparametric Extension of Analysis of Variance, Annals of Applied Statistics, Vol. 4, No. 2, 1034-1055.
doi:10.1214/09-AOAS245

See Also

edist, eqdist.e, eqdist.etest, ksample.e

Examples

      ## load the package that provides disco, disco.between and eqdist.e
      library(energy)

      ## warpbreaks one-way decompositions
      data(warpbreaks)
      attach(warpbreaks)
      disco(breaks, factors=wool, R=99)
      
      ## warpbreaks two-way wool+tension
      disco(breaks, factors=data.frame(wool, tension), R=0)

      ## warpbreaks two-way wool*tension
      disco(breaks, factors=data.frame(wool, tension, wool:tension), R=0)

      ## When index=2 for univariate data, we get ANOVA decomposition
      disco(breaks, factors=tension, index=2.0, R=99)
      aov(breaks ~ tension)

      ## Multivariate response
      ## Example on producing plastic film from Krzanowski (1998, p. 381)
      tear <- c(6.5, 6.2, 5.8, 6.5, 6.5, 6.9, 7.2, 6.9, 6.1, 6.3,
                6.7, 6.6, 7.2, 7.1, 6.8, 7.1, 7.0, 7.2, 7.5, 7.6)
      gloss <- c(9.5, 9.9, 9.6, 9.6, 9.2, 9.1, 10.0, 9.9, 9.5, 9.4,
                 9.1, 9.3, 8.3, 8.4, 8.5, 9.2, 8.8, 9.7, 10.1, 9.2)
      opacity <- c(4.4, 6.4, 3.0, 4.1, 0.8, 5.7, 2.0, 3.9, 1.9, 5.7,
                   2.8, 4.1, 3.8, 1.6, 3.4, 8.4, 5.2, 6.9, 2.7, 1.9)
      Y <- cbind(tear, gloss, opacity)
      rate <- factor(gl(2,10), labels=c("Low", "High"))

      ## test for equal distributions by rate
      disco(Y, factors=rate, R=99)
      disco(Y, factors=rate, R=99, method="discoB")

      ## Just extract the decomposition table
      disco(Y, factors=rate, R=0)$stats

      ## Compare eqdist.e methods for rate
      ## disco between stat is half of original when sample sizes equal
      eqdist.e(Y, sizes=c(10, 10), method="original")
      eqdist.e(Y, sizes=c(10, 10), method="discoB")

      ## The between-sample distance component
      disco.between(Y, factors=rate, R=0)

[Package energy version 1.7-11 Index]