estW {ICGE}    R Documentation

INCA Statistic

Description

Assume that n units are divided into k clusters C1,...,Ck, and consider a fixed unit x0. The function estW calculates the INCA statistic W(x0) and the related U_i statistics.

Usage

estW(d, dx0, pert = "onegroup")

Arguments

d

a distance matrix or a dist object with distance information between units.

dx0

an n-vector containing the distances d0j between x0 and unit j.

pert

an n-vector that indicates which group each unit belongs to. The values of pert are expected to be consecutive integers greater than or equal to 1 (for instance 1, 2, 3, ..., k). The default value indicates that the data contain only one group.
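
As a minimal sketch of how the arguments dx0 and pert might be prepared with base R only, the code below computes the distances from x0 to every unit without an explicit loop and recodes group labels as consecutive integers. The names X, labels and x0 are illustrative and not part of ICGE, and the Euclidean distance is an assumption chosen to match the default of dist.

```r
# Illustrative data: the four numeric iris measurements and the species labels.
X <- as.matrix(iris[, 1:4])
labels <- iris$Species
x0 <- c(5.3, 3.6, 1.1, 0.1)   # hypothetical new unit

# Euclidean distances from x0 to every row of X
# (assumption: d was also built with the Euclidean distance).
dx0 <- unname(sqrt(rowSums(sweep(X, 2, x0)^2)))

# Recode the labels as consecutive integers 1, ..., k, as pert expects.
pert <- as.integer(factor(labels))

# Basic consistency checks before calling estW(d, dx0, pert).
stopifnot(length(dx0) == nrow(X), length(pert) == nrow(X))
stopifnot(identical(sort(unique(pert)), seq_len(max(pert))))
```

Wrapping the labels in factor() before as.integer() drops unused levels, so the resulting codes have no gaps even when the labels come from a subset of a larger data set.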

Value

The function returns an object of class incaest which is a list containing the following components:

Wvalue

is the INCA statistic W(x_0).

Uvalue

is a vector containing the statistics U_i.

Note

For a correct geometrical interpretation, it is advisable to verify that the distance matrix d is Euclidean.
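
One possible way to carry out this check with base R is sketched below. It relies on the classical result that a distance matrix is Euclidean exactly when the doubly centred matrix of squared distances is positive semidefinite. The helper is_euclidean and its tolerance tol are hypothetical and not part of ICGE, and the tolerance value is an arbitrary choice to absorb rounding error.

```r
# Hypothetical helper (not part of ICGE): TRUE when d is numerically Euclidean.
is_euclidean <- function(d, tol = 1e-8) {
  D2 <- as.matrix(d)^2
  n <- nrow(D2)
  H <- diag(n) - matrix(1 / n, n, n)   # centring matrix
  B <- -0.5 * H %*% D2 %*% H           # doubly centred squared distances
  # Symmetrise to remove rounding asymmetry, then inspect the eigenvalues.
  ev <- eigen((B + t(B)) / 2, symmetric = TRUE, only.values = TRUE)$values
  min(ev) >= -tol * max(abs(ev))
}

d <- dist(iris[, 1:4])
is_euclidean(d)
```

The check is relative to the largest eigenvalue in absolute value, so small negative eigenvalues caused by floating-point error are not reported as a violation.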

Author(s)

Itziar Irigoien itziar.irigoien@ehu.eus; Konputazio Zientziak eta Adimen Artifiziala, Euskal Herriko Unibertsitatea (UPV/EHU), Donostia, Spain.

Conchita Arenas carenas@ub.edu; Departament d'Estadistica, Universitat de Barcelona, Barcelona, Spain.

References

Arenas, C. and Cuadras, C.M. (2002). Some recent statistical methods based on distances. Contributions to Science, 2, 183–191.

Irigoien, I. and Arenas, C. (2008). INCA: New statistic for estimating the number of clusters and identifying atypical units. Statistics in Medicine, 27(15), 2948–2973.

See Also

vgeo, proxi, deltas

Examples

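library(ICGE)

data(iris)
d <- dist(iris[, 1:4])

# characteristics of a specific flower (likely group 1)
x0 <- c(5.3, 3.6, 1.1, 0.1)

# distances between flower x0 and each flower in iris
dx0 <- numeric(150)
for (i in 1:150) {
  dif <- x0 - unlist(iris[i, 1:4])
  dx0[i] <- sqrt(sum(dif * dif))
}

# pert expects integer group labels 1, ..., k, so convert the Species factor
res <- estW(d, dx0, as.integer(iris[, 5]))
res$Wvalue  # INCA statistic W(x0)
res$Uvalue  # statistics U_i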


[Package ICGE version 0.4.2 Index]