R: Trimmed k-medoids algorithm

trimmedoid {Anthropometry}

R Documentation

Trimmed k-medoids algorithm

Description

This is the trimmed k-medoids algorithm, which is used within trimowa. It is analogous to k-medoids, except that a proportion alpha of the observations is discarded by the procedure itself (the trimmed observations are determined by the data). The trimmed k-medoids algorithm is also analogous to trimmed k-means; an algorithm for computing trimmed k-means can be found in Garcia-Escudero et al. (2003). See Ibanez et al. (2012) for more details. In the generic name of the k-medoids algorithm, k refers to the number of clusters to search for. In the R code, k is called numClust; see the Arguments section.
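
The trimming idea can be illustrated with a minimal base R sketch of a single assign-and-trim step. This is not the code used by trimmedoid: the helper name trim_assign, the use of floor(alpha * n) for the number of trimmed observations, and the sum of distances as objective are all assumptions made for illustration.

```r
# Illustrative sketch only (hypothetical helper, not part of Anthropometry).
# Given a dissimilarity matrix D and the indices of the current medoids,
# assign each observation to its nearest medoid and trim the alpha fraction
# of observations that lie farthest from their nearest medoid.
trim_assign <- function(D, medoids, alpha) {
  n <- nrow(D)
  d_to_med <- D[, medoids, drop = FALSE]                 # n x k distances to the medoids
  nearest <- max.col(-d_to_med, ties.method = "first")   # position of the closest medoid
  d_min <- d_to_med[cbind(seq_len(n), nearest)]          # distance to that medoid
  n_trim <- floor(alpha * n)                             # assumption: round down
  asig <- nearest
  if (n_trim > 0) {
    trimmed <- order(d_min, decreasing = TRUE)[seq_len(n_trim)]
    asig[trimmed] <- 0                                   # 0 marks trimmed observations
  }
  list(asig = asig, objective = sum(d_min[asig != 0]))   # assumption: sum of distances
}

# Toy data: two tight groups and one outlier (the last value).
x <- c(1, 2, 3, 11, 12, 13, 50)
D <- as.matrix(dist(x))
res <- trim_assign(D, medoids = c(2, 5), alpha = 0.15)
res$asig
```

With this toy data, the outlier is the only trimmed observation, so it does not influence the objective computed from the remaining points.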

Usage

trimmedoid(D,numClust,alpha,niter,algSteps=7,verbose)

Arguments

`D`	Dissimilarity matrix.
`numClust`	Number of clusters.
`alpha`	Proportion of the sample to be trimmed.
`niter`	Number of random initializations (iterations).
`algSteps`	Number of steps of the algorithm per initialization. Default value is 7.
`verbose`	A logical specifying whether to provide descriptive output about the running process.

Value

A list with the following elements:

vopt: The objective value.

copt: The trimmed medoids.

asig: The cluster assignment of each observation (asig=0 indicates trimmed individuals).

ch: The goodness index.

Dmod: Modified data with the non-trimmed women.

qq: Vector with the non-trimmed points.
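
As a rough illustration of how the asig element can be used, the following sketch works on a made-up list that only mimics the documented convention (asig=0 for trimmed individuals). The values are invented for illustration and are not output from trimmedoid.

```r
# Hypothetical result: only the asig element is mimicked, with invented values.
res <- list(asig = c(1, 2, 0, 1, 3, 2, 0, 3))

trimmed <- which(res$asig == 0)                     # positions of trimmed observations
kept <- which(res$asig != 0)                        # positions of retained observations
cluster_sizes <- table(res$asig[kept])              # number of retained observations per cluster
prop_trimmed <- length(trimmed) / length(res$asig)  # observed trimming proportion
```

The same expressions can be applied to the asig element of an actual trimmedoid result.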

Author(s)

Irene Epifanio

References

Ibanez, M. V., Vinue, G., Alemany, S., Simo, A., Epifanio, I., Domingo, J., and Ayala, G., (2012). Apparel sizing using trimmed PAM and OWA operators, Expert Systems with Applications 39, 10512–10520.

Garcia-Escudero, L. A., Gordaliza, A., and Matran, C., (2003). Trimming tools in exploratory data analysis, Journal of Computational and Graphical Statistics 12(2), 434–449.

Garcia-Escudero, L. A., and Gordaliza, A., (1999). Robustness properties of k-means and trimmed k-means, Journal of the American Statistical Association 94(447), 956–969.

Examples

#Data loading:
library(Anthropometry)
dataTrimowa <- sampleSpanishSurvey
bust <- dataTrimowa$bust
#First bust class:
data <- dataTrimowa[(bust >= 74) & (bust < 78), ]   
numVar <- dim(dataTrimowa)[2]

#Weights calculation:
orness <- 0.7
weightsTrimowa <- weightsMixtureUB(orness,numVar)

#Constants required to specify the distance function:
numClust <- 3
#Range of each log-transformed variable (row 1: minimum, row 2: maximum):
rangeLog <- apply(as.matrix(log(data)), 2, range)
bh <- (rangeLog[2,] - rangeLog[1,]) / ((numClust-1) * 8)
bl <- -3 * bh
ah <- c(23,28,20,25,25)
al <- 3 * ah

#Data processing.
num.persons <- dim(data)[1]
num.variables <- dim(data)[2]
datam <- as.matrix(data)
datat <- aperm(datam, c(2,1))                     
dim(datat) <- c(1,num.persons * num.variables)   

#Dissimilarity matrix:
D <- getDistMatrix(datat, num.persons, numVar, weightsTrimowa, bl, bh, al, ah, FALSE)

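#Trimmed k-medoids with 1% trimming, 6 random initializations and 7 steps each:
res_trimm <- trimmedoid(D, numClust, alpha = 0.01, niter = 6, algSteps = 7, verbose = FALSE)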

[Package Anthropometry version 1.19 Index]