disscenter {TraMineR} | R Documentation |
Compute distances to the center of a group
Description
Computes the dissimilarity between objects and their group center from their pairwise dissimilarity matrix.
Usage
disscenter(diss, group=NULL, medoids.index=NULL,
allcenter = FALSE, weights=NULL, squared=FALSE)
Arguments
diss |
a dissimilarity matrix such as generated by |
group |
if |
medoids.index |
if |
allcenter |
logical. If |
weights |
optional numerical vector containing weights. |
squared |
Logical. If |
Details
This function computes the dissimilarity between given objects and their group center. It is possible that the group center does not belong to the space formed by the objects (in the same way as the average of integer numbers is not necessarily an integer itself).
This distance can also be understood as the contribution to the discrepancy (see dissvar).
Note that when the dissimilarity measure does not respect the triangle inequality, the dissimilarity between a given object and its group center may be negative.
It can be shown that this dissimilarity is equal to (see Batagelj 1988):
d_{x\tilde{g}} = \frac{1}{n}\Big(\sum_{i=1}^{n} d_{xi} - SS\Big)
where SS is the sum of squares (see dissvar).
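The formula above can be checked by hand on a small matrix. The following is a minimal base R sketch, not the package's implementation: the helper name center_dist is hypothetical, it handles a single unweighted group only, and it assumes that SS is the sum of the pairwise dissimilarities over i < j divided by n (the definition we take from dissvar). The first check uses squared Euclidean distances, for which the result should match the squared distance to the arithmetic mean. The second check uses a dissimilarity that violates the triangle inequality and yields a negative value.

```r
# Hypothetical helper (not part of TraMineR): dissimilarity of each object to
# the center of a single, unweighted group, from a symmetric dissimilarity
# matrix with a zero diagonal.
center_dist <- function(d) {
  d <- as.matrix(d)
  n <- nrow(d)
  # Assumed definition of SS: sum of d_ij over pairs i < j, divided by n.
  # sum(d) counts every pair twice, hence the factor 2.
  ss <- sum(d) / (2 * n)
  (rowSums(d) - ss) / n
}

# Check 1: squared Euclidean distances between four points on a line.
x <- c(0, 2, 4, 10)
d_sq <- outer(x, x, function(a, b) (a - b)^2)
dc_sq <- center_dist(d_sq)
print(dc_sq)
print((x - mean(x))^2)  # squared distance to the mean, for comparison

# Check 2: d[1, 3] = 10 exceeds d[1, 2] + d[2, 3] = 2, so the triangle
# inequality is violated and object 2 gets a negative value.
d_bad <- matrix(c(0, 1, 10,
                  1, 0, 1,
                  10, 1, 0), nrow = 3, byrow = TRUE)
dc_bad <- center_dist(d_bad)
print(dc_bad)
```

In the first check the center is the mean of x, which here happens to coincide with an observed point; in general it need not.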
Value
A vector with the dissimilarity to the group center for each object, or a list of medoid indexes.
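To illustrate the two kinds of return value, here is a base R sketch that computes the dissimilarity to the center within each group and then picks, per group, the objects closest to their center as medoids. The helper group_centers and its output layout are hypothetical and do not reproduce the function's actual return structure; weights and squaring are ignored, and ties are kept, which is only loosely analogous to requesting all medoids.

```r
# Hypothetical helper (not part of TraMineR): per-group dissimilarity to the
# center and medoid indexes, from a dissimilarity matrix and a grouping vector.
group_centers <- function(d, group) {
  d <- as.matrix(d)
  group <- as.character(group)
  dc <- numeric(nrow(d))
  medoids <- list()
  for (g in unique(group)) {
    idx <- which(group == g)
    dg <- d[idx, idx, drop = FALSE]  # dissimilarities within the group
    n <- length(idx)
    ss <- sum(dg) / (2 * n)          # assumed SS definition (pairs i < j, over n)
    dc[idx] <- (rowSums(dg) - ss) / n
    # Medoids: members of the group with the smallest dissimilarity to the center.
    medoids[[g]] <- idx[dc[idx] == min(dc[idx])]
  }
  list(dist = dc, medoids = medoids)
}

# Toy data: six points on a line, split into two groups.
x <- c(0, 1, 5, 10, 12, 20)
grp <- c("a", "a", "a", "b", "b", "b")
d <- abs(outer(x, x, "-"))
res <- group_centers(d, grp)
print(res$dist)     # one value per object
print(res$medoids)  # row indexes of the medoids, one list element per group
```

Because SS is constant within a group, the medoid is simply the object with the smallest sum of dissimilarities to the other members of its group.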
Author(s)
Matthias Studer (with Gilbert Ritschard for the help page)
References
Studer, M., G. Ritschard, A. Gabadinho and N. S. Müller (2011). Discrepancy analysis of state sequences, Sociological Methods and Research, Vol. 40(3), 471-510, doi:10.1177/0049124111415372.
Studer, M., G. Ritschard, A. Gabadinho and N. S. Müller (2010). Discrepancy analysis of complex objects using dissimilarities. In F. Guillet, G. Ritschard, D. A. Zighed and H. Briand (Eds.), Advances in Knowledge Discovery and Management, Studies in Computational Intelligence, Volume 292, pp. 3-19. Berlin: Springer.
Studer, M., G. Ritschard, A. Gabadinho and N. S. Müller (2009). Analyse de dissimilarités par arbre d'induction. In EGC 2009, Revue des Nouvelles Technologies de l'Information, Vol. E-15, pp. 7-18.
Batagelj, V. (1988). Generalized Ward and related clustering problems. In H. Bock (Ed.), Classification and related methods of data analysis, Amsterdam: North-Holland, pp. 67-74.
See Also
dissvar to compute the pseudo variance from dissimilarities and for a basic introduction to concepts of pseudo variance analysis.
dissassoc to test association between objects represented by their dissimilarities and a covariate.
disstree for an induction tree analysis of objects characterized by a dissimilarity matrix.
dissmfacw to perform multi-factor analysis of variance from pairwise dissimilarities.
Examples
## Defining a state sequence object
data(mvad)
mvad.seq <- seqdef(mvad[, 17:86])
## Building dissimilarities (any dissimilarity measure can be used)
mvad.ham <- seqdist(mvad.seq, method="HAM")
## Compute distance to center according to group gcse5eq
dc <- disscenter(mvad.ham, group=mvad$gcse5eq)
## Plotting distribution of dissimilarity to center
boxplot(dc~mvad$gcse5eq, col="cyan")
## Retrieving index of the first medoids, one per group
dc <- disscenter(mvad.ham, group=mvad$Grammar, medoids.index="first")
print(dc)
## Retrieving index of all medoids in each group
dc <- disscenter(mvad.ham, group=mvad$Grammar, medoids.index="all")
print(dc)