dissimilarity {relations} | R Documentation |
Dissimilarity Between Relations
Description
Compute the dissimilarity between (ensembles of) relations.
Usage
relation_dissimilarity(x, y = NULL, method = "symdiff", ...)
Arguments
x |
an ensemble of relations (see
|
y |
|
method |
a character string specifying one of the built-in
methods for computing dissimilarity, or a function to be taken as
a user-defined method. If a character string, its lower-cased
version is matched against the lower-cased names of the available
built-in methods using |
... |
further arguments to be passed to methods. |
Details
Available built-in methods are as follows.
"symdiff"
symmetric difference distance. This computes the cardinality of the symmetric difference of two relations, i.e., the number of tuples contained in exactly one of two relations. For preference relations, this coincides with the Kemeny-Snell metric (Kemeny and Snell, 1962). For linear orders, it gives Kendall's \tau metric (Diaconis, 1988). Can also be referred to as "SD". Only applicable to crisp relations.
"manhattan"
the Manhattan distance between the incidences.
"euclidean"
the Euclidean distance between the incidences.
"CS"
Cook-Seiford distance, a generalization of the distance function of Cook and Seiford (1978). Let the generalized ranks of an object a in the (first) domain of an endorelation R be defined as the number of objects b dominating a (i.e., for which a R b and not b R a), plus half the number of objects b equivalent to a (i.e., for which a R b and b R a). For preference relations, this gives the usual Kendall ranks arranged according to decreasing preference (and averaged for ties). Then the generalized Cook-Seiford distance is defined as the l_1 distance between the generalized ranks. For linear orders, this gives Spearman's footrule metric (Diaconis, 1988). Only applicable to crisp endorelations.
"CKS"
Cook-Kress-Seiford distance, a generalization of the distance function of Cook, Kress and Seiford (1986). For each pair of objects a and b in an endorelation R, we can have a R b and not b R a or vice versa (cases of “strict preference”), a R b and b R a (the case of “indifference”), or neither a R b nor b R a (the case of “incomparability”). (Only the last two are possible if a = b.) The distance by Cook, Kress and Seiford puts indifference as the metric centroid between both preference cases and incomparability (i.e., indifference is at distance one from the other three, and each of the other three is at distance two from the others). The generalized Cook-Kress-Seiford distance is the paired comparison distance (i.e., a metric) based on these distances between the four paired comparison cases. (Formula 3 in the reference must be slightly modified for the generalization from partial rankings to arbitrary endorelations.) Only applicable to crisp endorelations.
"score"
score-based distance. This computes \Delta(s(x), s(y)) for suitable score and distance functions s and \Delta, respectively. These can be specified by additional arguments score and Delta. If score is a character string, it is taken as the method for relation_scores. Otherwise, if given it must be a function giving the score function itself. If Delta is a number p \ge 1, the usual l_p distance is used. Otherwise, it must be a function giving the distance function. The defaults correspond to using the default relation scores and p = 1, which for linear orders gives Spearman's footrule distance. Only applicable to endorelations.
"Jaccard"
Jaccard distance: 1 minus the ratio of the cardinalities of the intersection and the union of the relations.
"PC"
(generalized) paired comparison distance. This generalizes the symdiff and CKS distances by using a general set of discrepancies \delta_{kl} between the possible paired comparison results. The results are the a,b / b,a incidences 0/0, 1/0, 0/1, and 1/1, numbered from 1 to 4 (in a preference context with a \le encoding, these correspond to incomparability, strict < and > preference, and indifference), and \delta_{kl} is the discrepancy between results k and l. The distance is then obtained as the sum of the discrepancies from the paired comparisons of distinct objects, plus half the sum of discrepancies from the comparisons of identical objects (for which the only possible results are incomparability and indifference). The distance is a metric provided that the \delta_{kl} satisfy the metric conditions (non-negativity and zero iff k = l, symmetry and sub-additivity).
The discrepancies can be specified via the additional argument delta, either as a numeric vector of length 6 with the non-redundant values \delta_{21}, \delta_{31}, \delta_{41}, \delta_{32}, \delta_{42}, \delta_{43}, or as a character string partially matching one of the following built-in discrepancies with corresponding parameter vector \delta:
  "symdiff"
  symmetric difference distance, with a discrepancy of two between opposite strict preferences and between indifference and incomparability, and one between all other distinct results: \delta = (1, 1, 2, 2, 1, 1) (default). Can also be referred to as "SD".
  "CKS"
  Cook-Kress-Seiford distance, see above: \delta = (2, 2, 1, 2, 1, 1).
  "EM"
  the distance obtained from the generalization of the Kemeny-Snell distance for complete rankings to partial rankings introduced in Emond and Mason (2000). This uses a discrepancy of two for opposite strict preferences, and one for all other distinct results: \delta = (1, 1, 1, 2, 1, 1).
  "JMB"
  the distance with parameters as suggested by Jabeur, Martel and Ben Khélifa (2004): \delta = (4/3, 4/3, 4/3, 5/3, 1, 1).
  "discrete"
  the discrete metric on the set of paired comparison results: \delta = (1, 1, 1, 1, 1, 1).
Only applicable to crisp endorelations.
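The paired comparison construction can be sketched in base R on plain 0/1 incidence matrices. This is only an illustration of the definition above, not the package's implementation: pc_distance() and inc() are hypothetical helpers, and relation_dissimilarity() itself works on relation objects rather than raw matrices.

```r
# Hypothetical helper (not part of the relations package): paired comparison
# distance between two crisp endorelations given as 0/1 incidence matrices,
# where I[a, b] = 1 iff a R b.
pc_distance <- function(I1, I2, delta = c(1, 1, 2, 2, 1, 1)) {
  # Expand delta = (d21, d31, d41, d32, d42, d43) into a symmetric 4 x 4
  # table; lower.tri() fills column by column, which matches that order.
  D <- matrix(0, 4, 4)
  D[lower.tri(D)] <- delta
  D <- D + t(D)
  # Code each ordered pair by its a,b / b,a incidences:
  # 1 = 0/0, 2 = 1/0, 3 = 0/1, 4 = 1/1.
  code <- function(I) 1 + I + 2 * t(I)
  K <- code(I1)
  L <- code(I2)
  d <- matrix(D[cbind(as.vector(K), as.vector(L))], nrow = nrow(I1))
  # Distinct objects count once per pair, identical objects with weight 1/2.
  sum(d[upper.tri(d)]) + sum(diag(d)) / 2
}

# Hypothetical helper: incidence matrix of a preference in "<=" encoding,
# built from a vector of ranks.
inc <- function(r) outer(r, r, "<=") * 1

I1 <- inc(c(1, 2, 3))   # A before B before C
I2 <- inc(c(2, 1, 3))   # B before A before C
I3 <- inc(c(1, 1, 2))   # A and B tied, both before C

pc_distance(I1, I2)                                # "symdiff" discrepancies
pc_distance(I1, I2, delta = c(2, 2, 1, 2, 1, 1))   # "CKS" discrepancies
pc_distance(I1, I3, delta = rep(1, 6))             # "discrete" discrepancies
```

With the default discrepancies the result agrees with the number of differing tuples, sum(I1 != I2), which is the sense in which "PC" generalizes "symdiff".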
Methods "symdiff", "manhattan", "euclidean" and "Jaccard" take an additional logical argument na.rm: if true (default: false), tuples with missing memberships are excluded in the dissimilarity computations.
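As a rough illustration of the four incidence-based methods and of na.rm, the following base R sketch works directly on incidence matrices. The helper incidence_dissimilarity() is hypothetical. It assumes that a tuple is dropped when its membership is missing in either relation, and that intersection and union are taken as elementwise minimum and maximum. The package's actual handling may differ.

```r
# Hypothetical helper (not the package's code): dissimilarities computed
# from two incidence matrices of equal dimension.
incidence_dissimilarity <- function(I1, I2,
                                    method = c("symdiff", "manhattan",
                                               "euclidean", "Jaccard"),
                                    na.rm = FALSE) {
  method <- match.arg(method)
  if (na.rm) {
    # Assumption: drop tuples whose membership is missing in either relation.
    keep <- !is.na(I1) & !is.na(I2)
    I1 <- I1[keep]
    I2 <- I2[keep]
  }
  switch(method,
         symdiff   = sum(I1 != I2),
         manhattan = sum(abs(I1 - I2)),
         euclidean = sqrt(sum((I1 - I2)^2)),
         # 1 - |intersection| / |union|, via elementwise min and max.
         Jaccard   = 1 - sum(pmin(I1, I2)) / sum(pmax(I1, I2)))
}

# Two linear orders on three objects in "<=" encoding.
I1 <- outer(c(1, 2, 3), c(1, 2, 3), "<=") * 1
I2 <- outer(c(2, 1, 3), c(2, 1, 3), "<=") * 1

incidence_dissimilarity(I1, I2, "symdiff")
incidence_dissimilarity(I1, I2, "euclidean")
incidence_dissimilarity(I1, I2, "Jaccard")

# One missing membership: without na.rm the result is NA.
I2_na <- I2
I2_na[1, 3] <- NA
incidence_dissimilarity(I1, I2_na, "symdiff")
incidence_dissimilarity(I1, I2_na, "symdiff", na.rm = TRUE)
```

For crisp relations the "symdiff" and "manhattan" results coincide, since every incidence is 0 or 1.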
Value
If y is NULL, an object of class dist containing the dissimilarities between all pairs of elements of x. Otherwise, a matrix with the dissimilarities between the elements of x and the elements of y.
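To make the two return shapes concrete, here is a base R sketch that mimics them with a hand-rolled Cook-Seiford distance as defined in Details. The helpers generalized_ranks(), cs_distance(), inc() and pairwise() are hypothetical and operate on incidence matrices. They are a re-derivation for illustration, not the package's implementation.

```r
# Hypothetical helper: generalized ranks from a 0/1 incidence matrix
# (I[a, b] = 1 iff a R b).
generalized_ranks <- function(I) {
  strict <- I == 1 & t(I) == 0   # a R b and not b R a
  equiv  <- I == 1 & t(I) == 1   # a R b and b R a (includes b = a if reflexive)
  rowSums(strict) + rowSums(equiv) / 2
}

# l_1 distance between the generalized ranks.
cs_distance <- function(I1, I2) {
  sum(abs(generalized_ranks(I1) - generalized_ranks(I2)))
}

# A small "ensemble" as a named list of incidence matrices ("<=" encoding).
inc <- function(r) outer(r, r, "<=") * 1
ens <- list(o1 = inc(c(1, 2, 3)),   # A before B before C
            o2 = inc(c(2, 1, 3)),   # B before A before C
            o3 = inc(c(1, 1, 2)))   # A and B tied, both before C

# All distances between the elements of xs (rows) and ys (columns).
pairwise <- function(xs, ys) {
  m <- sapply(ys, function(y) sapply(xs, function(x) cs_distance(x, y)))
  matrix(m, nrow = length(xs), dimnames = list(names(xs), names(ys)))
}

d_all   <- as.dist(pairwise(ens, ens))     # analogue of the y = NULL case
d_cross <- pairwise(ens[1:2], ens[3])      # analogue of the case with y given

class(d_all)
dim(d_cross)
```

For the two linear orders o1 and o2 the sketch reproduces Spearman's footrule between the rank vectors (1, 2, 3) and (2, 1, 3).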
References
W. D. Cook, M. Kress and L. M. Seiford (1986), Information and preference in partial orders: a bimatrix representation. Psychometrika, 51/2, 197–207. doi:10.1007/BF02293980.
W. D. Cook and L. M. Seiford (1978), Priority ranking and consensus formation. Management Science, 24/16, 1721–1732. doi:10.1287/mnsc.24.16.1721.
P. Diaconis (1988), Group Representations in Probability and Statistics. Institute of Mathematical Statistics: Hayward, CA.
E. J. Emond and D. W. Mason (2000), A new technique for high level decision support. Technical Report ORD Project Report PR2000/13, Operational Research Division, Department of National Defence, Canada.
K. Jabeur, J.-M. Martel and S. Ben Khélifa (2004), A distance-based collective preorder integrating the relative importance of the groups members. Group Decision and Negotiation, 13, 327–349. doi:10.1023/B:GRUP.0000042894.00775.75.
J. G. Kemeny and J. L. Snell (1962), Mathematical Models in the Social Sciences, chapter “Preference Rankings: An Axiomatic Approach”. MIT Press: Cambridge.