matching {kmed} | R Documentation |
A pairwise distance for binary/categorical variables
Description
This function computes the simple matching distance between the rows of two data frames/matrices.
Usage
matching(x, y)
Arguments
x |
A first data frame or matrix (see Details). |
y |
A second data frame or matrix (see Details). |
Details
The x and y arguments have to be data frames/matrices with the same number of columns, where each row is an object and each column is a variable. This function calculates all pairwise distances between the rows of x and the rows of y. If the x data frame/matrix is equal to the y data frame/matrix, the result contains all pairwise distances among the objects in x.
The simple matching distance between objects i and j is calculated by
d_{ij} = \frac{\sum_{s=1}^{P}(x_{is}-x_{js})}{P}
where P is the number of variables and (x_{is}-x_{js}) \in \{0, 1\} denotes a mismatch indicator rather than an arithmetic difference: x_{is}-x_{js} = 0 if x_{is} = x_{js}, and x_{is}-x_{js} = 1 if x_{is} \neq x_{js}.
As an example, consider two objects measured on three variables x, y, and z.
object | x | y | z |
1 | 1 | 2 | 2 |
2 | 1 | 2 | 1 |
The distance between objects 1 and 2 is
d_{12} = \frac{\sum_{s=1}^{3}(x_{1s}-x_{2s})}{3} = \frac{0 + 0 + 1}{3} = \frac{1}{3} \approx 0.33
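The worked example can be checked with a few lines of base R. This is only a minimal sketch: smd_pair is a hypothetical helper introduced here for illustration and is not part of kmed. It relies on the fact that the mean of a logical mismatch vector is the proportion of variables on which the two objects differ.

```r
# Hypothetical helper (not part of kmed): simple matching distance between
# two objects, i.e. the proportion of variables on which they differ.
smd_pair <- function(u, v) {
  stopifnot(length(u) == length(v))
  mean(u != v)
}

# Objects 1 and 2 from the table above
obj1 <- c(x = 1, y = 2, z = 2)
obj2 <- c(x = 1, y = 2, z = 1)

d12 <- smd_pair(obj1, obj2)
d12  # 1/3, since only z differs
```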
Value
The function returns a distance matrix whose number of rows equals the number of objects in the x data frame/matrix (n_x) and whose number of columns equals the number of objects in the y data frame/matrix (n_y).
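The shape of the returned matrix can be illustrated with a base R sketch. The function smd_matrix below is a hypothetical re-implementation written for this illustration under the definition given in Details; it is not the package's own code and may differ from it in row/column naming and in how inputs are handled.

```r
# Hypothetical illustration (not kmed's implementation): build the
# n_x by n_y matrix of simple matching distances with explicit loops.
smd_matrix <- function(x, y) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  stopifnot(ncol(x) == ncol(y))
  out <- matrix(0, nrow(x), nrow(y))
  for (i in seq_len(nrow(x))) {
    for (j in seq_len(nrow(y))) {
      # proportion of variables on which row i of x and row j of y differ
      out[i, j] <- mean(x[i, ] != y[j, ])
    }
  }
  out
}

# x holds 2 objects and y holds 3 objects, each on 3 variables
x <- matrix(c(1, 2, 2,
              1, 2, 1), nrow = 2, byrow = TRUE)
y <- matrix(c(1, 2, 2,
              2, 1, 1,
              1, 2, 1), nrow = 3, byrow = TRUE)

d <- smd_matrix(x, y)
dim(d)  # 2 3, i.e. n_x rows and n_y columns
```

Here the entry in row i and column j is the distance between object i of x and object j of y, so identical rows give 0 and rows that differ on every variable give 1.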
Author(s)
Weksi Budiaji
Contact: budiaji@untirta.ac.id
Examples
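set.seed(1)
# 7 objects measured on 3 binary variables coded 1/2
a <- matrix(sample(1:2, 7*3, replace = TRUE), 7, 3)
# 7 x 7 simple matching distance matrix among the rows of a
matching(a, a)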