R: Computes the Maximum Distance

maximum.dist {StatMatch}

R Documentation

Computes the Maximum Distance

Description

This function computes the maximum distance (or L^\infty norm) between units in a single dataset or between observations in two distinct datasets.

Usage

maximum.dist(data.x, data.y=data.x, rank=FALSE)

Arguments

data.x

A matrix or a data frame containing variables that should be used in the computation of the distance. Only continuous variables are allowed. Missing values (NA) are not allowed.

When only data.x is supplied, the distances between rows of data.x are computed.

data.y

A numeric matrix or data frame with the same variables, of the same type, as those in data.x (only continuous variables are allowed). Dissimilarities between rows of data.x and rows of data.y will be computed. If not provided, data.y defaults to data.x and only dissimilarities between rows of data.x are computed.

rank

Logical. When TRUE, the original values are replaced by their ranks divided by the number of values plus one (following the suggestion in Kovar et al. 1988). This rank transformation removes the effect of different scales on the distance computation. When computing ranks, tied observations receive the average of their positions (ties.method = "average" in the call to the rank function).

Details

This function computes the L^\infty distance, also known as minimax distance. In practice, the distance between two records is the maximum of the absolute differences on the available variables:

d(i,j) = \max \left( \left|x_{1i}-x_{1j} \right|, \left|x_{2i}-x_{2j} \right|, \ldots, \left|x_{Ki}-x_{Kj} \right| \right)
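
The formula can be illustrated with a short base R sketch. This is not the StatMatch implementation; the helper name max_dist_sketch and the toy matrices are hypothetical and only show how the maximum of the absolute differences is taken for each pair of rows.

```r
# Illustrative base R sketch (not the StatMatch source): maximum distance
# between every row of x and every row of y.
max_dist_sketch <- function(x, y = x) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  # Same restrictions as documented: numeric input, same variables, no NA.
  stopifnot(is.numeric(x), is.numeric(y), ncol(x) == ncol(y),
            !anyNA(x), !anyNA(y))
  d <- matrix(0, nrow = nrow(x), ncol = nrow(y))
  for (i in seq_len(nrow(x))) {
    # Subtract row i of x from each row of y, variable by variable,
    # then keep the largest absolute difference for each row of y.
    diffs <- abs(sweep(y, 2, x[i, ], "-"))
    d[i, ] <- apply(diffs, 1, max)
  }
  d
}

# Toy data: two records in x, two records in y, two variables each.
x <- rbind(c(1, 5), c(4, 1))
y <- rbind(c(2, 2), c(1, 9))
max_dist_sketch(x, y)
# d[1, 1] = max(|1 - 2|, |5 - 2|) = 3
# d[2, 2] = max(|4 - 1|, |1 - 9|) = 8
```

For a single dataset, the result of this sketch should agree with as.matrix(dist(x, method = "maximum")) from base R.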

When rank=TRUE, the original values are replaced by their ranks divided by the number of values plus one (following the suggestion in Kovar et al. 1988).
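
The rank transformation can be sketched in base R for the single-dataset case. The helper name rank_scale_sketch is hypothetical, and the sketch does not cover how ranks are computed when a distinct data.y is supplied, since that detail is not described here.

```r
# Illustrative sketch (not the StatMatch source) of the rank transformation
# for a single dataset: each variable is replaced by its ranks divided by
# the number of values plus one.
rank_scale_sketch <- function(x) {
  x <- as.matrix(x)
  n <- nrow(x)
  # Rank each column separately; tied values receive their average rank.
  apply(x, 2, rank, ties.method = "average") / (n + 1)
}

# Two variables on very different scales; column a contains a tie.
x <- cbind(a = c(10, 20, 20, 40), b = c(0.3, 0.1, 0.2, 0.4))
rank_scale_sketch(x)
# Column a: ranks 1, 2.5, 2.5, 4 divided by 5 give 0.2, 0.5, 0.5, 0.8
# Column b: ranks 3, 1, 2, 4 divided by 5 give 0.6, 0.2, 0.4, 0.8
```

After the transformation every variable lies strictly between 0 and 1, so no single variable dominates the maximum because of its measurement scale.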

Value

A matrix object with the distances between rows of data.x (one row each) and rows of data.y (one column each).

Author(s)

Marcello D'Orazio mdo.statmatch@gmail.com

References

Kovar, J.G., MacMillan, J. and Whitridge, P. (1988). “Overview and strategy for the Generalized Edit and Imputation System”. Statistics Canada, Methodology Branch Working Paper No. BSMD 88-007 E/F.

Examples


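library(StatMatch)

# distances among the first 10 iris rows: raw values, then rank-transformed
md1 <- maximum.dist(iris[1:10,1:4])
md2 <- maximum.dist(iris[1:10,1:4], rank=TRUE)

# distances between rows of two distinct datasets
md3 <- maximum.dist(data.x=iris[1:50,1:4], data.y=iris[51:100,1:4])
md4 <- maximum.dist(data.x=iris[1:50,1:4], data.y=iris[51:100,1:4], rank=TRUE)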

[Package StatMatch version 1.4.2 Index]