R: Regular matching with minimum-cost spanning subgraphs

rRegMatch {AcrossTic}

R Documentation

Regular matching with minimum-cost spanning subgraphs

Description

This function matches each observation in X to r others so as to minimize the total distance across all matches. Optionally it computes the cross-count statistic – the number of matches associated with two observations from different classes.

Usage

rRegMatch(X, r, y = NULL, dister = "daisy", dist.args = list(), keep.X = nrow(X) < 100, 
    keep.D = (dister == "treeClust.dist"), relax = (N >= 100), thresh = 1e-6)

Arguments

`X`	Matrix or data frame of data, or inter-point distances represented in an object inheriting from "dist"
`r`	Integer number of matches. The matching is "regular" in that every observation is matched to exactly r others (or, if relax=TRUE, every observation is matched to others with weights in [0, 1] that add up to r).
`y`	Vector of class membership indices. This is used to compute the cross-count statistic. Optional.
`dister`	Function to compute inter-point distances. This must take a matrix of data as its first argument, which must be named `x`. Default: `daisy`. If all the columns are numeric, this produces unweighted Euclidean distance by default.
`dist.args`	List of arguments to pass to the `dister` function.
`keep.X`	If TRUE, and X was supplied, keep the X matrix in the output object. Default: TRUE if X was supplied and also `nrow(X) < 100`.
`keep.D`	If TRUE, keep the distance object in the output. Default: TRUE if the `treeClust.dist` function is being used to compute the distances (since in that case the distances are random).
`relax`	If FALSE, solve the exact problem where each observation gets exactly r non-zero pairings, each with weight 1. If TRUE, solve the relaxed problem, where each observation has at least r non-zero pairings, each with its own weight between 0 and 1, the weights adding up to r. The exact problem gets very slow with large samples.
`thresh`	Weights smaller than this are considered to be exactly zero. Default: 1e-6.
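The note on the default `dister` can be checked directly. The following is a minimal sketch, assuming the `cluster` package (which provides `daisy`) is installed; the toy data and the variable names `d_daisy`, `d_dist` and `max_diff` are illustrative only. It compares `daisy` on all-numeric data with base R's `dist`, and confirms that the result inherits from "dist", the form accepted for `X` when distances are precomputed.

```r
library(cluster)  # provides daisy(); assumed to be installed

set.seed(1)
X <- matrix(rnorm(20), 10, 2)   # all-numeric toy data, 10 observations

d_daisy <- daisy(X)             # default metric for numeric data is Euclidean
d_dist  <- dist(X)              # base R Euclidean distance, for comparison

# Both objects inherit from "dist"
inherits(d_daisy, "dist")

# Largest absolute difference between the two sets of distances
max_diff <- max(abs(as.numeric(d_daisy) - as.numeric(d_dist)))
max_diff
```

If the columns were of mixed type, `daisy` would switch to a different dissimilarity, so this equivalence should only be expected for all-numeric data.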

Details

This function solves an optimization problem to extract the set of pairings which make the total weight (distance) associated with all pairings a minimum, subject to the constraint that every observation is paired to r others (or to enough others to have a total pair-weight of r).
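To make the objective concrete, here is a brute-force illustration for the smallest interesting case: four observations and r = 1, where each observation has exactly one partner. This is only a sketch of the problem being solved, not the method the package uses; the names `candidates`, `total_dist` and `best` are hypothetical.

```r
# Four points on a line; the closest pairs are (1, 2) and (3, 4)
X <- matrix(c(0, 1, 5, 6), ncol = 1)
D <- as.matrix(dist(X))

# The three ways to pair four observations so that each has exactly one partner
candidates <- list(
  rbind(c(1, 2), c(3, 4)),
  rbind(c(1, 3), c(2, 4)),
  rbind(c(1, 4), c(2, 3))
)

# Total distance of a candidate: sum of D over its matched pairs
# (indexing D by a two-column matrix picks out the (row, column) entries)
total_dist <- sapply(candidates, function(m) sum(D[m]))

# The matching with the smallest total distance
best <- candidates[[which.min(total_dist)]]
best
```

Enumeration like this grows far too quickly to be practical beyond tiny samples, which is why the problem is posed as an optimization problem instead.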

Value

A list of class AcrossTic, with elements:

`matches`	A two-column matrix, each row giving the indices of one matched pair.
`total.dist`	Total distance across all matches – the optimal value from the optimization problem.
`status`	Status of result – if the optimum was found, a vector of length 1 with name "TM_OPTIMAL_SOLUTION_FOUND" and value 0.
`time.required`	Time taken to run the optimization, as reported by `system.time()`.
`call`	The call made to the function, from `match.call`.
`r`	The value of r, as supplied at the time of the call.
`dister`	The value of dister, as supplied at the time of the call.
`dist.args`	The value of dist.args, as supplied at the time of the call.
`X.supplied`	Logical indicating whether X was supplied.
`X`	X matrix, if it was available and asked to be kept.
`y`	y vector, as supplied.
`edge.weights`	Vector, of length `nrow(matches)`, giving the weight of each match. For the exact problem (`relax = FALSE`), each value is equal to 0 or 1. For the relaxed problem (`relax = TRUE`), each value is between 0 and 1, with values summing to `(r * nrow(X) / 2)`.
`cross.sum`	Sum of matcher.costs across all matches
`cross.count`	Number of matches between two observations of different classes, possibly weighted
`nrow.X`, `ncol.X`	dimension of X matrix
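The relationship between `matches`, `edge.weights`, `y` and the cross-count can be sketched in base R. The values below are made up for illustration (four observations, r = 1, a relaxed solution), and the weighted form of the count is an assumption about what "possibly weighted" means, not a statement of how the package computes `cross.count`.

```r
# Hypothetical output pieces, not produced by rRegMatch
matches <- rbind(c(1, 2), c(1, 3), c(2, 4), c(3, 4))  # one matched pair per row
edge.weights <- c(0.5, 0.5, 0.5, 0.5)                 # sums to r * N / 2 = 2
y <- c(1, 1, 2, 2)                                    # class labels

# A match "crosses" when its two observations carry different class labels
crosses <- y[matches[, 1]] != y[matches[, 2]]

# Plain count of crossing matches
cross_count_unweighted <- sum(crosses)

# Assumed weighted version: add up the weights of the crossing matches
cross_count_weighted <- sum(edge.weights[crosses])
```

In the exact problem every listed match has weight 1, so the two versions would coincide.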

Author(s)

David Ruth and Sam Buttrey

References

David Ruth, "A new multivariate two-sample test using regular minimum-weight spanning subgraphs," J. Stat. Distributions and Applications (2014)

Examples

set.seed (123)
X <- matrix (rnorm (100), 50, 2) # Create data...
y <- rep (c (1, 2), each=25) # ...and class membership
rRegMatch (X, r = 3, y = y)
## Not run: plot (rRegMatch (X, r = 3, y = y)) # to see picture

[Package AcrossTic version 1.0-3 Index]