match.bca.gen {matchFeat}

R Documentation

Block Coordinate Ascent Method for General (Balanced or Unbalanced) Data

Description

Solves a feature matching problem by block coordinate ascent.

Usage

match.bca.gen(x, unit = NULL, cluster = NULL, w = NULL, 
	method = c("cyclical", "random"), control = list())

Arguments

`x`	data matrix (rows=instances, columns=features)
`unit`	vector of unit labels (length = number of rows of `x`)
`cluster`	integer specifying the number of classes/clusters to assign the feature vectors to, or integer vector specifying the initial cluster assignment
`w`	feature weights in the loss function. Can be specified as a single positive number, a vector, or a positive definite matrix
`method`	sweeping method for block coordinate ascent: `cyclical` or `random` (simple random sampling without replacement)
`control`	optional list of tuning parameters
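
As a rough illustration of the three forms `w` can take, the base R sketch below builds one of each for a hypothetical data set with 3 features and checks positive definiteness through the eigenvalues. The names `p`, `w_scalar`, `w_vector`, `w_matrix`, and `is_pos_def` are illustrative only and are not part of matchFeat; the assumption that a weight vector has one positive entry per feature is also ours.

```r
p <- 3  # hypothetical number of features (columns of x)

# Form 1: a single positive number, the same weight for every feature
w_scalar <- 2

# Form 2: a vector (assumed here: one positive weight per feature)
w_vector <- c(1, 2, 0.5)

# Form 3: a symmetric positive definite p-by-p matrix
w_matrix <- matrix(c(2, 1, 0,
                     1, 2, 1,
                     0, 1, 2), nrow = p, byrow = TRUE)

# Helper (not from matchFeat): TRUE if w is symmetric with all
# eigenvalues strictly positive
is_pos_def <- function(w) {
  isSymmetric(w) &&
    all(eigen(w, symmetric = TRUE, only.values = TRUE)$values > 0)
}

is_pos_def(w_matrix)
```

Any of these objects could then be passed as the `w` argument.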

Details

If `cluster` is an integer vector, it must have the same length as `unit` and its values must range between 1 and the number of clusters.
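
The two conditions above can be checked before calling the function. The following is a minimal base R sketch; `check_cluster` and its argument `m` (the number of clusters) are hypothetical names, not part of matchFeat, and the function may perform further checks of its own.

```r
# Helper (not from matchFeat): TRUE if an initial assignment has the
# same length as unit and takes whole-number values in 1..m
check_cluster <- function(cluster, unit, m) {
  length(cluster) == length(unit) &&
    all(cluster == round(cluster)) &&
    all(cluster >= 1 & cluster <= m)
}

unit <- c(1, 1, 2, 2, 3)  # toy unit labels, 5 instances

check_cluster(c(1, 2, 2, 1, 1), unit, m = 2)  # TRUE
check_cluster(c(1, 2, 3, 1, 1), unit, m = 2)  # FALSE: label 3 exceeds m
check_cluster(c(1, 2), unit, m = 2)           # FALSE: wrong length
```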

The list `control` can contain a field `maxit`, an integer that sets the maximum number of iterations of the algorithm.
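
A possible way to cap the number of iterations is sketched below. The value 50 and the simulated data (5 units with 4 instances each, 3 features) are arbitrary choices for illustration; the call itself only runs if matchFeat is installed.

```r
# Tuning parameters: maxit is the only field described on this page
control <- list(maxit = 50L)

if (requireNamespace("matchFeat", quietly = TRUE)) {
  set.seed(1)
  x <- matrix(rnorm(5 * 4 * 3), ncol = 3)  # 20 instances, 3 features
  unit <- rep(1:5, each = 4)               # 5 units, 4 instances each
  fit <- matchFeat::match.bca.gen(x, unit, cluster = 4, control = control)
}
```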

Value

A list of class matchFeat with components

cluster: integer vector of cluster assignments (length = nrow(x))
objective: minimum objective value
mu: sample mean for each cluster/class (feature-by-cluster matrix)
V: sample covariance for each cluster/class (feature-by-feature-by-cluster 3D array)
size: integer vector of cluster sizes
call: function call
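
To make the shapes of `mu`, `V`, and `size` concrete, the base R sketch below recomputes analogous quantities from a toy data matrix and a made-up cluster vector. It does not call matchFeat, and it uses `cov()`, which divides by n - 1; the package may normalize the covariance differently.

```r
set.seed(1)
x <- matrix(rnorm(12 * 2), ncol = 2)  # 12 instances, 2 features
cluster <- rep(1:3, times = 4)        # made-up assignment to 3 clusters

m <- max(cluster)  # number of clusters
p <- ncol(x)       # number of features

# Cluster sizes (integer vector of length m)
size <- tabulate(cluster, nbins = m)

# Cluster means: feature-by-cluster matrix (p x m)
mu <- sapply(seq_len(m), function(k) {
  colMeans(x[cluster == k, , drop = FALSE])
})

# Cluster covariances: feature-by-feature-by-cluster array (p x p x m)
V <- array(NA_real_, dim = c(p, p, m))
for (k in seq_len(m)) {
  V[, , k] <- cov(x[cluster == k, , drop = FALSE])
}

dim(mu)  # 2 3
dim(V)   # 2 2 3
```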

References

Degras (2022). Scalable feature matching across large data collections. doi:10.1080/10618600.2022.2074429
Wright (2015). Coordinate descent algorithms. https://arxiv.org/abs/1502.04759

Examples

data(optdigits)
nobs <- nrow(optdigits$x) # total number of observations
n <- length(unique(optdigits$unit)) # number of statistical units
rmv <- sample.int(nobs, n-1) # remove (n-1) observations to make data unbalanced
min.m <- max(table(optdigits$unit[-rmv])) # smallest possible number of clusters
# lower values will result in an error message 
m <- min.m
result <- match.bca.gen(optdigits$x[-rmv,], optdigits$unit[-rmv], m)

[Package matchFeat version 1.0 Index]