match.bca.gen {matchFeat}    R Documentation

Block Coordinate Ascent Method for General (Balanced or Unbalanced) Data

Description

Solve a feature matching problem by block coordinate ascent.

Usage

match.bca.gen(x, unit = NULL, cluster = NULL, w = NULL, 
	method = c("cyclical", "random"), control = list())

Arguments

x

data matrix (rows=instances, columns=features)

unit

vector of unit labels (length = number of rows of x)

cluster

integer specifying the number of classes/clusters to assign the feature vectors to, or an integer vector specifying the initial cluster assignment

w

feature weights in the loss function. Can be specified as a single positive number, a vector, or a positive definite matrix

method

sweeping method for block coordinate ascent: cyclical or random (simple random sampling without replacement)

control

optional list of tuning parameters
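
As a rough illustration of the w and method arguments, the base R sketch below builds the three forms of w listed above and the two visiting orders suggested by the method description. All variable names (p, n, w_scalar, and so on) are hypothetical, and reading a sweep as one pass over the units is an assumption rather than something taken from the package source.

```r
set.seed(1)
p <- 4   # hypothetical number of features (columns of x)
n <- 6   # hypothetical number of statistical units

# Three ways to specify the feature weights w
w_scalar <- 2                       # single positive number
w_vector <- c(1, 0.5, 2, 1)         # one positive weight per feature
A <- matrix(rnorm(p * p), p, p)
w_matrix <- crossprod(A) + diag(p)  # symmetric positive definite by construction

# chol() signals an error when a symmetric matrix is not positive definite,
# so it can serve as a quick check before passing a matrix as w
is_pd <- !inherits(try(chol(w_matrix), silent = TRUE), "try-error")

# Visiting orders for one sweep (assumed here: one pass over all units)
order_cyclical <- seq_len(n)    # 1, 2, ..., n
order_random   <- sample.int(n) # random permutation (sampling without replacement)
```

Either order visits every unit exactly once per sweep; only the sequence differs.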

Details

If cluster is an integer vector, it must have the same length as unit and its values must range between 1 and the number of clusters.

The list control can contain a field maxit, an integer that fixes the maximum number of algorithm iterations.
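
The constraints above can be sketched on toy data as follows. The data, the variable names (cluster0, ctrl) and the value maxit = 50 are hypothetical. Drawing labels without replacement within each unit is an assumption inferred from the Examples section, where the smallest admissible number of clusters is the size of the largest unit. The call itself is only attempted when matchFeat is installed.

```r
set.seed(1)
p <- 3                                   # hypothetical number of features
unit <- rep(1:4, times = c(3, 2, 3, 2))  # unbalanced: units hold 3, 2, 3, 2 instances
x <- matrix(rnorm(length(unit) * p), ncol = p)
m <- max(table(unit))                    # number of clusters (size of the largest unit)

# Initial assignment: same length as unit, values between 1 and m.
# Assumption: labels are drawn without replacement within each unit,
# so no two instances of a unit start in the same cluster.
cluster0 <- ave(seq_along(unit), unit,
                FUN = function(idx) sample.int(m, length(idx)))

ctrl <- list(maxit = 50L)                # cap on the number of iterations

# Only run the matching if the package is available
if (requireNamespace("matchFeat", quietly = TRUE)) {
  fit <- matchFeat::match.bca.gen(x, unit, cluster0, control = ctrl)
}
```

Passing the integer m instead of cluster0 leaves the choice of initial assignment to the function.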

Value

A list of class matchFeat with components

cluster

integer vector of cluster assignments (length = nrow(x))

objective

minimum objective value

mu

sample mean for each cluster/class (feature-by-cluster matrix)

V

sample covariance for each cluster/class (feature-by-feature-by-cluster 3D array)

size

integer vector of cluster sizes

call

function call
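
To make the shapes of these components concrete, the base R sketch below computes per-cluster sizes, means and covariances from a toy data matrix and a given cluster vector. It is not the package's internal code: the data and names are hypothetical, and the use of cov() (denominator n - 1) is an assumption about how the sample covariance is defined.

```r
set.seed(1)
p <- 4; m <- 3; n <- 5                 # features, clusters, units (hypothetical)
x <- matrix(rnorm(n * m * p), ncol = p)
cluster <- rep(seq_len(m), times = n)  # toy assignment: each unit uses labels 1..m once

# Cluster sizes: integer vector of length m
size <- tabulate(cluster, nbins = m)

# Cluster means: feature-by-cluster matrix (p x m)
mu <- sapply(seq_len(m), function(k) colMeans(x[cluster == k, , drop = FALSE]))

# Cluster covariances: feature-by-feature-by-cluster array (p x p x m)
V <- array(NA_real_, dim = c(p, p, m))
for (k in seq_len(m)) {
  V[, , k] <- cov(x[cluster == k, , drop = FALSE])
}
```

With this layout, mu[, k] and V[, , k] hold the mean vector and covariance matrix of cluster k.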

References

Degras (2022). Scalable feature matching across large data collections. doi:10.1080/10618600.2022.2074429
Wright (2015). Coordinate descent algorithms. https://arxiv.org/abs/1502.04759

See Also

match.2x, match.bca, match.bca.gen, match.gaussmix, match.kmeans, match.rec, match.template

Examples

data(optdigits)
nobs <- nrow(optdigits$x) # total number of observations
n <- length(unique(optdigits$unit)) # number of statistical units
rmv <- sample.int(nobs, n-1) # remove (n-1) observations to make data unbalanced
min.m <- max(table(optdigits$unit[-rmv])) # smallest admissible number of clusters
# (the largest number of observations in any unit); lower values will result in an error message
m <- min.m
result <- match.bca.gen(optdigits$x[-rmv,], optdigits$unit[-rmv], m)




[Package matchFeat version 1.0 Index]