mdh {PPCI}R Documentation

Minimum Density Hyperplane

Description

Finds minimum density hyperplane(s) for clustering.

Usage

mdh(X, v0, minsize, bandwidth, alphamin, alphamax, verb, labels, maxit, ftol)

Arguments

X

a numeric matrix (num_data x num_dimensions); the dataset to be clustered.

v0

(optional) initial projection direction(s). a matrix with ncol(X) rows. each column of v0 is used as an initialisation for projection pursuit. if omitted then a single initialisation is used; the first principal component.

minsize

(optional) minimum cluster size. if omitted then minsize = 1.

bandwidth

(optional) positive numeric bandwidth parameter (h) for the kernel density estimator. if omitted then bandwidth = 0.9*eigen(cov(X))$values[1]^.5*nrow(X)^(-0.2).

alphamin

(optional) initial (scaled) bound on the distance of the optimal hyperplane from the mean of the data. if omitted then alphamin = 0.

alphamax

(optional) maximum/final (scaled) distance of the optimal hyperplane from the mean of the data. if omitted then alphamax = 1.

verb

(optional) verbosity level of optimisation procedure. verb==0 produces no output. verb==1 produces plots illustrating the progress of projection pursuit via plots of the projected data. verb==2 adds to these plots additional information about the progress. verb==3 creates a folder in working directory and stores all plots for verb==2. if omitted then verb==0.

labels

(optional) vector of class labels. not used in the actual clustering procedure. only used for illustrative purposes for values of verb>0.

maxit

(optional) maximum number of iterations in optimisation for each value of alpha. if omitted then maxit=15.

ftol

(optional) tolerance level for convergence of optimisation, based on relative function value improvements. if omitted then ftol = 1e-5.

Value

a named list with class ppci_hyperplane_solution with the following components

$cluster

cluster assignment vector.

$v

the optimal projection vector.

$b

the value of b making H(v, b) the minimum density hyperplane.

$fitted

data projected into two dimensional subspace defined by $v and the principal component in the null space of $v.

$data

the input data matrix.

$rel.dep

the relative depth of H(v, b).

$fval

the integrated dentsity on H(v, b).

$method

=="MDH".

$params

list of parameters used to find H(v, b).

$alternatives

an unnamed list. If more than one initilisation is considered, the alternatives to the best are stored in this field.

References

Pavlidis N.G., Hofmeyr D.P., Tasoulis S.K. (2016) Minimum Density Hyperplanes. Journal of Machine Learning Research, 17(156), 1–33.

Examples

## load breast cancer dataset
data(breastcancer)

## find minimum density hyperplane
sol <- mdh(breastcancer$x)

## visualise the solution
plot(sol)

## evaluate the quality of the partition
success_ratio(sol$cluster, breastcancer$c)

[Package PPCI version 0.1.5 Index]