mcdr {PPCI} | R Documentation |
Maximum Clusterability Dimension Reduction
Description
Finds a linear projection of a data set using projection pursuit to maximise the variance ratio clusterability measured in each dimension separately.
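The criterion can be illustrated for a single projected dimension. The sketch below assumes the usual definition of variance ratio clusterability: the largest ratio of between-cluster to within-cluster sum of squares over all two-way splits of the projected data. The helper variance_ratio_1d is hypothetical and is not part of PPCI; it is not the optimiser used by mcdr.

```r
# Hypothetical helper (not a PPCI function): variance ratio of the best
# two-way split of a univariate sample x, with at least `minsize`
# points on each side of the split.
variance_ratio_1d <- function(x, minsize = 1) {
  x <- sort(x)
  n <- length(x)
  stopifnot(n >= 2 * minsize)
  mu <- mean(x)
  total_ss <- sum((x - mu)^2)
  # candidate splits: the first left_n sorted points against the rest
  left_n <- minsize:(n - minsize)
  right_n <- n - left_n
  left_sum <- cumsum(x)[left_n]
  right_sum <- sum(x) - left_sum
  # between-cluster sum of squares for every candidate split
  between_ss <- left_n * (left_sum / left_n - mu)^2 +
    right_n * (right_sum / right_n - mu)^2
  within_ss <- total_ss - between_ss
  max(between_ss / within_ss)
}

# two well separated groups give a large ratio
x_bimodal <- c(0, 0.1, 0.2, 10, 10.1, 10.2)
# evenly spread points give a much smaller one
x_uniform <- 1:6
vr_bimodal <- variance_ratio_1d(x_bimodal)
vr_uniform <- variance_ratio_1d(x_uniform)
```

For a data matrix X and a unit vector v, the same helper could be applied to the projected data as variance_ratio_1d(X %*% v).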
Usage
mcdr(X, p, minsize, v0, verb, labels, maxit, ftol)
Arguments
X |
a numeric matrix (num_data x num_dimensions); the dataset. |
p |
an integer; the number of dimensions in the projection. |
v0 |
(optional) initial projection direction(s). a function(X) of the data, which returns a matrix with ncol(X) rows. each column of the output of v0(X) is used as an initialisation for projection pursuit. the solution with the maximum variance ratio clusterability is used within the final model. if omitted then a single initialisation is used for each column of the projection matrix: the first principal component within the null space of the other columns. |
verb |
(optional) verbosity level of the optimisation procedure. verb==0 produces no output. verb==1 produces plots of the projected data illustrating the progress of projection pursuit. verb==2 adds to these plots additional information about the progress. verb==3 creates a folder in the working directory and stores all plots produced for verb==2. if omitted then verb==0. |
labels |
(optional) vector of class labels. not used in the actual projection pursuit. only used for illustrative purposes for values of verb>0. |
maxit |
(optional) maximum number of iterations in optimisation. if omitted then maxit=15. |
ftol |
(optional) tolerance level for convergence of optimisation, based on relative function value improvements. if omitted then ftol = 1e-5. |
minsize |
(optional) the minimum number of data points on each side of a hyperplane. if omitted then minsize = 1. |
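As an illustration of the v0 argument, the following is a minimal sketch of an initialisation function with the required shape: it takes the data matrix and returns a matrix with ncol(X) rows, one candidate direction per column. The name v0_pca and the choice of the two leading principal components are assumptions made for this example, and the built-in iris data stands in for a real data set.

```r
# Hypothetical initialisation function: the two leading principal
# component directions of X, one per column.
v0_pca <- function(X) {
  prcomp(X)$rotation[, 1:2, drop = FALSE]
}

X <- as.matrix(iris[, 1:4])
V <- v0_pca(X)
dim(V)  # 4 2: ncol(X) rows, two candidate directions

# With PPCI installed, the function itself (not its output) is passed:
# sol <- mcdr(X, 2, v0 = v0_pca)
```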
Value
a named list with class ppci_projection_solution with the following components:
$projection |
the num_dimensions x p projection matrix. |
$fitted |
the num_data x p projected data set. |
$data |
the input data matrix. |
$method |
=="MCDC". |
$params |
list of parameters used to find $projection. |
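The shapes of $projection and $fitted can be illustrated in base R. This is only a sketch of the dimensions involved: a random orthonormal matrix stands in for a projection found by mcdr, and whether mcdr centres or otherwise transforms the data before projecting is not stated here.

```r
set.seed(1)
X <- as.matrix(iris[, 1:4])  # num_data x num_dimensions
p <- 2

# Stand-in for a projection matrix: p random orthonormal columns.
V <- qr.Q(qr(matrix(rnorm(ncol(X) * p), ncol(X), p)))

# Projecting the data gives one row per observation, one column per
# projection dimension.
projected <- X %*% V
dim(V)          # 4 2
dim(projected)  # 150 2
```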
References
Hofmeyr, D., Pavlidis, N. (2015) Maximum Clusterability Divisive Clustering. Computational Intelligence, 2015 IEEE Symposium Series on, pp. 780–786.
Examples
### not run: set run <- TRUE to execute (requires the PPCI package)
run <- FALSE
if (run) {
  library(PPCI)
  ## load optidigits dataset
  data(optidigits)
  ## find nine-dimensional projection (one fewer than
  ## the number of clusters, as is common in clustering)
  sol <- mcdr(optidigits$x, 9)
  ## visualise the solution via the first 3 pairs of dimensions
  plot(sol, pairs = 3, labels = optidigits$c)
  ## compare with PCA projection onto the first 3 principal components
  pca_dirs <- eigen(cov(optidigits$x))$vectors[, 1:3]
  pairs(optidigits$x %*% pca_dirs, col = optidigits$c)
}