apclusterL {apcluster} | R Documentation |
Leveraged Affinity Propagation
Description
Runs leveraged affinity propagation clustering
Usage
## S4 method for signature 'matrix,missing'
apclusterL(s, x,
sel, p=NA, q=NA, maxits=1000, convits=100, lam=0.9,
includeSim=FALSE, nonoise=FALSE, seed=NA)
## S4 method for signature 'character,ANY'
apclusterL(s, x,
frac, sweeps, p=NA, q=NA, maxits=1000, convits=100, lam=0.9,
includeSim=TRUE, nonoise=FALSE, seed=NA, ...)
## S4 method for signature 'function,ANY'
apclusterL(s, x,
frac, sweeps, p=NA, q=NA, maxits=1000, convits=100, lam=0.9,
includeSim=TRUE, nonoise=FALSE, seed=NA, ...)
Arguments
s |
an |
x |
input data to be clustered; if |
frac |
fraction of samples that should be used for leveraged clustering. The similarity matrix will be generated for all samples against a random fraction of the samples as specified by this parameter. |
sweeps |
number of sweeps of leveraged clustering performed with changing randomly selected subset of samples. |
sel |
selected sample indices; a vector containing the sample indices of the sample subset used for leveraged AP clustering in increasing order. |
p |
input preference; can be a vector that specifies
individual preferences for each data point. If scalar,
the same value is used for all data points. If |
q |
if |
maxits |
maximal number of iterations that should be executed |
convits |
the algorithm terminates if the exemplars have not
changed for |
lam |
damping factor; should be a value in the range [0.5, 1); higher values correspond to heavy damping which may be needed if oscillations occur |
includeSim |
if |
nonoise |
|
seed |
for reproducibility, the seed of the random number
generator can be set to a fixed value before
adding noise (see above), if |
... |
all other arguments are passed to the selected
similarity function as they are; note that possible name conflicts between
arguments of |
Details
Affinity Propagation clusters data using a set of real-valued pairwise similarities as input. Each cluster is represented by a representative cluster center (the so-called exemplar). The method is iterative and searches for clusters maximizing an objective function called net similarity.
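As a rough illustration of the objective, net similarity can be computed by hand for a fixed assignment: it is the sum of each non-exemplar point's similarity to its exemplar plus the preferences of the chosen exemplars. The sketch below uses base R only; the toy data, the preference value, and the helper net_similarity are hypothetical and not part of the package.

```r
# Toy data: five points on a line, forming two obvious groups
x <- c(0.0, 0.1, 0.2, 1.0, 1.1)

# Similarity = negative squared distance (a common choice for AP)
s <- -outer(x, x, function(a, b) (a - b)^2)

# Hypothetical preference shared by all points
p <- -0.5

# Net similarity of a given assignment; idx[i] is the exemplar of point i
# (hypothetical helper, not an apcluster function)
net_similarity <- function(s, idx, p) {
  exemplars <- unique(idx)
  non_ex <- setdiff(seq_along(idx), exemplars)
  dpsim  <- sum(s[cbind(non_ex, idx[non_ex])])  # points to their exemplars
  expref <- length(exemplars) * p               # preferences of the exemplars
  dpsim + expref
}

idx_two <- c(2, 2, 2, 4, 4)   # two exemplars: points 2 and 4
idx_one <- c(3, 3, 3, 3, 3)   # one exemplar: point 3

net_similarity(s, idx_two, p)  # higher (less negative) of the two
net_similarity(s, idx_one, p)
```

With this preference, the two-exemplar assignment scores higher, which is the kind of trade-off the algorithm searches over.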
Leveraged Affinity Propagation reduces dynamic and static load for large datasets. Only a subset of the samples is considered in the clustering process, on the assumption that this subset already provides enough information about the cluster structure.
When called with input data and either the name of a package-provided
similarity function or a user-provided similarity function, the function
selects a random sample subset according to the frac parameter,
calculates a rectangular similarity matrix of all samples against this
subset, and repeats affinity propagation sweeps times. A new sample
subset is used for each repetition. The clustering result of the sweep
with the highest net similarity is returned. Any parameters specific to
the chosen method of similarity calculation can be passed to apclusterL
in addition to the parameters described above. The similarity matrix
for the best sweep is also returned in the result object when requested
by the user (argument includeSim).
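The subset selection and rectangular similarity matrix described above can be sketched in base R. This is only an illustration of the idea, not the package's implementation: the helper rect_neg_sq_dist is hypothetical, and the subset size ceiling(frac * n) and the sampling scheme are assumptions.

```r
set.seed(1)  # arbitrary seed, used only to make the sketch reproducible
x <- rbind(cbind(rnorm(30, 0.2, 0.05), rnorm(30, 0.8, 0.06)),
           cbind(rnorm(20, 0.7, 0.08), rnorm(20, 0.3, 0.05)))
frac   <- 0.2
sweeps <- 3
n <- nrow(x)

# Negative squared Euclidean distance of all samples (rows) against the
# selected subset (columns); hypothetical helper, not an apcluster function
rect_neg_sq_dist <- function(x, sel) {
  xs <- x[sel, , drop = FALSE]
  d2 <- outer(rowSums(x^2), rowSums(xs^2), "+") - 2 * x %*% t(xs)
  -pmax(d2, 0)  # clamp tiny negative values caused by rounding
}

for (i in seq_len(sweeps)) {
  # assumed subset size; a fresh random subset is drawn in every sweep
  sel <- sort(sample(n, ceiling(frac * n)))
  s   <- rect_neg_sq_dist(x, sel)
  cat("sweep", i, ": similarity matrix is", nrow(s), "x", ncol(s), "\n")
  # affinity propagation would run on s here; the sweep with the
  # highest net similarity would be kept as the final result
}
```

The matrix has one row per sample but only as many columns as there are selected samples, which is where the savings in memory and computation come from.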
When called with a rectangular similarity matrix (which represents a
column subset of the full similarity matrix) the function performs
AP clustering on this similarity matrix. The information
about the selected samples is passed to clustering with the
parameter sel. This function is only needed when the user needs full
control of distance calculation or sample subset selection.
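A minimal sketch of this matrix-based call follows, assuming the apcluster package is installed. The seed, the 20% subset size, and the preference value are arbitrary choices for illustration; negDistMat is called here with the sample subset as its second argument to obtain the rectangular matrix.

```r
library(apcluster)

set.seed(42)  # arbitrary seed for a reproducible subset
cl1 <- cbind(rnorm(150, 0.2, 0.05), rnorm(150, 0.8, 0.06))
cl2 <- cbind(rnorm(100, 0.7, 0.08), rnorm(100, 0.3, 0.05))
x <- rbind(cl1, cl2)

# choose the sample subset manually; indices must be in increasing order
sel <- sort(sample(seq_len(nrow(x)), ceiling(0.2 * nrow(x))))

# rectangular similarity matrix: all samples against the selected subset
s <- negDistMat(x, sel, r = 2)

# leveraged AP on the precomputed matrix; x is omitted in this variant
apres <- apclusterL(s = s, sel = sel, p = -0.2)
show(apres)
```

Because the subset and the similarity matrix are supplied by the caller, no frac or sweeps arguments are involved, and only a single clustering run is performed.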
Apart from minor adaptations and optimizations, the implementation
of the function apclusterL is largely analogous to Frey's and Dueck's
Matlab code
(see https://psi.toronto.edu/research/affinity-propagation-clustering-by-message-passing/).
Value
Upon successful completion, both functions return an APResult object.
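The returned object can be inspected through its slots. The sketch below assumes the apcluster package is installed. The slot names exemplars, idx, and netsim are taken from the APResult class and are not described on this page; netsimLev also appears in the Examples section. The data and seed are arbitrary.

```r
library(apcluster)

set.seed(7)  # arbitrary seed
x <- rbind(cbind(rnorm(60, 0.2, 0.05), rnorm(60, 0.8, 0.06)),
           cbind(rnorm(40, 0.7, 0.08), rnorm(40, 0.3, 0.05)))

apres <- apclusterL(negDistMat(r = 2), x, frac = 0.2, sweeps = 3, p = -0.2)

length(apres@exemplars)  # number of clusters found
apres@exemplars          # indices of the exemplar samples
apres@idx                # exemplar index assigned to each sample
apres@netsim             # net similarity of the returned sweep
apres@netsimLev          # net similarities of all sweeps
```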
Author(s)
Ulrich Bodenhofer, Andreas Kothmeier, and Johannes Palme
References
https://github.com/UBod/apcluster
Frey, B. J. and Dueck, D. (2007) Clustering by passing messages between data points. Science 315, 972-976. DOI: 10.1126/science.1136800.
Bodenhofer, U., Kothmeier, A., and Hochreiter, S. (2011) APCluster: an R package for affinity propagation clustering. Bioinformatics 27, 2463-2464. DOI: 10.1093/bioinformatics/btr406.
See Also
APResult, show-methods, plot-methods, labels-methods, preferenceRange, apcluster-methods, apclusterK
Examples
## load the package
library(apcluster)
## create two Gaussian clouds
cl1 <- cbind(rnorm(150, 0.2, 0.05), rnorm(150, 0.8, 0.06))
cl2 <- cbind(rnorm(100, 0.7, 0.08), rnorm(100, 0.3, 0.05))
x <- rbind(cl1, cl2)
## leveraged apcluster
apres <- apclusterL(negDistMat(r=2), x, frac=0.2, sweeps=3, p=-0.2)
## show details of leveraged clustering results
show(apres)
## plot leveraged clustering result
plot(apres, x)
## plot heatmap of clustering result
heatmap(apres)
## show net similarities of single sweeps
apres@netsimLev
## show samples on which best sweep was based
apres@sel