tSNE {ProjectionBasedClustering} | R Documentation |
T-distributed Stochastic Neighbor Embedding (t-SNE)
Description
Projects data into a low-dimensional space with t-distributed Stochastic Neighbor Embedding (t-SNE), for example res = tSNE(Data, k=30, OutputDimension=2)
Usage
tSNE(DataOrDistances,k,OutputDimension=2,Algorithm='tsne_cpp',
method="euclidean",Whitening=FALSE, Iterations=1000,PlotIt=FALSE,Cls,num_threads=1,...)
Arguments
DataOrDistances |
Numerical matrix defined as either a data matrix with one case per row, or a matrix of pairwise distances between the cases |
k |
Number of k nearest neighbors, i.e. the number of effective nearest neighbors ("perplexity"). Important parameter. If not given, the default settings of the t-SNE package selected by Algorithm will be used |
OutputDimension |
Number of dimensions in the output space, default=2 |
Algorithm |
'tsne_cpp': t-distributed Stochastic Neighbor Embedding using a Barnes-Hut implementation in C++ of Rtsne. Requires version >= 0.15 of Rtsne for multicore parallelisation. 'tsne_opt_cpp': t-distributed Stochastic Neighbor Embedding with automated optimized parameters using a Barnes-Hut implementation in C++ of [Ulyanov, 2016]. 'tsne_r': pure R implementation of the t-SNE algorithm of tsne |
method |
Distance method specified by a string: 'euclidean', 'cityblock' (same as 'manhattan'), 'cosine', 'chebychev', 'jaccard', 'minkowski', 'manhattan', 'binary' |
Whitening |
A boolean value indicating whether the data matrix should be whitened (tsne_r) or whether PCA should be applied beforehand (tsne_cpp) |
Iterations |
maximum number of iterations to perform. |
PlotIt |
Default: FALSE. If TRUE, plots the projection as a 2D visualization. If OutputDimension>2, only the first two dimensions are shown |
Cls |
[1:n,1] Optional, only relevant if PlotIt=TRUE. Numeric vector containing the given classification: every element is the cluster number of the corresponding element of the data |
num_threads |
Number of threads for parallel computation, only usable for Algorithm='tsne_cpp' or 'tsne_opt_cpp' |
... |
Further arguments passed on to either 'Rtsne' or 'tsne' |
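The two accepted forms of DataOrDistances can be illustrated with a minimal sketch. The toy matrix below is an assumption made for illustration and is not part of the package. The tSNE calls only run if ProjectionBasedClustering and Rtsne are installed, and they use only arguments listed above.

```r
set.seed(1)
# Hypothetical toy data: 60 cases with 3 variables (illustration only)
Data <- matrix(rnorm(180), nrow = 60, ncol = 3)

# Alternative input: a precomputed, symmetric matrix of pairwise distances
Distances <- as.matrix(dist(Data, method = "manhattan"))

# The calls below are skipped when the required packages are missing
if (requireNamespace("ProjectionBasedClustering", quietly = TRUE) &&
    requireNamespace("Rtsne", quietly = TRUE)) {
  # Variant 1: pass the data matrix and name the distance method
  ProjData <- ProjectionBasedClustering::tSNE(Data, k = 7, OutputDimension = 2,
                                              Algorithm = "tsne_cpp",
                                              method = "euclidean",
                                              Iterations = 500)
  # Variant 2: pass the distance matrix instead of the data
  ProjDist <- ProjectionBasedClustering::tSNE(Distances, k = 7,
                                              Algorithm = "tsne_cpp")
}
```

Passing distances is presumably most useful when the distance of interest is not one of the strings accepted by method.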
Details
A short overview of different types of projection methods can be found in [Thrun, 2018, p.42, Fig. 4.1], doi:10.1007/978-3-658-20540-9.
Value
List of
ProjectedPoints |
[1:n,OutputDimension], n by OutputDimension matrix containing coordinates of the Projection |
ModelObject |
NULL for tsne_r, further information if tsne_cpp is selected |
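The documented shape of the return value can be checked with a small helper. This is a sketch only: check_projection is a hypothetical function that is not part of the package, and the mock list merely imitates the structure described above so that the code runs without t-SNE installed.

```r
# Hypothetical helper (not part of ProjectionBasedClustering):
# TRUE if `res` has an n by OutputDimension matrix in ProjectedPoints
check_projection <- function(res, n, OutputDimension = 2) {
  is.list(res) &&
    is.matrix(res$ProjectedPoints) &&
    nrow(res$ProjectedPoints) == n &&
    ncol(res$ProjectedPoints) == OutputDimension
}

# Mock result imitating the documented structure (ModelObject is NULL for tsne_r)
mock <- list(ProjectedPoints = matrix(0, nrow = 212, ncol = 2),
             ModelObject = NULL)

check_projection(mock, n = 212)
```

With the package installed, the same helper could be applied to a real result, for example check_projection(tSNE(Data, k=7), n=nrow(Data)).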
Note
A wrapper for Rtsne (Algorithm='tsne_cpp'), Multicore-opt-tSNE (Algorithm='tsne_opt_cpp'), or for tsne (Algorithm='tsne_r').
You can use the standard ShepardScatterPlot or the better approach through the ShepardDensityPlot of the CRAN package DataVisualizations.
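What a Shepard diagram compares can be sketched in base R. This sketch does not call the DataVisualizations functions, because their arguments are not given on this page. The toy data and the use of cmdscale as a stand-in for the ProjectedPoints of tSNE are assumptions made so that the code runs without additional packages.

```r
set.seed(1)
# Hypothetical toy data: 50 cases with 3 variables (illustration only)
Data <- matrix(rnorm(150), nrow = 50, ncol = 3)

# Stand-in for tSNE(...)$ProjectedPoints: classical MDS from base R
ProjectedPoints <- cmdscale(dist(Data), k = 2)

# Pairwise distances in the input space and in the output space
InputDists  <- as.vector(dist(Data))
OutputDists <- as.vector(dist(ProjectedPoints))

# Shepard diagram: every point is one pair of cases
plot(InputDists, OutputDists, pch = 20,
     xlab = "Input distances", ylab = "Output distances",
     main = "Shepard diagram (sketch)")

# Rank correlation as a rough numeric summary of distance preservation
ShepardCor <- cor(InputDists, OutputDists, method = "spearman")
```

A scatter plot of all pairs overplots quickly as n grows, which is presumably why the density-based variant is recommended above.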
Author(s)
Michael Thrun, Luca Brinkmann
References
Anna C. Belkina, Christopher O. Ciccolella, Rina Anno, Josef Spidlen, Richard Halpert, Jennifer Snyder-Cappione: Automated optimal parameters for T-distributed stochastic neighbor embedding improve visualization and allow analysis of large datasets, bioRxiv 451690, doi: https://doi.org/10.1101/451690, 2018.
L.J.P. van der Maaten: Accelerating t-SNE using tree-based algorithms, Journal of Machine Learning Research 15.1:3221-3245, 2014.
Ulyanov, Dmitry: Multicore-TSNE, GitHub repository URL https://github.com/DmitryUlyanov/Multicore-TSNE, 2016.
Examples
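# Load the Hepta example dataset
data('Hepta')
Data=Hepta$Data
## Not run:
# Project the data with k=7 effective nearest neighbors (perplexity)
Proj=tSNE(Data,k=7)
# Plot the projected points together with the given classification
PlotProjectedPoints(Proj$ProjectedPoints,Hepta$Cls)
## End(Not run)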