sleepwalk {sleepwalk} | R Documentation |
Interactively explore one or several 2D embeddings
Description
A function to interactively explore a 2D embedding of some higher-dimensional point cloud, as produced by a dimension reduction method such as MDS, t-SNE, or the like.
Usage
sleepwalk(
  embeddings,
  featureMatrices = NULL,
  maxdists = NULL,
  pointSize = 1.5,
  titles = NULL,
  distances = NULL,
  same = c("objects", "features"),
  compare = c("embeddings", "distances"),
  saveToFile = NULL,
  ncol = NULL,
  nrow = NULL,
  on_selection = NULL,
  mode = c("canvas", "svg"),
  metric = "euclid",
  ...
)
Arguments
embeddings |
either an |
featureMatrices |
either an |
maxdists |
a vector of the maximum distances (in feature space) for each provided feature or distance matrix that should still be covered by the colour scale; higher distances are shown in light gray. These values can be changed later interactively. If not provided, maximum distances will be estimated automatically as the median value of the distances. |
pointSize |
size of the points on the plots. |
titles |
a vector of titles for each embedding. Must be the same length as the list of
|
distances |
distances (in feature space) between points that should be displayed as colours.
This is an alternative to |
same |
defines what kind of distances to show; must be either |
compare |
defines what kind of comparison to perform; must be either |
saveToFile |
path to the .html file in which to save the plots. The resulting page will be fully interactive
and contain all the data. If this is |
ncol |
number of columns in the table in which all the embeddings are placed. |
nrow |
number of rows in the table in which all the embeddings are placed. |
on_selection |
a callback function that is called every time the user selects a group of points in
the web browser. From the |
mode |
defines whether to use Canvas or SVG to display points. Canvas is faster and can plot
more points simultaneously, yet we currently consider SVG mode to be more stable and more rigorously tested. In future
versions SVG mode will be deprecated. Must be one of |
metric |
specifies what metric to use to calculate distances from feature matrices. Currently only Euclidean
( |
... |
Further arguments passed to |
Details
The function opens a browser window and displays the embeddings as point clouds. When the user moves the mouse over a point, that point is selected and all data points change colour so that each colour indicates the feature-space distance to the point under the mouse cursor. This makes it possible to check quickly and intuitively how tight the clusters are, how faithful the embedding is, and how similar the clusters are to each other.
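The distance-based colouring described above can be illustrated with a few lines of base R. This is only a sketch under stated assumptions, not the package's internal code: the toy matrix `features`, the index `hovered`, and the linear mapping `scale_pos` are hypothetical, and the cut-off is taken as the median over all distinct pairs of points, which is one plausible reading of the default for maxdists.

```r
# Toy data (hypothetical): 5 points in a 3-dimensional feature space.
features <- rbind(
  c(0, 0, 0),
  c(1, 0, 0),
  c(0, 2, 0),
  c(3, 4, 0),
  c(6, 8, 0)
)

# Pairwise Euclidean distances in feature space.
pairwise <- dist(features)
dist_matrix <- as.matrix(pairwise)

# Assumed default cut-off: the median of all pairwise distances.
maxdist <- median(as.vector(pairwise))

# Index of the point under the mouse cursor (hypothetical).
hovered <- 1

# Feature-space distances from the hovered point to every point.
d <- dist_matrix[hovered, ]

# Points farther away than the cut-off fall outside the colour scale
# (these are the ones shown in light gray).
out_of_scale <- d > maxdist

# Assumed linear position on the colour scale, clamped to [0, 1].
scale_pos <- pmin(d / maxdist, 1)
```

Raising `maxdist` in this sketch widens the range of distances that receive a distinct colour, which mirrors the effect of adjusting the maximum distance interactively.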
Value
None.
Author(s)
Simon Anders, Svetlana Ovchinnikova
References
doi: 10.1101/603589
Examples
#generate corkscrew-shaped 3D data with 3 additional noisy dimensions
ts <- c(rnorm(100), rnorm(200, 5), rnorm(150, 13), runif(200, min = -5, max = 20))
a <- 3
w <- 1
points <- cbind(30*cos(w * ts), 30*sin(w * ts), a * ts)
ndim <- 6
noise <- cbind(matrix(rnorm(length(ts) * 3, sd = 5), ncol = 3),
               matrix(rnorm(length(ts) * (ndim - 3), sd = 10), ncol = ndim - 3))
data <- noise
data[, 1:3] <- data[, 1:3] + points
pca <- prcomp(data)
#compare Euclidean distance with the real position on the helix
sleepwalk(list(pca$x[, 1:2], pca$x[, 1:2]), list(data, as.matrix(ts)),
          compare = "distances", pointSize = 3)
#the same, but with saving the web page to an HTML file
sleepwalk(list(pca$x[, 1:2], pca$x[, 1:2]), list(data, as.matrix(ts)),
          compare = "distances", pointSize = 3,
          saveToFile = paste0(tempdir(), "/test.html"))