phate {phateR} | R Documentation |
Run PHATE on an input data matrix
Description
PHATE is a dimensionality reduction method designed for visualizing high-dimensional data in low-dimensional spaces.
Usage
phate(
data,
ndim = 2,
knn = 5,
decay = 40,
n.landmark = 2000,
gamma = 1,
t = "auto",
mds.solver = "sgd",
knn.dist.method = "euclidean",
knn.max = NULL,
init = NULL,
mds.method = "metric",
mds.dist.method = "euclidean",
t.max = 100,
npca = 100,
plot.optimal.t = FALSE,
verbose = 1,
n.jobs = 1,
seed = NULL,
potential.method = NULL,
k = NULL,
alpha = NULL,
use.alpha = NULL,
...
)
Arguments
data |
matrix (n_samples, n_dimensions)
2-dimensional input data array with
n_samples samples and n_dimensions dimensions.
If |
ndim |
int, optional, default: 2. Number of dimensions in which the data will be embedded. |
knn |
int, optional, default: 5. Number of nearest neighbors on which to build the kernel. |
decay |
int, optional, default: 40. Sets the decay rate of the kernel tails. If NULL, the alpha-decaying kernel is not used. |
n.landmark |
int, optional, default: 2000. Number of landmarks to use in fast PHATE. |
gamma |
float, optional, default: 1
Informational distance constant between -1 and 1.
|
t |
int, optional, default: 'auto'. Power to which the diffusion operator is raised; sets the level of diffusion. If 'auto', t is selected automatically (see t.max). |
mds.solver |
'sgd', 'smacof', optional, default: 'sgd'. Which solver to use for metric MDS. SGD is substantially faster but produces slightly less optimal results. Note that SMACOF was used for all figures in the PHATE paper. |
knn.dist.method |
string, optional, default: 'euclidean'.
recommended values: 'euclidean', 'cosine', 'precomputed'
Any metric from |
knn.max |
int, optional, default: NULL
Maximum number of neighbors for which the alpha-decaying kernel
is computed for each point. For very large datasets, setting |
init |
phate object, optional. Object to use for initialization. Avoids recomputing intermediate steps if parameters are the same. |
mds.method |
string, optional, default: 'metric'. Which MDS algorithm is used for dimensionality reduction; choose from 'classic', 'metric', and 'nonmetric'. |
mds.dist.method |
string, optional, default: 'euclidean'. Recommended values: 'euclidean' and 'cosine'. |
t.max |
int, optional, default: 100. Maximum value of t to test for automatic t selection. |
npca |
int, optional, default: 100. Number of principal components to use for calculating neighborhoods. For extremely large datasets, using npca < 20 allows neighborhoods to be calculated in log(n_samples) time. |
plot.optimal.t |
boolean, optional, default: FALSE. If TRUE, produce a plot showing the Von Neumann entropy curve for automatic t selection. |
verbose |
|
n.jobs |
|
seed |
int or |
potential.method |
Deprecated.
For log potential, use |
k |
Deprecated. Use |
alpha |
Deprecated. Use |
use.alpha |
Deprecated
To disable alpha decay, use |
... |
Additional arguments for |
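To make the roles of knn and decay more concrete, the following base R sketch builds an alpha-decaying kernel of the form described in the PHATE paper. It is an illustration only, not phateR's internal code: the function name alpha_decay_kernel and the toy data are hypothetical, and Euclidean distance is assumed.

```r
# Illustrative alpha-decaying kernel (hypothetical helper, not part of phateR).
alpha_decay_kernel <- function(x, knn = 5, decay = 40) {
  d <- as.matrix(dist(x))  # pairwise Euclidean distances
  # Distance from each point to its knn-th nearest neighbor.
  # sort(row)[1] is the point itself (distance 0), hence knn + 1.
  eps <- apply(d, 1, function(row) sort(row)[knn + 1])
  # d / eps recycles eps down the columns, so row i is scaled by eps[i].
  k <- exp(-(d / eps)^decay)
  (k + t(k)) / 2           # symmetrize the kernel
}

set.seed(1)
x <- matrix(rnorm(200), nrow = 50, ncol = 4)  # toy data: 50 samples, 4 dimensions
K <- alpha_decay_kernel(x, knn = 5, decay = 40)
```

In this sketch, larger decay values make the kernel fall off more sharply beyond each point's knn-th neighbor distance, while larger knn values widen each point's neighborhood.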
Value
"phate" object containing:
- embedding: the PHATE embedding
- operator: the PHATE operator (Python phate.PHATE object)
- params: parameters passed to phate
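A minimal sketch of reading these components, assuming the returned object behaves like a named list so that `$` access works (the component names above suggest this, but it is an assumption here). The block only does any work if phateR and the Python phate module are available.

```r
emb_dim <- NULL  # stays NULL when PHATE is not installed
if (requireNamespace("phateR", quietly = TRUE) &&
    reticulate::py_module_available("phate")) {
  library(phateR)
  data(tree.data.small)
  res <- phate(tree.data.small$data)
  # Assumption: components are reachable with `$`, as in a named list.
  emb_dim <- dim(res$embedding)  # expected shape: (n_samples, ndim)
  print(emb_dim)
  print(names(res$params))       # names of the parameters recorded for this run
}
```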
Examples
if (reticulate::py_module_available("phate")) {
  # Load data
  # data(tree.data)
  # We use a smaller tree to make examples run faster
  data(tree.data.small)

  # Run PHATE
  phate.tree <- phate(tree.data.small$data)
  summary(phate.tree)
  ## PHATE embedding
  ## knn = 5, decay = 40, t = 58
  ## Data: (3000, 100)
  ## Embedding: (3000, 2)

  library(graphics)
  # Plot the result with base graphics
  plot(phate.tree, col = tree.data.small$branches)
  # Plot the result with ggplot2
  if (require(ggplot2)) {
    ggplot(phate.tree) +
      geom_point(aes(x = PHATE1, y = PHATE2, color = tree.data.small$branches))
  }

  # Run PHATE again with different parameters
  # We use the last run as initialization
  phate.tree2 <- phate(tree.data.small$data, t = 150, init = phate.tree)

  # Extract the embedding matrix to use in downstream analysis
  embedding <- as.matrix(phate.tree2)
}
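The arguments t = "auto", t.max, and plot.optimal.t all refer to automatic selection of t from a Von Neumann entropy curve. The base R sketch below shows one way such a curve can be computed for a toy diffusion operator. It is an illustration under stated assumptions, not phateR's implementation: the Gaussian kernel, the function name von_neumann_entropy, and the toy data are all hypothetical.

```r
# Illustrative Von Neumann entropy curve (hypothetical helper, not part of phateR).
von_neumann_entropy <- function(K, t.max = 100) {
  deg <- rowSums(K)
  # Symmetric conjugate of the row-normalized diffusion operator;
  # it has the same eigenvalues and is numerically safer to decompose.
  A <- K / sqrt(outer(deg, deg))
  lambda <- abs(eigen(A, symmetric = TRUE, only.values = TRUE)$values)
  sapply(seq_len(t.max), function(t) {
    eta <- lambda^t          # eigenvalues of the operator raised to power t
    eta <- eta / sum(eta)    # normalize to a probability distribution
    eta <- eta[eta > 0]      # drop underflowed terms (0 * log(0) is taken as 0)
    -sum(eta * log(eta))
  })
}

set.seed(1)
x <- matrix(rnorm(120), nrow = 40, ncol = 3)  # toy data: 40 samples, 3 dimensions
K <- exp(-as.matrix(dist(x))^2)               # assumed Gaussian kernel for the toy example
h <- von_neumann_entropy(K, t.max = 50)
```

Calling plot(h, type = "l") on the result displays the curve. The entropy decreases as t grows, and the PHATE paper describes choosing t near the knee of this curve; t.max bounds the range of t values that are tested.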