olink_umap_plot {OlinkAnalyze} | R Documentation |
Function to make a UMAP plot from the data
Description
Computes a manifold approximation and projection using umap::umap and plots the two specified components. Unique sample names are required and imputation by the median is done for assays with missingness <10% for multi-plate projects and <5% for single plate projects.
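The imputation rule described above can be illustrated with a minimal base R sketch. It is not the package's internal code: the helper name impute_median, the wide matrix layout (one row per sample, one column per assay) and the choice to leave out assays at or above the threshold are all assumptions made for illustration.

```r
# Hypothetical helper, not part of OlinkAnalyze: median-impute assays whose
# fraction of missing values is below `max_missing`.
impute_median <- function(npx_wide, max_missing = 0.10) {
  # npx_wide: numeric matrix, one row per sample, one column per assay
  frac_missing <- colMeans(is.na(npx_wide))
  # Assumption of this sketch: assays at or above the threshold are left out
  kept <- npx_wide[, frac_missing < max_missing, drop = FALSE]
  # Replace each remaining NA with the median of its assay (column)
  for (j in seq_len(ncol(kept))) {
    na_rows <- is.na(kept[, j])
    kept[na_rows, j] <- median(kept[, j], na.rm = TRUE)
  }
  kept
}

# Toy data: AssayA has 1 of 12 values missing (about 8%), AssayB has 3 of 12 (25%)
npx_wide <- cbind(
  AssayA = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, NA),
  AssayB = c(NA, NA, NA, 4, 5, 6, 7, 8, 9, 10, 11, 12)
)
imputed <- impute_median(npx_wide, max_missing = 0.10)
```

With the 10% threshold used for multi-plate projects, only AssayA is retained in this toy example and its single missing value is filled with the assay median. Passing max_missing = 0.05 would mimic the single-plate threshold.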
Usage
olink_umap_plot(
df,
color_g = "QC_Warning",
x_val = 1,
y_val = 2,
config = NULL,
label_samples = FALSE,
drop_assays = FALSE,
drop_samples = FALSE,
byPanel = FALSE,
outlierDefX = NA,
outlierDefY = NA,
outlierLines = FALSE,
label_outliers = TRUE,
quiet = FALSE,
verbose = TRUE,
...
)
Arguments
df |
Data frame in long format with SampleID, NPX and a column of choice for colors |
color_g |
Character value indicating which column to use for colors (default QC_Warning) |
x_val |
Integer indicating which UMAP component to plot along the x-axis (default 1) |
y_val |
Integer indicating which UMAP component to plot along the y-axis (default 2) |
config |
object of class umap.config, specifying the parameters for the UMAP algorithm (default umap::umap.defaults) |
label_samples |
Logical. If TRUE, points are replaced with SampleID (default FALSE) |
drop_assays |
Logical. All assays with any missing values will be dropped. Takes precedence over sample drop. |
drop_samples |
Logical. All samples with any missing values will be dropped. |
byPanel |
Perform the UMAP per panel (default FALSE) |
outlierDefX |
The number of standard deviations along the UMAP dimension plotted on the x-axis that defines an outlier. See also 'Details' |
outlierDefY |
The number of standard deviations along the UMAP dimension plotted on the y-axis that defines an outlier. See also 'Details' |
outlierLines |
Draw dashed lines at +/-outlierDef[X,Y] standard deviations from the mean of the plotted UMAP components (default FALSE) |
label_outliers |
Use ggrepel to label samples lying outside the limits set by the outlierLines (default TRUE) |
quiet |
Logical. If TRUE, the resulting plot is not printed |
verbose |
Logical. Whether warnings about the number of samples and/or assays dropped or imputed should be printed to the console. |
... |
Optional coloroption argument, passed on to specify the color order. |
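As an illustration of the config argument, the following sketch starts from umap::umap.defaults and adjusts a few fields before the object is handed to olink_umap_plot. The specific values (10 neighbors, a minimum distance of 0.2, a seed of 42) are arbitrary choices for the example, not recommendations, and the block is guarded so that it only runs when the umap package is installed.

```r
# Build a custom UMAP configuration; requires the umap package
if (requireNamespace("umap", quietly = TRUE)) {
  cfg <- umap::umap.defaults   # object of class umap.config
  cfg$n_neighbors <- 10        # size of the local neighborhood (example value)
  cfg$min_dist <- 0.2          # minimum spacing of embedded points (example value)
  cfg$random_state <- 42       # fixed seed so the embedding is reproducible
  # The object can then be supplied as, for example:
  # olink_umap_plot(df = npx_data, config = cfg)
}
```

Fixing random_state is useful when the same plot has to be regenerated, since UMAP embeddings otherwise vary between runs.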
Details
The plot is printed, and a list of ggplot objects is returned.
If byPanel = TRUE, the data processing (imputation of missing values etc) and subsequent UMAP is performed separately per panel. A faceted plot is printed, while the individual ggplot objects are returned.
The arguments outlierDefX and outlierDefY can be used to identify outliers in the UMAP results. Samples more than +/-outlierDef[X,Y] standard deviations from the mean of the plotted UMAP component will be labelled. Both arguments have to be specified.
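The outlier rule can be sketched in base R as follows. This is an illustration under stated assumptions rather than the package's code: the helper flag_outliers and the toy coordinates are hypothetical, and flagging a sample when it exceeds the limit on either axis is an assumption of the sketch.

```r
# Hypothetical helper: TRUE where a value lies more than `n_sd` standard
# deviations from the mean of the vector
flag_outliers <- function(x, n_sd) {
  abs(x - mean(x)) > n_sd * sd(x)
}

# Toy coordinates standing in for the two plotted UMAP components
umap_x <- c(-0.5, 0.1, 0.3, -0.2, 0.0, 0.4, -0.1, 0.2, -0.3, 6.0)
umap_y <- c(0.2, -0.1, 0.0, 0.3, -0.2, 0.1, -0.3, 0.2, 0.0, -0.2)

# Corresponds to outlierDefX = 2 and outlierDefY = 3 in this sketch;
# a sample is flagged if it is extreme on either axis (assumption)
outlier <- as.integer(flag_outliers(umap_x, 2) | flag_outliers(umap_y, 3))
which(outlier == 1)
```

In this toy data only the tenth sample is flagged, because its x coordinate lies far from the rest while no y coordinate exceeds three standard deviations.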
NOTE: UMAP is a non-linear data transformation that might not accurately preserve the properties of the data. Distances in the UMAP plane should therefore be interpreted with caution.
Value
A list of objects of class "ggplot", each containing a scatter plot of the two selected UMAP components.
Examples
library(dplyr)
# Make sample names unique by appending the Index column
npx_data <- npx_data1 %>%
  mutate(SampleID = paste(SampleID, "_", Index, sep = ""))
try({ # Requires umap package dependency
  # UMAP using all the data
  olink_umap_plot(df = npx_data, color_g = "QC_Warning")
  # UMAP per panel
  g <- olink_umap_plot(df = npx_data, color_g = "QC_Warning", byPanel = TRUE)
  g$Inflammation # Plot only the Inflammation panel
  # Label outliers
  olink_umap_plot(df = npx_data, color_g = "QC_Warning",
                  outlierDefX = 2, outlierDefY = 4) # All data
  olink_umap_plot(df = npx_data, color_g = "QC_Warning",
                  outlierDefX = 3, outlierDefY = 2, byPanel = TRUE) # Per panel
  # Retrieve the outliers
  g <- olink_umap_plot(df = npx_data, color_g = "QC_Warning",
                       outlierDefX = 3, outlierDefY = 2, byPanel = TRUE)
  outliers <- lapply(g, function(x) {x$data}) %>%
    bind_rows() %>%
    filter(Outlier == 1)
})