olink_umap_plot {OlinkAnalyze}    R Documentation

Function to make a UMAP plot from the data

Description

Computes a uniform manifold approximation and projection (UMAP) using umap::umap and plots the two specified components. Sample names must be unique. Missing values are imputed by the median for assays with missingness <10% in multi-plate projects and <5% in single-plate projects.
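The imputation rule above can be sketched in base R. This is only an illustration, not the package's internal code: the helper impute_by_median, the wide sample-by-assay matrix, and the choice to drop assays at or above the threshold are all assumptions made for the example (the text above does not say what happens to assays that exceed the threshold).

```r
# Hypothetical helper: median-impute assays whose fraction of missing values
# is below `max_missing`. Dropping the remaining assays is an assumption.
impute_by_median <- function(npx_wide, max_missing = 0.10) {
  frac_missing <- colMeans(is.na(npx_wide))          # missingness per assay
  kept <- npx_wide[, frac_missing < max_missing, drop = FALSE]
  for (j in seq_len(ncol(kept))) {
    na_rows <- is.na(kept[, j])
    kept[na_rows, j] <- median(kept[, j], na.rm = TRUE)
  }
  kept
}

# Toy data: 20 samples (rows) by 3 assays (columns)
set.seed(1)
npx_wide <- matrix(
  rnorm(60), nrow = 20,
  dimnames = list(paste0("S", 1:20), c("AssayA", "AssayB", "AssayC"))
)
npx_wide[3, "AssayA"] <- NA     # 5% missing: below the 10% threshold
npx_wide[1:5, "AssayB"] <- NA   # 25% missing: above the 10% threshold

imputed <- impute_by_median(npx_wide, max_missing = 0.10)
colnames(imputed)
```

With the 10% threshold, AssayA is kept and its single missing value is replaced by the assay median, while AssayB is excluded under the stated assumption.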

Usage

olink_umap_plot(
  df,
  color_g = "QC_Warning",
  x_val = 1,
  y_val = 2,
  config = NULL,
  label_samples = FALSE,
  drop_assays = FALSE,
  drop_samples = FALSE,
  byPanel = FALSE,
  outlierDefX = NA,
  outlierDefY = NA,
  outlierLines = FALSE,
  label_outliers = TRUE,
  quiet = FALSE,
  verbose = TRUE,
  ...
)

Arguments

df

Data frame in long format with SampleID, NPX, and a column of choice for colors

color_g

Character value indicating which column to use for colors (default QC_Warning)

x_val

Integer indicating which UMAP component to plot along the x-axis (default 1)

y_val

Integer indicating which UMAP component to plot along the y-axis (default 2)

config

object of class umap.config, specifying the parameters for the UMAP algorithm (default umap::umap.defaults)

label_samples

Logical. If TRUE, points are replaced with SampleID (default FALSE)

drop_assays

Logical. If TRUE, all assays with any missing values are dropped. Takes precedence over drop_samples (default FALSE)

drop_samples

Logical. All samples with any missing values will be dropped.

byPanel

Perform the UMAP per panel (default FALSE)

outlierDefX

The number of standard deviations along the UMAP dimension plotted on the x-axis that defines an outlier. See also 'Details'

outlierDefY

The number of standard deviations along the UMAP dimension plotted on the y-axis that defines an outlier. See also 'Details'

outlierLines

Draw dashed lines at +/-outlierDef[X,Y] standard deviations from the mean of the plotted UMAP components (default FALSE)

label_outliers

Use ggrepel to label samples lying outside the limits set by outlierDefX and outlierDefY (default TRUE)

quiet

Logical. If TRUE, the resulting plot is not printed

verbose

Logical. Whether warnings about the number of samples and/or assays dropped or imputed should be printed to the console.

...

coloroption, passed on to specify the color order.
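For the config argument described above, a custom configuration can be sketched by copying umap::umap.defaults and changing individual fields. The values chosen below (n_neighbors = 10, random_state = 42) are arbitrary illustrations, not recommendations, and the block is guarded so that it does nothing when the umap package is not installed.

```r
# Build a custom UMAP configuration, starting from the package defaults.
if (requireNamespace("umap", quietly = TRUE)) {
  custom_config <- umap::umap.defaults   # object of class umap.config
  custom_config$n_neighbors <- 10        # illustrative value only
  custom_config$random_state <- 42       # fixed seed for a reproducible layout
  print(class(custom_config))
}
```

The resulting object could then be supplied as config = custom_config in a call to olink_umap_plot.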

Details

The plot is printed (unless quiet = TRUE), and a list of ggplot objects is returned.

If byPanel = TRUE, the data processing (imputation of missing values etc) and subsequent UMAP is performed separately per panel. A faceted plot is printed, while the individual ggplot objects are returned.

The arguments outlierDefX and outlierDefY can be used to identify outliers in the UMAP results. Samples more than +/-outlierDef[X,Y] standard deviations from the mean of the plotted UMAP component will be labelled. Both arguments have to be specified. NOTE: UMAP is a non-linear data transformation that might not accurately preserve the properties of the data. Distances in the UMAP plane should therefore be interpreted with caution.
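The outlier rule described above can be sketched in base R as follows. The helper flag_outliers and the toy coordinates are hypothetical, and flagging a sample when it exceeds the limit on either axis is an assumption of this sketch rather than something the text above states.

```r
# Hypothetical helper: flag points lying more than `outlier_def_x` (or
# `outlier_def_y`) standard deviations from the mean of the x (or y) coordinate.
flag_outliers <- function(x, y, outlier_def_x, outlier_def_y) {
  out_x <- abs(x - mean(x)) > outlier_def_x * sd(x)
  out_y <- abs(y - mean(y)) > outlier_def_y * sd(y)
  as.integer(out_x | out_y)   # assumption: either axis is enough to flag
}

# Toy coordinates standing in for two UMAP components
umap_x <- c(-0.5, 0.2, 0.1, -0.3, 0.4, 0.0, -0.1, 0.3, -0.2, 9.0)
umap_y <- c(0.1, -0.2, 0.3, 0.0, -0.1, 0.2, -0.3, 0.1, 0.0, 0.2)

flags <- flag_outliers(umap_x, umap_y, outlier_def_x = 2, outlier_def_y = 4)
which(flags == 1)
```

Only the last point, which lies far from the rest along the x coordinate, is flagged in this toy data set.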

Value

A list of objects of class "ggplot"; each plot is a scatter plot of the two selected UMAP components

Examples


library(OlinkAnalyze) # Provides olink_umap_plot and the npx_data1 example data
library(dplyr)

# Make sample names unique by appending the Index column
npx_data <- npx_data1 %>%
  mutate(SampleID = paste(SampleID, "_", Index, sep = ""))

try({ # Requires the umap package
  # UMAP using all the data
  olink_umap_plot(df = npx_data, color_g = "QC_Warning")

  # UMAP per panel
  g <- olink_umap_plot(df = npx_data, color_g = "QC_Warning", byPanel = TRUE)
  g$Inflammation # Plot only the Inflammation panel

  # Label outliers
  olink_umap_plot(df = npx_data, color_g = "QC_Warning",
                  outlierDefX = 2, outlierDefY = 4) # All data
  olink_umap_plot(df = npx_data, color_g = "QC_Warning",
                  outlierDefX = 3, outlierDefY = 2, byPanel = TRUE) # Per panel

  # Retrieve the outliers from the data stored in each returned plot
  g <- olink_umap_plot(df = npx_data, color_g = "QC_Warning",
                       outlierDefX = 3, outlierDefY = 2, byPanel = TRUE)
  outliers <- lapply(g, function(x) x$data) %>%
    bind_rows() %>%
    filter(Outlier == 1)
})


[Package OlinkAnalyze version 3.8.2 Index]