RunPCA.PSI {MARVEL}    R Documentation

Principal component analysis for splicing data

Description

Performs principal component analysis (PCA) using percent spliced-in (PSI) values.

Usage

RunPCA.PSI(
  MarvelObject,
  sample.ids = NULL,
  cell.group.column,
  cell.group.order,
  cell.group.colors = NULL,
  features,
  min.cells = 25,
  point.size = 0.5,
  point.alpha = 0.75,
  point.stroke = 0.1,
  seed = 1,
  method.impute = "random",
  cell.group.column.impute = NULL
)

Arguments

MarvelObject

Marvel object. S3 object generated by the TransformExpValues function.

sample.ids

Character strings. Sample IDs of the specific cells to plot.

cell.group.column

Character string. The name of the sample metadata column whose values are used to label the cell groups on the PCA plot.

cell.group.order

Character strings. The order in which the values of the sample metadata column specified in cell.group.column appear in the PCA cell group legend.

cell.group.colors

Character strings. Vector of colors for the cell groups specified using cell.group.column and cell.group.order. If not specified, default ggplot2 colors are used.

features

Character strings. Vector of tran_id values for analysis. Should match the tran_id column of MarvelObject$ValidatedSpliceFeature.

min.cells

Numeric value. The minimum number of cells in which a splicing event must be expressed for the event to be included in the analysis.

point.size

Numeric value. Size of the data points in the reduced dimension space.

point.alpha

Numeric value. Transparency of the data points in the reduced dimension space. Takes any value between 0 and 1. The smaller the value, the more transparent the data points.

point.stroke

Numeric value. The thickness of the outline of the data points. The larger the value, the thicker the outline of the data points.

seed

Numeric value. Random seed that makes the imputed values for missing (NA) PSI values reproducible.

method.impute

Character string. The method for imputing missing PSI values (low coverage). The "random" method assigns each missing value a random value between 0 and 1. The "population.mean" method uses the mean PSI value of each cell population. Default option is "random", as shown in Usage.

cell.group.column.impute

Character string. Only applicable when method.impute is set to "population.mean". The name of the sample metadata column whose values define the cell populations used to impute missing values.
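
To make the roles of min.cells, seed, method.impute and cell.group.column.impute more concrete, the following base R sketch walks through the same kind of steps on a toy PSI matrix. It is an illustration only and not MARVEL's internal code: the matrix, the group labels, and the helper functions impute_random and impute_group_mean are hypothetical, and stats::prcomp stands in for whatever PCA routine the package actually calls.

```r
# Toy PSI matrix (hypothetical data): rows are splicing events, columns are cells.
# NA marks a PSI value that is missing because of low coverage.
set.seed(1)
psi <- matrix(runif(40), nrow = 4,
              dimnames = list(paste0("event", 1:4), paste0("cell", 1:10)))
psi[1, c(2, 7)] <- NA   # event1: expressed in 8 of 10 cells
psi[2, 5] <- NA         # event2: expressed in 9 of 10 cells
psi[3, 1:8] <- NA       # event3: expressed in only 2 cells

# Hypothetical cell group labels, playing the role of a sample metadata column.
groups <- rep(c("iPSC", "Endoderm"), each = 5)

# min.cells-style filter: keep events with a non-NA PSI in at least min.cells cells.
min.cells <- 5
keep <- rowSums(!is.na(psi)) >= min.cells
psi.kept <- psi[keep, , drop = FALSE]

# "random"-style imputation: replace each NA with a uniform draw between 0 and 1.
# Fixing the seed makes the imputed values reproducible.
impute_random <- function(x, seed = 1) {
  set.seed(seed)
  x[is.na(x)] <- runif(sum(is.na(x)), min = 0, max = 1)
  x
}

# "population.mean"-style imputation: replace each NA with the mean PSI of the
# same event within the same cell group. If a group has no observed value for
# an event, the mean is NaN and that entry stays missing.
impute_group_mean <- function(x, groups) {
  for (g in unique(groups)) {
    cols <- which(groups == g)
    for (i in seq_len(nrow(x))) {
      vals <- x[i, cols]
      vals[is.na(vals)] <- mean(vals, na.rm = TRUE)
      x[i, cols] <- vals
    }
  }
  x
}

psi.random <- impute_random(psi.kept, seed = 1)
psi.mean <- impute_group_mean(psi.kept, groups)

# PCA with cells as observations and events as variables (prcomp is a stand-in).
pca <- prcomp(t(psi.mean), center = TRUE, scale. = TRUE)
head(pca$x[, 1:2])
```

In this sketch, random imputation treats a missing value as uninformative, whereas group-mean imputation pulls it towards the other cells of the same group, which can make the groups look more distinct on the PCA than the observed data alone would support.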

Value

An S3 object with new slots MarvelObject$PCA$PSI$Results and MarvelObject$PCA$PSI$Plot.

Examples

marvel.demo <- readRDS(system.file("extdata/data", "marvel.demo.rds", package="MARVEL"))

# Define splicing events for analysis
df <- do.call(rbind.data.frame, marvel.demo$PSI)
tran_ids <- df$tran_id

# PCA
marvel.demo <- RunPCA.PSI(MarvelObject=marvel.demo,
                          sample.ids=marvel.demo$SplicePheno$sample.id,
                          cell.group.column="cell.type",
                          cell.group.order=c("iPSC", "Endoderm"),
                          cell.group.colors=NULL,
                          min.cells=5,
                          features=tran_ids,
                          point.size=2
                          )

# Check outputs
head(marvel.demo$PCA$PSI$Results$ind$coord)
marvel.demo$PCA$PSI$Plot

[Package MARVEL version 1.4.0 Index]