CytomeTree {cytometree}R Documentation

Binary tree algorithm for cytometry data analysis.

Description

Binary tree algorithm for cytometry data analysis.

Usage

CytomeTree(M, minleaf = 1, t = 0.1, verbose = TRUE, force_first_markers = NULL)

Arguments

M

A matrix of size n x p containing cytometry measures of n cells on p markers.

minleaf

An integer indicating the minimum number of cells per population. Default is 1.

t

A real positive-or-null number used for comparison with the normalized AIC computed at each node of the tree. A higher value limits the height of the tree.

verbose

A logical controlling if a text progress bar is displayed during the execution of the algorithm. By default is TRUE.

force_first_markers

a vector of index to split the data on first. This argument is used in the semi-supervised setting, forcing the algorithm to consider those markers first, in the order they appear in this force_first_markers vector, and forcing the split at every node. Default is NULL, in which case the clustering algorithm is unsupervised.

Details

The algorithm is based on the construction of a binary tree, the nodes of which are subpopulations of cells. At each node, observed cells and markers are modeled by both a family of normal distributions and a family of bi-modal normal mixture distributions. Splitting is done according to a normalized difference of AIC between the two families.

Value

An object of class 'cytomeTree' providing a partitioning of the set of n cells.

Author(s)

Chariff Alkhassim, Boris Hejblum

Examples

head(DLBCL)

# number of cell event
N <- nrow(DLBCL)

# Cell events
cellevents <- DLBCL[, c("FL1", "FL2", "FL4")]


# Manual partitioning of the set N (from FlowCAP-I)
manual_labels <- DLBCL[, "label"]


# Build the binary tree
Tree <- CytomeTree(cellevents, minleaf = 1, t=.1)


# Retreive the resulting partition of the set N
Tree_Partition <- Tree$labels


# Plot node distributions
par(mfrow=c(1, 2))
plot_nodes(Tree)

# Choose a node to plot
plot_nodes(Tree,"FL4.1")

# Plot a graph of the tree
par(mfrow=c(1,1))
plot_graph(Tree,edge.arrow.size=.3, Vcex =.5, vertex.size = 30)

# Run the annotation algorithm
Annot <- Annotation(Tree,plot=FALSE)
Annot$combinations


# Compare to the annotation gotten from the tree
Tree$annotation


# Example of sought phenotypes
# Variable in which sought phenotypes can be entered in the form of matrices.
phenotypes <- list()

# Sought phenotypes:
## FL2+ FL4-.
phenotypes[[1]] <- rbind(c("FL2", 1), c("FL4", 0))

## FL2- FL4+.
phenotypes[[2]] <- rbind(c("FL2", 0), c("FL4", 1))

## FL2+ FL4+.
phenotypes[[3]] <- rbind(c("FL2", 1), c("FL4", 1))

# Retreive cell populations found using Annotation.
PhenoInfos <- RetrievePops(Annot, phenotypes)
PhenoInfos$phenotypesinfo

# F-measure ignoring cells labeled 0 as in FlowCAP-I.

# Use FmeasureC() in any other case.
FmeasureC_no0(ref=manual_labels, pred=Tree_Partition)



if(interactive()){

# Scatterplots.
library(ggplot2)

# Ignoring cells labeled 0 as in FlowCAP-I.
rm_zeros <- which(!manual_labels)

# Building the data frame to scatter plot the data.
FL1 <- cellevents[-c(rm_zeros),"FL1"]
FL2 <- cellevents[-c(rm_zeros),"FL2"]
FL4 <- cellevents[-c(rm_zeros),"FL4"]
n <- length(FL1)
Labels <- c(manual_labels[-c(rm_zeros)]%%2+1, Tree_Partition[-c(rm_zeros)])
Labels <- as.factor(Labels)
method <- as.factor(c(rep("FlowCap-I",n),rep("CytomeTree",n)))

scatter_df <- data.frame("FL2" = FL2, "FL4" = FL4, "labels" = Labels, "method" = method)
p <- ggplot2::ggplot(scatter_df,  ggplot2::aes_string(x = "FL2", y = "FL4", colour = "labels")) +
 ggplot2::geom_point(alpha = 1,cex = 1) +
 ggplot2::scale_colour_manual(values = c("green","red","blue")) +
 ggplot2::facet_wrap(~ method) +
 ggplot2::theme_bw() +
 ggplot2::theme(legend.position="bottom")
p

}

[Package cytometree version 2.0.2 Index]