pbmc_facs {fastglmpca}R Documentation

Mixture of 10 FACS-purified PBMC Single-Cell RNA-seq data

Description

These data are a selection of the reference transcriptome profiles generated via single-cell RNA sequencing (RNA-seq) of 10 bead-enriched subpopulations of PBMCs (Donor A), described in Zheng et al (2017). The data are unique molecular identifier (UMI) counts for 16,791 genes in 3,774 cells. (Genes with no expression in any of the cells were removed.) Since the majority of the UMI counts are zero, they are efficiently stored as a 16,791 x 3774 sparse matrix. These data are used in the vignette illustrating how ‘fastglmpca’ can be used to analyze single-cell RNA-seq data. Data for a separate set of 1,000 cells is provided as a “test set” to evaluate out-of-sample predictions.

Format

pbmc_facs is a list with the following elements:

counts

16,791 x 3,774 sparse matrix of UMI counts, with rows corresponding to genes and columns corresponding to cells (samples). It is an object of class "dgCMatrix").

counts_test

UMI counts for an additional test set of 100 cells.

samples

Data frame containing information about the samples, including cell barcode and source FACS population (“celltype” and “facs_subpop”).

samples_test

Sample information for the additional test set of 100 cells.

genes

Data frame containing information and the genes, including gene symbol and Ensembl identifier.

fit

GLM-PCA model that was fit to the UMI count data in the vignette.

Source

https://www.10xgenomics.com/resources/datasets

References

G. X. Y. Zheng et al (2017). Massively parallel digital transcriptional profiling of single cells. Nature Communications 8, 14049. doi:10.1038/ncomms14049

Examples

library(Matrix)
data(pbmc_facs)
cat(sprintf("Number of genes: %d\n",nrow(pbmc_facs$counts)))
cat(sprintf("Number of cells: %d\n",ncol(pbmc_facs$counts)))
cat(sprintf("Proportion of counts that are non-zero: %0.1f%%.\n",
            100*mean(pbmc_facs$counts > 0)))


[Package fastglmpca version 0.1-103 Index]