pbmc_facs {fastglmpca} | R Documentation |
Mixture of 10 FACS-purified PBMC Single-Cell RNA-seq data
Description
These data are a selection of the reference transcriptome profiles generated via single-cell RNA sequencing (RNA-seq) of 10 bead-enriched subpopulations of PBMCs (Donor A), described in Zheng et al. (2017). The data are unique molecular identifier (UMI) counts for 16,791 genes in 3,774 cells. (Genes with no expression in any of the cells were removed.) Since the majority of the UMI counts are zero, they are efficiently stored as a 16,791 x 3,774 sparse matrix. These data are used in the vignette illustrating how ‘fastglmpca’ can be used to analyze single-cell RNA-seq data. Data for a separate set of 1,000 cells are provided as a “test set” to evaluate out-of-sample predictions.
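To illustrate why sparse storage suits mostly-zero count data, here is a minimal sketch using a small simulated matrix rather than the real data. The matrix `toy`, its dimensions, and its 5% density are assumptions chosen for illustration only. They are not properties of pbmc_facs.

```r
library(Matrix)

set.seed(1)

# Hypothetical stand-in for a UMI count matrix: 200 "genes" x 50 "cells",
# with 5% of the entries non-zero. All non-zero entries are counts >= 1.
toy <- rsparsematrix(200, 50, density = 0.05,
                     rand.x = function(n) as.numeric(rpois(n, 2) + 1))

# The same values stored as an ordinary dense matrix.
dense <- as.matrix(toy)

# Compare the memory used by the two representations.
cat("Sparse storage:", format(object.size(toy), units = "Kb"), "\n")
cat("Dense storage: ", format(object.size(dense), units = "Kb"), "\n")
cat(sprintf("Non-zero entries: %d of %d\n", nnzero(toy), length(toy)))
```

The sparse representation stores only the non-zero entries and their positions, so the saving grows as the proportion of zeros increases.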
Format
pbmc_facs is a list with the following elements:
- counts
16,791 x 3,774 sparse matrix of UMI counts, with rows corresponding to genes and columns corresponding to cells (samples). It is an object of class "dgCMatrix".
- counts_test
UMI counts for an additional test set of 100 cells.
- samples
Data frame containing information about the samples, including cell barcode and source FACS population (“celltype” and “facs_subpop”).
- samples_test
Sample information for the additional test set of 100 cells.
- genes
Data frame containing information about the genes, including gene symbol and Ensembl identifier.
- fit
GLM-PCA model that was fit to the UMI count data in the vignette.
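The list elements described above can be inspected with a short sketch like the following. It assumes the fastglmpca package is installed, and that the rows of `samples` and `genes` line up with the columns and rows of `counts`. That alignment is an assumption based on the descriptions above and is checked by the code, not guaranteed by it.

```r
library(Matrix)

# Assumption: fastglmpca (which ships pbmc_facs) is installed; skip otherwise.
if (requireNamespace("fastglmpca", quietly = TRUE)) {
  data("pbmc_facs", package = "fastglmpca")

  # Names of the list elements and the class of the count matrix.
  print(names(pbmc_facs))
  print(class(pbmc_facs$counts))

  # Dimensions of the main and test count matrices.
  print(dim(pbmc_facs$counts))
  print(dim(pbmc_facs$counts_test))

  # Check the assumed alignment: one row of "samples" per column of
  # "counts", and one row of "genes" per row of "counts".
  print(nrow(pbmc_facs$samples) == ncol(pbmc_facs$counts))
  print(nrow(pbmc_facs$genes) == nrow(pbmc_facs$counts))

  # Number of cells in each FACS population.
  print(table(pbmc_facs$samples$celltype))
}
```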
Source
https://www.10xgenomics.com/resources/datasets
References
G. X. Y. Zheng et al. (2017). Massively parallel digital transcriptional profiling of single cells. Nature Communications 8, 14049. doi:10.1038/ncomms14049
Examples
library(Matrix)
data(pbmc_facs)
cat(sprintf("Number of genes: %d\n",nrow(pbmc_facs$counts)))
cat(sprintf("Number of cells: %d\n",ncol(pbmc_facs$counts)))
cat(sprintf("Proportion of counts that are non-zero: %0.1f%%.\n",
            100*mean(pbmc_facs$counts > 0)))