alsData {SEMgraph}    R Documentation

Amyotrophic Lateral Sclerosis (ALS) dataset

Description

Expression profiling through high-throughput sequencing (RNA-seq) of 139 ALS patients and 21 healthy controls (HCs), from Tam et al. (2019).

Usage

alsData

Format

alsData is a list of 4 objects:

  1. "graph", the ALS graph, taken as the largest connected component of the "Amyotrophic lateral sclerosis (ALS)" pathway from the KEGG database;

  2. "exprs", a matrix of 160 rows (subjects) and 318 columns (genes) extracted from the original 17695 genes. This subset includes the genes from KEGG pathways needed to run the SEMgraph examples. Raw data from the GEO dataset GSE124439 (Tam et al., 2019) were pre-processed with batch effect correction, using the sva R package (Leek et al., 2012), to remove data production center and brain area biases. Using multidimensional scaling-based clustering, an ALS-specific and an HC-specific cluster were generated. Misclassified samples were blacklisted and removed from the current dataset;

  3. "group", a binary group vector of 139 ALS subjects (1) and 21 healthy controls (0);

  4. "details", a data.frame reporting information about included and blacklisted samples.
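
The clustering step mentioned for "exprs" can be illustrated with a minimal sketch. This is not the authors' pipeline: it runs on simulated data, uses only base R (cmdscale and kmeans), and assumes Euclidean distances and two clusters. All object names (x, coords, blacklist) are hypothetical, and the batch effect correction step is not shown.

```r
# Simulated stand-in data (hypothetical): 20 cases, 10 controls, 50 genes.
set.seed(1)
n_case <- 20
n_ctrl <- 10
n_gene <- 50
group <- c(rep(1, n_case), rep(0, n_ctrl))
x <- matrix(rnorm((n_case + n_ctrl) * n_gene), ncol = n_gene)
# Shift the first 10 genes in cases so the two groups differ.
x[group == 1, 1:10] <- x[group == 1, 1:10] + 2

# Classical multidimensional scaling on between-subject distances.
coords <- cmdscale(dist(x), k = 2)

# Two-cluster k-means on the MDS coordinates.
cl <- kmeans(coords, centers = 2, nstart = 10)$cluster

# Label each cluster by the group that forms its majority.
tab <- table(cluster = cl, group = group)
majority <- as.integer(colnames(tab)[apply(tab, 1, which.max)])
predicted <- majority[cl]

# Samples whose cluster label disagrees with the recorded group
# would be candidates for blacklisting.
blacklist <- which(predicted != group)
```

In this sketch, subjects that fall in the cluster dominated by the other group are collected in blacklist, which mirrors the idea of removing misclassified samples.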

Source

https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE124439

References

Tam OH, Rozhkov NV, Shaw R, Kim D et al. (2019). Postmortem Cortex Samples Identify Distinct Molecular Subtypes of ALS: Retrotransposon Activation, Oxidative Stress, and Activated Glia. Cell Reports, 29(5):1164-1177.e5. <https://doi.org/10.1016/j.celrep.2019.09.066>

Leek JT, Johnson WE, Parker HS, Jaffe AE, Storey JD (2012). The sva package for removing batch effects and other unwanted variation in high-throughput experiments. Bioinformatics, 28(6):882-883. <https://doi.org/10.1093/bioinformatics/bts034>

Examples

alsData$graph
dim(alsData$exprs)
table(alsData$group)
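
A further check, sketched under assumptions: the helper check_als_like below is hypothetical (it is not part of SEMgraph) and verifies that an object shaped like alsData has one binary group label per row of the expression matrix. It is first tried on a toy list, then on alsData if the package is installed.

```r
# Hypothetical helper (not part of SEMgraph): checks that a list shaped
# like alsData has one 0/1 group label per row of the expression matrix,
# and returns the basic counts.
check_als_like <- function(d) {
  stopifnot(
    is.matrix(d$exprs),
    nrow(d$exprs) == length(d$group),
    all(d$group %in% c(0, 1))
  )
  c(
    subjects = nrow(d$exprs),
    genes    = ncol(d$exprs),
    cases    = sum(d$group == 1),
    controls = sum(d$group == 0)
  )
}

# Toy stand-in with 4 subjects and 3 genes.
toy <- list(exprs = matrix(0, nrow = 4, ncol = 3), group = c(1, 1, 0, 0))
toy_summary <- check_als_like(toy)

# Run on the real dataset only if SEMgraph is available.
if (requireNamespace("SEMgraph", quietly = TRUE)) {
  print(check_als_like(SEMgraph::alsData))
}
```

If the object matches the Format section, the last call should report 160 subjects, 318 genes, 139 cases and 21 controls.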


[Package SEMgraph version 1.2.2 Index]