R: Example Data and Objects

SomaScanObjects {SomaDataIO}

R Documentation

Example Data and Objects

Description

The example_data object is intended to provide existing and prospective SomaLogic customers with example data to enable analysis preparation prior to receipt of SomaScan data, and also for those generally curious about the SomaScan data deliverable. It is not intended to be used as a control group for studies or provide any metrics for SomaScan data in general.

Format

example_data

a soma_adat parsed via read_adat() containing 192 samples (see below for breakdown of sample type). There are 5318 columns containing 5284 analyte features and 34 clinical meta data fields. These data have been pre-processed via the following steps:

hybridization normalized (all samples)
calibrators and buffers median normalized
plate scaled
calibrated
Adaptive Normalization by Maximum Likelihood (ANML) of QC and clinical samples

Note1: The Age and Sex (M/F) fields contain simulated values designed to contain biological signal.

**Note2:** The `SampleType` column contains sample source/type information
and usually the `SampleType == Sample` represents the "client" samples.

**Note3:** The original source file can be found at
\url{https://github.com/SomaLogic/SomaLogic-Data}.

ex_analytes

character string of the analyte features contained in the soma_adat object, derived from a call to getAnalytes().

ex_anno_tbl

a lookup table corresponding to a transposed data frame of the "Col.Meta" attribute of an ADAT, with an index key field AptName included in column 1, derived from a call to getAnalyteInfo().

ex_target_names

A lookup table mapping SeqId feature names -> target names contained in example_data. This object (or one like it) is convenient at the console via auto-complete for labeling and/or creating plot titles on the fly.

Data Description

The example_data object contains a SomaScan V4 study from healthy normal individuals. The RFU measurements themselves and other identifiers have been altered to protect personally identifiable information (PII), but also retain underlying biological signal as much as possible. There are 192 total EDTA-plasma samples across two 96-well plate runs which are broken down by the following types:

170 clinical samples (client study samples)
10 calibrators (replicate controls for combining data across runs)
6 QC samples (replicate controls used to assess run quality)
6 Buffer samples (no protein controls)

Data Processing

The standard V4 data normalization procedure for EDTA-plasma samples was applied to this dataset. For more details on the data standardization process see the Data Standardization and File Specification Technical Note. General details are outlined above.

Source

https://github.com/SomaLogic/SomaLogic-Data

SomaLogic Operating Co., Inc.

Examples

# S3 print method
example_data

# print header info
print(example_data, show_header = TRUE)

class(example_data)

# Features/Analytes
head(ex_analytes, 20L)

# Feature info table (annotations)
ex_anno_tbl

# Search via `filter()`
dplyr::filter(ex_anno_tbl, grepl("^MMP", Target))

# Lookup table -> targets
# MMP-9
ex_target_names$seq.2579.17

# gender hormone FSH
tapply(example_data$seq.3032.11, example_data$Sex, median)

# gender hormone LH
tapply(example_data$seq.2953.31, example_data$Sex, median)

# Target lookup
ex_target_names$seq.2953.31     # tab-completion at console

# Sample Type/Source
table(example_data$SampleType)

# Sex/Gender Variable
table(example_data$Sex)

# Age Variable
summary(example_data$Age)

[Package SomaDataIO version 6.1.0 Index]