SomaScanObjects {SomaDataIO} | R Documentation |
Example Data and Objects
Description
The example_data
object is intended to provide existing and prospective
SomaLogic customers with example data to enable analysis preparation prior
to receipt of SomaScan data, and also for those generally curious about the
SomaScan data deliverable. It is not intended to be used as a control
group for studies or provide any metrics for SomaScan data in general.
Format
- example_data
a
soma_adat
parsed viaread_adat()
containing 192 samples (see below for breakdown of sample type). There are 5318 columns containing 5284 analyte features and 34 clinical meta data fields. These data have been pre-processed via the following steps:hybridization normalized (all samples)
calibrators and buffers median normalized
plate scaled
calibrated
Adaptive Normalization by Maximum Likelihood (ANML) of QC and clinical samples
Note1: The
Age
andSex
(M
/F
) fields contain simulated values designed to contain biological signal.**Note2:** The `SampleType` column contains sample source/type information and usually the `SampleType == Sample` represents the "client" samples. **Note3:** The original source file can be found at \url{https://github.com/SomaLogic/SomaLogic-Data}.
- ex_analytes
character string of the analyte features contained in the
soma_adat
object, derived from a call togetAnalytes()
.- ex_anno_tbl
a lookup table corresponding to a transposed data frame of the "Col.Meta" attribute of an ADAT, with an index key field
AptName
included in column 1, derived from a call togetAnalyteInfo()
.- ex_target_names
A lookup table mapping
SeqId
feature names -> target names contained inexample_data
. This object (or one like it) is convenient at the console via auto-complete for labeling and/or creating plot titles on the fly.
Data Description
The example_data
object contains a SomaScan V4 study from healthy
normal individuals. The RFU measurements themselves and other identifiers
have been altered to protect personally identifiable information (PII),
but also retain underlying biological signal as much as possible.
There are 192 total EDTA-plasma samples across two 96-well plate runs
which are broken down by the following types:
170 clinical samples (client study samples)
10 calibrators (replicate controls for combining data across runs)
6 QC samples (replicate controls used to assess run quality)
6 Buffer samples (no protein controls)
Data Processing
The standard V4 data normalization procedure for EDTA-plasma samples was applied to this dataset. For more details on the data standardization process see the Data Standardization and File Specification Technical Note. General details are outlined above.
Source
https://github.com/SomaLogic/SomaLogic-Data
SomaLogic Operating Co., Inc.
Examples
# S3 print method
example_data
# print header info
print(example_data, show_header = TRUE)
class(example_data)
# Features/Analytes
head(ex_analytes, 20L)
# Feature info table (annotations)
ex_anno_tbl
# Search via `filter()`
dplyr::filter(ex_anno_tbl, grepl("^MMP", Target))
# Lookup table -> targets
# MMP-9
ex_target_names$seq.2579.17
# gender hormone FSH
tapply(example_data$seq.3032.11, example_data$Sex, median)
# gender hormone LH
tapply(example_data$seq.2953.31, example_data$Sex, median)
# Target lookup
ex_target_names$seq.2953.31 # tab-completion at console
# Sample Type/Source
table(example_data$SampleType)
# Sex/Gender Variable
table(example_data$Sex)
# Age Variable
summary(example_data$Age)