merge_clin {SomaDataIO}    R Documentation

Merge Clinical Data into SomaScan

Description

Occasionally, additional clinical data is obtained after samples have been submitted to SomaLogic, or even after 'SomaScan' results have been delivered. The new clinical (i.e. non-proteomic) variables must then be merged with the 'SomaScan' data into a "new" ADAT prior to analysis. merge_clin() merges such clinical variables into an existing soma_adat object and is a simple wrapper around dplyr::left_join().
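Because the merge is a left join, its row behavior can be sketched with plain data frames. The following is a minimal illustration that calls dplyr::left_join() directly rather than merge_clin(); the tables, the column name seq.1234.56, and all values are hypothetical stand-ins, not package data.

```r
# Toy stand-ins (hypothetical data, not from the package)
proteomic <- data.frame(
  SampleId    = c("001", "002", "003"),
  seq.1234.56 = c(1050.2, 998.7, 1210.4)
)
clinical <- data.frame(
  SampleId = c("001", "003", "004"),
  group    = c("A", "B", "A")
)

# Left join: every row of `proteomic` is kept, in its original order.
# Samples without a clinical match get NA in the new columns, and
# clinical rows without a matching sample ("004") are dropped.
joined <- dplyr::left_join(proteomic, clinical, by = "SampleId")
joined
```

The practical consequence is that the number and order of samples in the proteomic table should be unchanged by the merge, provided the clinical table has at most one row per key.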

Usage

merge_clin(x, clin_data, by = NULL, by_class = NULL, ...)

Arguments

x

A soma_adat object (with intact attributes), typically created using read_adat().

clin_data

One of 2 options:

  • a data frame containing clinical variables to merge into x, or

  • a path to a file, typically a *.csv, containing clinical variables to merge into x.

by

A character vector of variables to join by. See dplyr::left_join() for more details.

by_class

If clin_data is a file path, a named character vector of the variable and its class. This ensures the "by-key" is compatible for the join. For example, c(SampleId = "character"). See read.table() for details about its colClasses argument, and also the examples below.

...

Additional parameters passed to dplyr::left_join().
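How by and by_class interact can be sketched without the package. The snippet below is only an approximation of what presumably happens when clin_data is a file path: it reads a hypothetical CSV with base read.csv() and joins with dplyr::left_join(). The file contents, the ClinKey column, and the rfu values are invented for illustration.

```r
# Hypothetical clinical file with zero-padded IDs in a `ClinKey` column
csv <- tempfile(fileext = ".csv")
writeLines(c("ClinKey,group", "001,A", "002,B", "003,A"), csv)

# Default parsing guesses the column type, so "001" is read as integer 1
naive <- read.csv(csv)
class(naive$ClinKey)

# A named `colClasses` vector pins the key column to character
clin <- read.csv(csv, colClasses = c(ClinKey = "character"))
class(clin$ClinKey)

# Stand-in for the proteomic table (hypothetical values)
x <- data.frame(SampleId = c("001", "002", "003"),
                rfu      = c(10.5, 20.1, 30.7))

# Named `by`: the name is the key column in `x`,
# the value is the key column in the clinical data
out <- dplyr::left_join(x, clin, by = c(SampleId = "ClinKey"))
out

unlink(csv)  # clean up the temporary file
```

Without the class override, the keys would be integer on one side and character on the other, a mismatch that dplyr refuses to join; fixing the class at read time avoids that.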

Details

This functionality also exists as a command-line tool (R script), merge_clin.R, which lives in the cli/merge system file directory of the package. The directory can be located with system.file("cli/merge", package = "SomaDataIO").

Value

A soma_adat with new clinical variables merged.

Author(s)

Stu Field

See Also

dplyr::left_join()

Examples

# retrieve clinical data
clin_file <- system.file("cli/merge", "meta.csv",
                         package = "SomaDataIO",
                         mustWork = TRUE)
clin_file

# view clinical data to be merged:
# 1) `group`
# 2) `newvar`
clin_df <- read.csv(clin_file, colClasses = c(SampleId = "character"))
clin_df

# create mini-adat
apts <- withr::with_seed(123, sample(getAnalytes(example_data), 2L))
adat <- head(example_data, 9L) |>   # 9 samples; SampleId + 2 analytes
  dplyr::select(SampleId, all_of(apts))

# merge clinical variables
merged <- merge_clin(adat, clin_df, by = "SampleId")
merged

# Alternative syntax:
#   1) pass file path
#   2) merge on different variable names
#   3) convert the join key's type on-the-fly
clin_file2 <- system.file("cli/merge", "meta2.csv",
                          package = "SomaDataIO",
                          mustWork = TRUE)

id_type <- typeof(adat$SampleId)
merged2 <- merge_clin(adat, clin_file2,                # file path
                      by = c(SampleId = "ClinKey"),    # keys named differently
                      by_class = c(ClinKey = id_type)) # match types
merged2

[Package SomaDataIO version 6.1.0 Index]