extract_sample_qc_flags {ukbnmr}    R Documentation

Extract NMR sample QC flags from a data.frame of UK Biobank fields

Description

Given an input data.frame loaded from a dataset extracted by ukbconv, extracts the UK Biobank fields corresponding to the sample quality control flags for the NMR metabolomics biomarker data, giving them short variable names.

Usage

extract_sample_qc_flags(x)

Arguments

x

data.frame whose first column is named "eid", followed by columns for the extracted fields, e.g. "23649-0.0", "23649-1.0", ..., "23655-1.0".

Details

Data sets extracted by ukbconv have one row per UK Biobank participant, whose project-specific sample identifier is given in the first column, named "eid". The columns that follow have names of the format "<field_id>-<visit_index>.<repeat_index>", where <field_id> here corresponds to a sample QC flag and <visit_index> corresponds to the assessment time point, e.g. 0 for the baseline assessment and 1 for the first repeat visit. For the UKB NMR data, <repeat_index> is reserved for cases where biomarker measurements have more than one QC flag (see extract_biomarker_qc_flags()).
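As an illustration of the column naming scheme described above, the following minimal sketch splits such names into their three parts using base R. The column names are taken from the Arguments section; the regular expression and the variable names (cols, pattern, parsed) are assumptions made for this sketch and are not part of the package.

```r
# Example column names in the ukbconv format "<field_id>-<visit_index>.<repeat_index>"
cols <- c("eid", "23649-0.0", "23649-1.0", "23655-1.0")

# Drop the identifier column; the remaining names encode field, visit, and repeat
field_cols <- setdiff(cols, "eid")
pattern <- "^([0-9]+)-([0-9]+)\\.([0-9]+)$"

# Pull out each captured group and convert it to an integer
parsed <- data.frame(
  column       = field_cols,
  field_id     = as.integer(sub(pattern, "\\1", field_cols)),
  visit_index  = as.integer(sub(pattern, "\\2", field_cols)),
  repeat_index = as.integer(sub(pattern, "\\3", field_cols)),
  stringsAsFactors = FALSE
)
print(parsed)
```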

In the returned data.frame there is a single column for each QC flag, with an additional column for the visit index. Rows are uniquely identifiable by the combination of entries in the columns "eid" and "visit_index". There are currently no repeat-measure data for the NMR biomarker data in UKB, so no repeat_index column is returned.
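The wide-to-long layout described above can be sketched in base R on a toy table. This is not the package's implementation: the eid values and flag values are made up, the output columns use placeholder names of the form "field_<field_id>" rather than the short names the function assigns, and the sketch assumes every field is present at every visit.

```r
# Toy wide-format input; eids and flag values are invented for illustration
wide <- data.frame(
  eid = c(1001L, 1002L),
  `23649-0.0` = c("A", "B"),
  `23649-1.0` = c(NA, "C"),
  `23655-0.0` = c(NA, 1L),
  `23655-1.0` = c(NA, NA),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Split each field column name into its field ID and visit index
field_cols <- setdiff(names(wide), "eid")
visits <- as.integer(sub("^[0-9]+-([0-9]+)\\.[0-9]+$", "\\1", field_cols))
fields <- sub("^([0-9]+)-.*$", "\\1", field_cols)

# Build one block of rows per visit, then stack the blocks
long <- do.call(rbind, lapply(sort(unique(visits)), function(v) {
  keep <- field_cols[visits == v]
  out <- data.frame(eid = wide$eid, visit_index = v,
                    wide[, keep, drop = FALSE],
                    check.names = FALSE, stringsAsFactors = FALSE)
  # Placeholder names; the real function uses short descriptive names
  names(out)[-(1:2)] <- paste0("field_", fields[visits == v])
  out
}))

# Each (eid, visit_index) pair should identify exactly one row
stopifnot(anyDuplicated(long[, c("eid", "visit_index")]) == 0L)
print(long)
```

The sketch keeps every participant-by-visit row, including rows where all flags are missing; how the function itself treats such rows is not stated in this page.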

This function will also work with data extracted by the ukbtools R package.

Value

a data.frame or data.table with column names "eid" and "visit_index", followed by one column for each sample QC flag, e.g. "Shipment.Plate", ..., "Low.Protein".

Examples

ukb_data <- ukbnmr::test_data # Toy example dataset for testing package
sample_qc_flags <- extract_sample_qc_flags(ukb_data)


[Package ukbnmr version 2.2 Index]