R: Perform 'omics wide association study

meet_in_middle {epiomics}

R Documentation

Perform 'omics wide association study

Description

Implements a meet-in-the-middle analysis for identifying omics features associated with both an exposure and an outcome, as described by Chadeau-Hyam et al., 2010.
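The idea can be illustrated with a minimal base-R sketch on simulated data. This is not the epiomics implementation: the variable names, the simulated effect sizes, and the model directions (feature regressed on exposure, outcome regressed on feature) are assumptions made for illustration only.

```r
# Illustrative sketch of a meet-in-the-middle analysis using base R only.
# All names and simulated effects below are hypothetical.
set.seed(1)
n <- 200
exposure <- rnorm(n)
age <- rnorm(n, mean = 50, sd = 10)

# Ten simulated omics features; only feature_1 depends on the exposure
omics <- as.data.frame(replicate(10, rnorm(n)))
names(omics) <- paste0("feature_", 1:10)
omics$feature_1 <- omics$feature_1 + 0.8 * exposure

# The outcome depends on feature_1
outcome <- 0.8 * omics$feature_1 + rnorm(n)
dat <- cbind(data.frame(exposure, outcome, age), omics)
fts <- names(omics)

# Step 1: exposure-omics associations (feature ~ exposure + covariates)
p_exp <- sapply(fts, function(f) {
  fit <- lm(reformulate(c("exposure", "age"), response = f), data = dat)
  summary(fit)$coefficients["exposure", "Pr(>|t|)"]
})

# Step 2: omics-outcome associations (outcome ~ feature + covariates)
p_out <- sapply(fts, function(f) {
  fit <- lm(reformulate(c(f, "age"), response = "outcome"), data = dat)
  summary(fit)$coefficients[f, "Pr(>|t|)"]
})

# Step 3: features marginally significant in both steps
alpha <- 0.05
overlap <- fts[p_exp < alpha & p_out < alpha]
overlap
```

Features that appear in `overlap` are candidates for lying "in the middle" between exposure and outcome; the sketch uses unadjusted p-values and a fixed `alpha` purely for simplicity.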

Usage

meet_in_middle(
  df,
  exposure,
  outcome,
  omics,
  covars = NULL,
  outcome_family = "gaussian",
  confidence_level = 0.95,
  conf_int = FALSE,
  ref_group_exposure = NULL,
  ref_group_outcome = NULL
)

Arguments

`df`	Data frame containing the exposure, outcome, omics features, and any covariates
`exposure`	Name of the exposure of interest. Can be either continuous or dichotomous. Currently, only a single exposure is supported.
`outcome`	Name of the outcome of interest. Can be either continuous or dichotomous. For a dichotomous outcome, `outcome_family` must be set to "binomial", and values must be either 0/1 or a factor with the first level representing the reference group. Currently, only a single outcome is supported.
`omics`	Names of all omics features in the dataset
`covars`	Names of covariates (can be NULL)
`outcome_family`	"gaussian" for linear models (via `lm`) or "binomial" for logistic models (via `glm`). Defaults to "gaussian".
`confidence_level`	Confidence level for marginal significance (defaults to 0.95)
`conf_int`	Should confidence intervals be generated for the estimates? Default is FALSE. Setting to TRUE increases run time. For logistic models, Wald confidence intervals are calculated via `confint.default`.
`ref_group_exposure`	Reference category if the exposure is a character or factor. Otherwise, can be left as NULL.
`ref_group_outcome`	Reference category if the outcome is a character or factor. Otherwise, can be left as NULL.
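To make the `conf_int` note concrete, the following sketch shows what a Wald interval from `confint.default` looks like for a logistic model. The data are simulated and the variable names are hypothetical; it only illustrates the interval type and does not reproduce how the package calls `confint.default` internally.

```r
# Wald confidence interval for a logistic regression coefficient.
# Simulated data; names are hypothetical.
set.seed(2)
n <- 300
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(-0.5 + 0.7 * x))

fit <- glm(y ~ x, family = binomial())

# Wald interval on the log-odds scale: estimate +/- z * SE
wald_ci <- confint.default(fit, parm = "x", level = 0.95)

# The same interval computed by hand from the coefficient table
est <- coef(summary(fit))["x", "Estimate"]
se  <- coef(summary(fit))["x", "Std. Error"]
manual_ci <- est + c(-1, 1) * qnorm(0.975) * se

wald_ci
manual_ci
```

Because the interval is on the log-odds scale, `exp(wald_ci)` gives the corresponding interval for the odds ratio.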

Value

A list of three dataframes, containing:

1. Results from the Exposure-Omics Wide Association Study
2. Results from the Omics-Outcome Wide Association Study
3. Overlapping significant features from 1 and 2.

For each omics wide association, results are provided in a data frame with 6 columns:

`feature_name`	Name of the omics feature
`estimate`	The model estimate for the feature. For linear models, this is the beta; for logistic models, this is the log odds.
`se`	Standard error of the estimate
`p_value`	p-value for the estimate
`adjusted_pval`	FDR adjusted p-value
`threshold`	Marginal significance, based on unadjusted p-values
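The relationship between unadjusted p-values, FDR-adjusted p-values, and marginal significance can be sketched as follows. The p-values are made up, and the rule used for marginal significance (unadjusted p-value below `1 - confidence_level`) is an assumption about how the flag is derived, not a statement of the package's code.

```r
# Hypothetical unadjusted p-values for five features
p_value <- c(feature_1 = 0.001, feature_2 = 0.020, feature_3 = 0.040,
             feature_4 = 0.300, feature_5 = 0.800)
confidence_level <- 0.95

# Benjamini-Hochberg FDR adjustment
adjusted_pval <- p.adjust(p_value, method = "fdr")

# Marginal significance from unadjusted p-values (assumed rule)
significant <- p_value < (1 - confidence_level)

data.frame(feature_name = names(p_value), p_value, adjusted_pval, significant)
```

In this toy example, `feature_3` is marginally significant on its unadjusted p-value, but its FDR-adjusted p-value is above 0.05, which shows why the two columns can disagree.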

Examples

# Load Example Data
data("example_data")

# Get names of the first 10 omics features
colnames_omic_fts <- grep("feature_", colnames(example_data),
                          value = TRUE)[1:10]

# Meet in the middle with a dichotomous outcome
res <- meet_in_middle(df = example_data,
                      exposure = "exposure1", 
                      outcome = "disease1", 
                      omics = colnames_omic_fts,
                      covars = c("age", "sex"), 
                      outcome_family = "binomial")

# Meet in the middle with a continuous outcome 
res <- meet_in_middle(df = example_data,
                      exposure = "exposure1", 
                      outcome = "weight", 
                      omics = colnames_omic_fts,
                      covars = c("age", "sex"), 
                      outcome_family = "gaussian")

# Meet in the middle with a continuous outcome and no covariates
res <- meet_in_middle(df = example_data,
                      exposure = "exposure1", 
                      outcome = "weight", 
                      omics = colnames_omic_fts,
                      outcome_family = "gaussian")

[Package epiomics version 1.1.0 Index]