meet_in_middle {epiomics}    R Documentation

Perform a meet in the middle analysis

Description

Implements a meet in the middle analysis for identifying omics associated with both exposures and outcomes, as described by Chadeau-Hyam et al., 2010.
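The idea can be sketched in base R on simulated data. This is only an illustration, not the package's implementation: the variable names (expo, y, feature_1, ...), the model direction (each feature regressed on the exposure, then the outcome regressed on each feature), and the 0.05 cutoff are all assumptions made for the example.

```r
set.seed(1)
n <- 200
expo <- rnorm(n)

# feature_1 depends on the exposure; feature_2 and feature_3 are pure noise
sim <- data.frame(
  expo = expo,
  feature_1 = 0.8 * expo + rnorm(n),
  feature_2 = rnorm(n),
  feature_3 = rnorm(n)
)
# The outcome depends on feature_1 only
sim$y <- 0.8 * sim$feature_1 + rnorm(n)
fts <- c("feature_1", "feature_2", "feature_3")

# Helper: p-value for the coefficient of `term` in a linear model
coef_p <- function(formula, term, data) {
  summary(lm(formula, data = data))$coefficients[term, "Pr(>|t|)"]
}

# Step 1: exposure-feature associations (feature ~ expo)
p_exp <- sapply(fts, function(f) {
  coef_p(reformulate("expo", response = f), "expo", sim)
})

# Step 2: feature-outcome associations (y ~ feature)
p_out <- sapply(fts, function(f) {
  coef_p(reformulate(f, response = "y"), f, sim)
})

# Step 3: features that are marginally significant in both steps
overlap <- fts[p_exp < 0.05 & p_out < 0.05]
overlap
```

Here feature_1 is simulated to lie between the exposure and the outcome, so it is expected to appear in the overlap; the noise features would appear only by chance.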

Usage

meet_in_middle(
  df,
  exposure,
  outcome,
  omics,
  covars = NULL,
  outcome_family = "gaussian",
  confidence_level = 0.95,
  conf_int = FALSE,
  ref_group_exposure = NULL,
  ref_group_outcome = NULL
)

Arguments

df

Data frame containing the exposure, outcome, omics features, and any covariates

exposure

Name of the exposure of interest. Can be either continuous or dichotomous. Currently, only a single exposure is supported.

outcome

Name of the outcome of interest. Can be either continuous or dichotomous. For dichotomous outcomes, outcome_family must be set to "binomial", and values must be either 0/1 or a factor with the first level representing the reference group. Currently, only a single outcome is supported.

omics

Names of all omics features in the dataset

covars

Names of covariates (can be NULL)

outcome_family

"gaussian" for linear models (via lm) or "binomial" for logistic models (via glm)

confidence_level

Confidence level for marginal significance (defaults to 0.95)

conf_int

Should confidence intervals be generated for the estimates? Default is FALSE. Setting to TRUE will take longer. For logistic models, Wald confidence intervals are calculated via confint.default.

ref_group_exposure

Reference category if the exposure is a character or factor. Otherwise, leave as NULL.

ref_group_outcome

Reference category if the outcome is a character or factor. Otherwise, leave as NULL.
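To make the reference-group and Wald-interval arguments concrete, the sketch below uses base R on simulated data. It shows one way to set a reference level with relevel() and to obtain a Wald confidence interval from a logistic model with confint.default(). It is an illustration of those two ideas only, not the package's internal code, and all variable names (feature, status) are hypothetical.

```r
set.seed(2)
n <- 300
feature <- rnorm(n)

# Simulate a binary outcome whose log odds increase with the feature
status <- ifelse(rbinom(n, 1, plogis(0.7 * feature)) == 1, "case", "control")

# glm() treats the first factor level as the reference ("failure"),
# so make "control" the first level
status <- relevel(factor(status), ref = "control")

fit <- glm(status ~ feature, family = binomial)

# Wald interval: estimate +/- qnorm(0.975) * standard error
ci <- confint.default(fit, parm = "feature", level = 0.95)
est <- coef(summary(fit))["feature", c("Estimate", "Std. Error")]

ci
est
```

Without the relevel() call, the levels would be ordered alphabetically and "case" would be treated as the reference group, which flips the sign of the log odds.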

Value

A list of three dataframes, containing:

  1. Results from the Exposure-Omics Wide Association Study

  2. Results from the Omics-Outcome Wide Association Study

  3. Overlapping significant features from 1 and 2.

For each omics wide association, results are provided in a data frame with 6 columns:

  feature_name: Name of the omics feature

  estimate: The model estimate for the feature. For linear models, this is the beta; for logistic models, this is the log odds.

  se: Standard error of the estimate

  p_value: p-value for the estimate

  adjusted_pval: FDR adjusted p-value

  threshold: Marginal significance, based on unadjusted p-values
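The six columns can be illustrated with a small base R sketch. It assumes that the FDR adjustment corresponds to p.adjust(method = "fdr") (Benjamini-Hochberg) and that threshold flags unadjusted p-values below 1 - confidence_level. Both are assumptions about the package rather than documented behaviour, the estimates and standard errors are made-up illustrative values, and the labels used in threshold are placeholders.

```r
confidence_level <- 0.95

# Illustrative values only; not output from meet_in_middle()
res_sketch <- data.frame(
  feature_name = paste0("feature_", 1:4),
  estimate = c(0.50, -0.10, 0.30, 0.02),
  se = c(0.10, 0.12, 0.11, 0.10)
)

# Two-sided p-value from a normal approximation to estimate / se
res_sketch$p_value <- 2 * pnorm(-abs(res_sketch$estimate / res_sketch$se))

# FDR adjustment across all features (assumed: Benjamini-Hochberg)
res_sketch$adjusted_pval <- p.adjust(res_sketch$p_value, method = "fdr")

# Marginal significance uses the unadjusted p-values
res_sketch$threshold <- ifelse(res_sketch$p_value < 1 - confidence_level,
                               "Significant", "Non-significant")

res_sketch
```

Because threshold is based on unadjusted p-values, a feature can be flagged as marginally significant even when its adjusted_pval is above the same cutoff.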

Examples

# Load Example Data
data("example_data")

# Get names of omics
colnames_omic_fts <- colnames(example_data)[grep("feature_",
                                              colnames(example_data))][1:10]

# Meet in the middle with a dichotomous outcome
res <- meet_in_middle(df = example_data,
                      exposure = "exposure1", 
                      outcome = "disease1", 
                      omics = colnames_omic_fts,
                      covars = c("age", "sex"), 
                      outcome_family = "binomial")

# Meet in the middle with a continuous outcome 
res <- meet_in_middle(df = example_data,
                      exposure = "exposure1", 
                      outcome = "weight", 
                      omics = colnames_omic_fts,
                      covars = c("age", "sex"), 
                      outcome_family = "gaussian")

# Meet in the middle with a continuous outcome and no covariates
res <- meet_in_middle(df = example_data,
                      exposure = "exposure1", 
                      outcome = "weight", 
                      omics = colnames_omic_fts,
                      outcome_family = "gaussian")


[Package epiomics version 1.1.0 Index]