R: Perform 'omics wide association study

owas {epiomics}

R Documentation

Perform 'omics wide association study

Description

Implements an 'omics-wide association study, using the 'omics data either as the dependent variable (i.e., for an exposure -> 'omics analysis) or as the independent variable (i.e., for an 'omics -> outcome analysis). Allows for either continuous or dichotomous outcomes, and provides the option to adjust for covariates.
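To make the two directions concrete, here is a minimal base R sketch of a single-feature model in each direction. It is not the package's implementation: the simulated data, the column names (`exposure`, `age`, `feature_1`, `disease`), and the choice of which coefficient to extract are all assumptions made for illustration.

```r
# Simulated data; every name here is illustrative, not taken from epiomics
set.seed(1)
n <- 100
dat <- data.frame(
  exposure  = rnorm(n),
  age       = rnorm(n, mean = 50, sd = 10),
  feature_1 = rnorm(n)
)
dat$disease <- rbinom(n, size = 1, prob = 0.4)

# Exposure -> 'omics: the feature is the dependent variable (linear model)
fit_exposure <- lm(feature_1 ~ exposure + age, data = dat)
est_exposure <- coef(summary(fit_exposure))["exposure", ]

# 'Omics -> outcome: the feature is the independent variable (logistic model)
fit_outcome <- glm(disease ~ feature_1 + age, data = dat, family = binomial())
est_outcome <- coef(summary(fit_outcome))["feature_1", ]
```

In both cases the extracted row holds the estimate, its standard error, a test statistic, and a p-value; an 'omics-wide analysis repeats such a fit once per feature.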

Usage

owas(
  df,
  var,
  omics,
  covars = NULL,
  var_exposure_or_outcome,
  family = "gaussian",
  confidence_level = 0.95,
  conf_int = FALSE,
  ref_group = NULL
)

Arguments

`df`	Dataset
`var`	Name of the variable or variables of interest. This is usually either an exposure variable or an outcome variable, and can be either continuous or dichotomous. For dichotomous variables, `family` must be set to "binomial", and values must be either 0/1 or a factor with the first level representing the reference group. Multiple variables can be supplied, but they must all be of the same `family`.
`omics`	Names of all omics features in the dataset
`covars`	Names of covariates (can be NULL)
`var_exposure_or_outcome`	Is the variable of interest an exposure (independent variable) or outcome (dependent variable)? Must be either "exposure" or "outcome"
`family`	"gaussian" (default) for linear models (via lm) or "binomial" for logistic (via glm)
`confidence_level`	Confidence level for marginal significance (defaults to 0.95, or an alpha of 0.05)
`conf_int`	Should confidence intervals be generated for the estimates? Default is FALSE. Setting to TRUE will take longer. For logistic models, calculates Wald confidence intervals via `confint.default`.
`ref_group`	Reference category if the variable of interest is a character or factor. Otherwise, can be left as NULL (the default).
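The reference-group and confidence-interval arguments can be illustrated with plain base R. The sketch below assumes a hypothetical two-level variable `group` with "control" as the intended reference, and it calls `confint.default` directly at a 0.95 level; how `owas` handles these steps internally is not shown on this page, so treat this as an illustration only.

```r
# Simulated data; "group", "control", and "case" are illustrative names
set.seed(2)
n <- 200
dat <- data.frame(
  feature_1 = rnorm(n),
  group     = sample(c("control", "case"), n, replace = TRUE)
)

# By default factor levels are alphabetical, so "case" would come first.
# relevel() makes "control" the first level, i.e. the reference group.
dat$group <- relevel(factor(dat$group), ref = "control")

# With a factor response, glm() treats the first level as the reference
fit <- glm(group ~ feature_1, data = dat, family = binomial())

# Wald confidence interval for the feature coefficient
ci <- confint.default(fit, parm = "feature_1", level = 0.95)
```

Wald intervals are computed from the estimate and its standard error alone, which is why they are cheaper than profile-likelihood intervals when many features are tested.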

Value

A data frame with the following columns:

`feature_name`	Name of the omics feature
`estimate`	The model estimate for the feature. For linear models, this is the beta; for logistic models, this is the log odds ratio.
`se`	Standard error of the estimate
`test_statistic`	t-value
`p_value`	p-value for the estimate
`adjusted_pval`	FDR adjusted p-value
`threshold`	Marginal significance, based on unadjusted p-values
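As a rough sketch of how a result table with these columns could be assembled by hand, the base R code below fits one linear model per feature, then adds an FDR-adjusted p-value and a marginal-significance flag. Several details are assumptions: the simulated data, the use of `p.adjust(method = "fdr")` (Benjamini-Hochberg) for the adjustment, and the logical form of `threshold`. The package's own output may differ.

```r
# Simulated data with five illustrative features
set.seed(3)
n <- 100
dat <- data.frame(exposure = rnorm(n))
features <- paste0("feature_", 1:5)
for (ft in features) dat[[ft]] <- rnorm(n)

confidence_level <- 0.95

# Fit feature ~ exposure for each feature and keep the exposure row
rows <- lapply(features, function(ft) {
  fit <- lm(reformulate("exposure", response = ft), data = dat)
  est <- coef(summary(fit))["exposure", ]
  data.frame(
    feature_name   = ft,
    estimate       = est[["Estimate"]],
    se             = est[["Std. Error"]],
    test_statistic = est[["t value"]],
    p_value        = est[["Pr(>|t|)"]]
  )
})
res <- do.call(rbind, rows)

# FDR adjustment across all tested features (assumed: Benjamini-Hochberg)
res$adjusted_pval <- p.adjust(res$p_value, method = "fdr")

# Marginal significance uses the unadjusted p-value (assumed: logical flag)
res$threshold <- res$p_value < (1 - confidence_level)
```

Because the adjustment depends on how many features are tested, `adjusted_pval` changes with the set passed in `omics`, while `p_value` and `threshold` do not.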

Examples

# Load Example Data
data("example_data")

# Get names of omics
colnames_omic_fts <- colnames(example_data)[grep("feature_",
                                              colnames(example_data))][1:10]

# Get names of exposures
expnms <- c("exposure1", "exposure2", "exposure3")

# Run function with one continuous exposure as the variable of interest
owas(df = example_data, 
     var = "exposure1", 
     omics = colnames_omic_fts, 
     covars = c("age", "sex"), 
     var_exposure_or_outcome = "exposure", 
     family = "gaussian")
     
# Run function with multiple continuous exposures as the variable of interest
owas(df = example_data, 
     var = expnms, 
     omics = colnames_omic_fts, 
     covars = c("age", "sex"), 
     var_exposure_or_outcome = "exposure", 
     family = "gaussian")

# Run function with dichotomous outcome as the variable of interest
owas(df = example_data, 
     var = "disease1", 
     omics = colnames_omic_fts, 
     covars = c("age", "sex"), 
     var_exposure_or_outcome = "outcome", 
     family = "binomial")

[Package epiomics version 1.1.0 Index]