IdentifyAssociatedPCs {MRPC} | R Documentation |
Identifyprincipal components (PCs) that are significantly associated with eQTLs and genes
Description
This function identifies PCs that are significantly associated the eQTLs or genes, and merge the associated PCs with the data on the eQTL and genes. PCs may be derived from Principal Component Analysis (PCA) of the entire gene expression matrix, and may be viewed as potential confounders in the sequent causal network analysis on the eQTLs and genes. See details in Badsha and Fu (2019) and Badsha et al. (2021).
Usage
IdentifyAssociatedPCs(PCs.matrix,no.PCs,data,fdr.level,corr.threshold,corr.value)
Arguments
PCs.matrix |
A matrix of PCs. |
no.PCs |
Number of top PCs to test for association. The default is 10. |
data |
Data of the eQTLs and genes, containing the genotypes of the eQTLs and the expression of the genes. |
fdr.level |
(optional) The false discover rate (FDR) for association tests. Must be in (0,1]. The default is "0.05". |
corr.threshold |
(optional). The default is "FALSE". If "TRUE" then a constraint on the correlation between a PC and an eQTL or a gene is applied in addition to the FDR control. |
corr.value |
The threshold for the Pearson correlation between a PC and an eQTL or a gene when |
Value
A list of object that containing the following:
-
AssociatedPCs
: All the PCs that are significantly associated with the eQTLs and genes. -
data.withPC
: The data matrix that contains eQTLs, gene expression, and associated PCs. -
corr.PCs
: The matrix of correlations between PCs and eQTLs/genes. -
PCs.asso.list
: List of all associated PCs for each of the eQTLs and genes. -
qobj
: The output from applying the qvalue function.
Author(s)
Md Bahadur Badsha (mbbadshar@gmail.com)
References
1. Badsha MB and Fu AQ (2019). Learning causal biological networks with the principle of Mendelian randomization. Frontiers in Genetics, 10:460.
2. Badsha MB, Martin EA and Fu AQ (2021). MRPC: An R package for inference of causal graphs. Frontiers in Genetics, 10:651812.
See Also
Examples
## Not run:
# Load genomewide gene expression data in GEUVADIS
# 373 individuals
# 23722 genes
data_githubURL <- "https://github.com/audreyqyfu/mrpc_data/raw/master/data_GEUVADIS_allgenes.RData"
load(url(data_githubURL))
PCs <- prcomp(data_GEUVADIS_allgenes,scale=TRUE)
# Extract the PCs matrix
PCs.matrix <- PCs$x
# The eQTL-gene set contains eQTL rs7124238 and genes SBF1-AS1 and SWAP70
data_GEU_Q50 <- data_GEUVADIS$Data_Q50$Data_EUR
colnames(data_GEU_Q50) <- c("rs7124238","SBF2-AS1","SWAP70")
data <- data_GEU_Q50
# Identify associated PCs for this eQTL-gene set
Output <- IdentifyAssociatedPCs(PCs.matrix,no.PCs=10,data,fdr.level=0.05,corr.threshold=TRUE
,corr.value = 0.3)
# Gene SBF2-AS1 is significantly associated with PC2
# Data with PC2 as a potential confounder
data_withPC <- Output$data.withPC
n <- nrow(data_withPC) # Number of rows
V <- colnames(data_withPC) # Column names
# Calculate Pearson correlation for MRPC analysis
suffStat <- list(C = cor(data_withPC,use = "complete.obs"),
n = n)
# Infer the graph by MRPC
MRPC.fit_FDR<- MRPC(data_withPC,
suffStat,
GV = 1,
FDR = 0.05,
indepTest = 'gaussCItest',
labels = V,
FDRcontrol = 'LOND',
verbose = TRUE)
plot(MRPC.fit_FDR, main="MRPC with PCs (potential confounders)")
## End(Not run)