predict_lfmm {lfmm} | R Documentation |
Predict polygenic scores from latent factor models
Description
This function computes polygenic risk scores from the estimates of latent factor models. It uses the 'indirect' effect sizes from the regression of X (a single phenotype) on the matrix Y to predict phenotypic values for new genotype data.
Usage
predict_lfmm(Y, X, lfmm.object, fdr.level = 0.1, newdata = NULL)
Arguments
Y |
a response variable matrix with n rows and p columns, typically containing genotypes. Each column is a response variable (numeric). |
X |
an explanatory variable with n rows and d = 1 column (numeric) representing a phenotype with zero mean across the sample. |
lfmm.object |
an object of class lfmm, as returned by a fitting function such as lfmm_ridge, computed for X and Y. |
fdr.level |
a numeric value for the FDR level in the lfmm test used to define candidate variables for predicting new phenotypes. |
newdata |
a matrix with n rows and p' columns, similar to Y, on which predictions of X are based. If NULL, Y is used as new data. |
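To illustrate how an FDR level can turn a vector of p-values into a set of candidate variables, here is a minimal base R sketch. It assumes the Benjamini-Hochberg procedure via p.adjust; the p-values are made up for illustration, and the sketch is not necessarily the exact selection rule applied inside predict_lfmm.

```r
## Hypothetical p-values for 1000 variables:
## 990 unremarkable values followed by 10 very small ones
pvalues <- c(seq(0.01, 1, length.out = 990), rep(1e-8, 10))

## FDR level, playing the role of the fdr.level argument
fdr.level <- 0.1

## Benjamini-Hochberg adjusted p-values (base R, stats package)
qvalues <- p.adjust(pvalues, method = "BH")

## Candidate variables: indices whose adjusted p-value is below the FDR level
candidates <- which(qvalues < fdr.level)
candidates
```

A larger fdr.level admits more candidates at the cost of a higher expected proportion of false discoveries among them.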
Details
The response variable matrix Y and the explanatory variable X are centered.
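Centering of this kind can be reproduced in base R with scale(scale = FALSE). The toy matrices below are hypothetical and only show what column centering does; whether predict_lfmm uses scale internally is an assumption.

```r
## Toy genotype matrix (4 individuals, 3 loci), filled column by column
Y <- matrix(c(0, 1, 2, 1,
              2, 2, 0, 0,
              1, 0, 1, 2), nrow = 4)

## Toy phenotype as a one-column matrix
X <- matrix(c(10, 12, 9, 13))

## Subtract the column means without rescaling the variances
Y.centered <- scale(Y, scale = FALSE)
X.centered <- scale(X, scale = FALSE)

## Every column now has mean zero (up to floating point error)
colMeans(Y.centered)
mean(X.centered)
```

Because X is centered, observed phenotypes are best compared with predictions after subtracting their mean, as the plot in the Examples section does.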
Value
a list with the following components:
prediction: a vector of length n containing the predicted values for X. If newdata = NULL, the fitted values are returned.
candidates: a vector of candidate columns of Y on which the predictions are built.
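As a rough picture of how the two components relate, a polygenic score can be written as a linear combination of the candidate columns weighted by their effect sizes. The sketch below uses hypothetical candidate indices and effect sizes (the names Y.new, effects and prs are illustrative only), and it should not be read as the package's internal computation.

```r
set.seed(42)
n <- 20   # individuals
p <- 50   # genotyped loci

## Hypothetical new genotype matrix with entries in {0, 1, 2}
Y.new <- matrix(rbinom(n * p, size = 2, prob = 0.3), nrow = n)

## Hypothetical candidate columns and their effect sizes
candidates <- c(3, 17, 41)
effects <- c(1.5, -0.8, 2.0)

## Center the genotypes, then weight the candidate columns by their effects
Y.c <- scale(Y.new, scale = FALSE)
prs <- as.vector(Y.c[, candidates, drop = FALSE] %*% effects)

## One score per individual, centered around zero
length(prs)
```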
Author(s)
Kevin Caye, Basile Jumentier, Olivier Francois
Examples
library(lfmm)
## Simulation of 1000 genotypes for 100 individuals (y)
u <- matrix(rnorm(300, sd = 1), nrow = 100, ncol = 3)
v <- matrix(rnorm(3000, sd = 3), nrow = 3, ncol = 1000)
w <- u %*% v
y <- matrix(rbinom(100000, size = 2,
                   prob = 1/(1 + exp(-0.3 * (w + rnorm(100000, sd = 2))))),
            nrow = 100,
            ncol = 1000)
## PCA of genotypes: 2 main axes of variation (K = 2)
plot(prcomp(y))
## Simulation of phenotypes for 100 individuals (x)
## Only the last 10 genotypes have nonzero effect sizes (b)
b <- matrix(c(rep(0, 990), rep(6000, 10)))
x <- y%*%b + rnorm(100, sd = 100)
## Compute effect sizes using lfmm_ridge
mod <- lfmm_ridge(Y = y,
                  X = x,
                  K = 2)
## Predict (here: fit) phenotypes from the candidate genotypes
x.pred <- predict_lfmm(Y = y,
                       X = x,
                       lfmm.object = mod,
                       fdr.level = 0.25)
x.pred$candidates
## Compare simulated and predicted/fitted phenotypes
plot(x - mean(x), x.pred$prediction,
     pch = 19, col = "grey",
     xlab = "Observed phenotypes (centered)",
     ylab = "Predicted from PRS")
abline(0, 1)
abline(lm(x.pred$prediction ~ scale(x, scale = FALSE)), col = 2)