model_combo {pcsstools}    R Documentation

Model a linear combination of a set of phenotypes using PCSS

Description

model_combo fits a linear model for a linear combination of phenotypes as a function of a set of predictors, using pre-computed summary statistics (PCSS) rather than individual-level data.

Usage

model_combo(formula, phi, n, means, covs, ...)

Arguments

formula

an object of class formula whose dependent variable is a series of variables joined by + operators. model_combo will treat the linear combination of those variables, weighted by phi, as the actual dependent variable. All model terms must be accounted for in means and covs.

phi

named vector of linear weights, one for each variable in the dependent variable of formula.

n

sample size.

means

named vector of predictor and response means.

covs

named covariance matrix of all model predictors and responses.

...

additional arguments
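
As an illustration of why n, means, and covs are sufficient inputs, the following base-R sketch recovers the least-squares coefficients for a weighted combination of responses from summary statistics alone. It uses simulated data with made-up variable names and effect sizes, and it is not necessarily how pcsstools computes the model internally; it only demonstrates the underlying identity.

```r
# Simulated data; variable names and effect sizes are purely illustrative.
set.seed(1)
n <- 200
dat <- data.frame(g1 = rbinom(n, 2, 0.3), x1 = rnorm(n))
dat$y1 <- 0.5 * dat$g1 + rnorm(n)
dat$y2 <- -0.2 * dat$x1 + rnorm(n)

# The only inputs used below: means, covariances, and weights.
means <- colMeans(dat)
covs <- cov(dat)
phi <- c(y1 = 1, y2 = -1)

preds <- c("g1", "x1")
resp <- names(phi)

# Summary statistics of the combined response z = sum(phi * y).
z_mean <- sum(phi * means[resp])
z_var <- drop(t(phi) %*% covs[resp, resp] %*% phi)
xz_cov <- drop(covs[preds, resp] %*% phi)

# Least-squares slopes and intercept from summary statistics alone.
slopes <- drop(solve(covs[preds, preds], xz_cov))
names(slopes) <- preds
intercept <- z_mean - sum(slopes * means[preds])
beta_pcss <- c("(Intercept)" = intercept, slopes)

# Reference fit on the individual-level data.
beta_lm <- coef(lm(y1 - y2 ~ g1 + x1, data = dat))
all.equal(beta_pcss, beta_lm)
```

The comparison against lm() mirrors the check in the Examples section, where the model_combo output is compared with a fit on the individual-level data.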

Value

An object of class "pcsslm": a list containing at least the following components:

call

the matched call

terms

the terms object used

coefficients

a p x 4 matrix with columns for the estimated coefficient, its standard error, t-statistic and corresponding (two-sided) p-value.

sigma

the square root of the estimated variance of the random error.

df

degrees of freedom, a 3-vector (p, n-p, p*), the first being the number of non-aliased coefficients, the last being the total number of coefficients.

fstatistic

a 3-vector with the value of the F-statistic with its numerator and denominator degrees of freedom.

r.squared

R^2, the 'fraction of variance explained by the model'.

adj.r.squared

the above R^2 statistic 'adjusted', penalizing for higher p.

cov.unscaled

a p x p matrix of (unscaled) covariances of the coef[j], j = 1, ..., p.

Sum Sq

a 3-vector with the model's Sum of Squares Regression (SSR), Sum of Squares Error (SSE), and Sum of Squares Total (SST).
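
The component descriptions above closely follow those of summary.lm. Assuming "pcsslm" objects use the same standard definitions (an assumption, not something stated on this page), the quantities relate to one another as in the sketch below, which uses an ordinary lm fit on the built-in mtcars data purely for illustration.

```r
# Illustrative fit on a built-in dataset; not a pcsstools call.
fit <- lm(mpg ~ wt + hp, data = mtcars)
s <- summary(fit)

n <- nrow(mtcars)
p <- length(coef(fit))  # number of coefficients, including the intercept

# Sums of squares: error, total, and regression.
sse <- sum(residuals(fit)^2)
sst <- sum((mtcars$mpg - mean(mtcars$mpg))^2)
ssr <- sst - sse

# Derived quantities for a model with an intercept.
sigma <- sqrt(sse / (n - p))
r2 <- ssr / sst
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p)
f_stat <- (ssr / (p - 1)) / (sse / (n - p))

# Each hand-computed value matches the corresponding summary.lm component.
stopifnot(
  all.equal(sigma, s$sigma),
  all.equal(r2, s$r.squared),
  all.equal(adj_r2, s$adj.r.squared),
  all.equal(f_stat, unname(s$fstatistic["value"]))
)
```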

References

Wolf JM, Barnard M, Xia X, Ryder N, Westra J, Tintle N (2020). “Computationally efficient, exact, covariate-adjusted genetic principal component analysis by leveraging individual marker summary statistics from large biobanks.” Pacific Symposium on Biocomputing, 25, 719–730. ISSN 2335-6928, doi:10.1142/9789811215636_0063, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6907735/.

Gasdaska A, Friend D, Chen R, Westra J, Zawistowski M, Lindsey W, Tintle N (2019). “Leveraging summary statistics to make inferences about complex phenotypes in large biobanks.” Pacific Symposium on Biocomputing, 24, 391–402. ISSN 2335-6928, doi:10.1142/9789813279827_0036, https://pubmed.ncbi.nlm.nih.gov/30963077/.

Examples

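library(pcsstools)

# Individual-level example data, used here only to derive summary statistics
ex_data <- pcsstools_example[c("g1", "x1", "x2", "x3", "y1", "y2", "y3")]
head(ex_data)

# Pre-computed summary statistics (PCSS): means, covariances, and sample size
means <- colMeans(ex_data)
covs <- cov(ex_data)
n <- nrow(ex_data)

# Weights defining the combined response y1 - y2 + 0.5 * y3
phi <- c("y1" = 1, "y2" = -1, "y3" = 0.5)

model_combo(
  y1 + y2 + y3 ~ g1 + x1 + x2 + x3,
  phi = phi, n = n, means = means, covs = covs
)

# Equivalent model fit on the individual-level data, for comparison
summary(lm(y1 - y2 + 0.5 * y3 ~ g1 + x1 + x2 + x3, data = ex_data))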

[Package pcsstools version 0.1.2 Index]