model_prcomp {pcsstools}    R Documentation

Model the principal component score of a set of phenotypes using PCSS

Description

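model_prcomp calculates the linear model for the mth principal component score of a set of phenotypes (with m given by comp) as a function of a set of predictors, using pre-computed summary statistics (PCSS) rather than individual-level data.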

Usage

model_prcomp(
  formula,
  comp = 1,
  n,
  means,
  covs,
  center = FALSE,
  standardize = FALSE,
  ...
)

Arguments

formula

an object of class formula whose dependent variable is a series of variables joined by + operators. model_prcomp will treat a principal component score of those variables as the actual dependent variable. All model terms must be accounted for in means and covs.

comp

integer indicating which principal component score to analyze. Must be less than or equal to the total number of phenotypes.

n

sample size.

means

named vector of predictor and response means.

covs

named matrix of the covariance of all model predictors and the responses.

center

logical. Should the dependent variables be centered before principal components are calculated?

standardize

logical. Should the dependent variables be standardized before principal components are calculated?

...

additional arguments

Value

an object of class "pcsslm".

An object of class "pcsslm" is a list containing at least the following components:

call

the matched call

terms

the terms object used

coefficients

a p x 4 matrix with columns for the estimated coefficient, its standard error, t-statistic and corresponding (two-sided) p-value.

sigma

the square root of the estimated variance of the random error.

df

degrees of freedom, a 3-vector (p, n-p, p*), the first being the number of non-aliased coefficients, the last being the total number of coefficients.

fstatistic

a 3-vector with the value of the F-statistic with its numerator and denominator degrees of freedom.

r.squared

R^2, the 'fraction of variance explained by the model'.

adj.r.squared

the above R^2 statistic 'adjusted', penalizing for higher p.

cov.unscaled

a p x p matrix of (unscaled) covariances of the coef[j], j = 1, ..., p.

Sum Sq

a 3-vector with the model's Sum of Squares Regression (SSR), Sum of Squares Error (SSE), and Sum of Squares Total (SST).
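The relationships among several of these components can be illustrated in base R. The sketch below is an illustration only: it uses an ordinary lm fit on the built-in mtcars data as a stand-in (not a "pcsslm" object) and assumes the usual least-squares definitions for a model with an intercept. All variable names are chosen for this example.

```r
# Illustrative stand-in: an ordinary least-squares fit on built-in data
fit <- lm(mpg ~ wt + hp, data = mtcars)
n <- nrow(mtcars)
p <- length(coef(fit))  # number of coefficients, including the intercept

# Sums of squares: error (SSE), total (SST), regression (SSR)
sse <- sum(residuals(fit)^2)
sst <- sum((mtcars$mpg - mean(mtcars$mpg))^2)
ssr <- sst - sse

# Quantities that can be derived from the sums of squares and df
r2 <- ssr / sst
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p)
sigma <- sqrt(sse / (n - p))
f_stat <- (ssr / (p - 1)) / (sse / (n - p))

# Reference values from summary.lm for comparison
s <- summary(fit)
```

Here r2, adj_r2, sigma, and f_stat can be compared with s$r.squared, s$adj.r.squared, s$sigma, and s$fstatistic, and s$df with c(p, n - p, p).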

References

Wolf JM, Barnard M, Xia X, Ryder N, Westra J, Tintle N (2020). “Computationally efficient, exact, covariate-adjusted genetic principal component analysis by leveraging individual marker summary statistics from large biobanks.” Pacific Symposium on Biocomputing, 25, 719–730. ISSN 2335-6928, doi:10.1142/9789811215636_0063, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6907735/.

Examples

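# Subset the example data to the variables used in the model
ex_data <- pcsstools_example[c("g1", "x1", "x2", "y1", "y2", "y3")]
head(ex_data)

# Pre-computed summary statistics: means, covariance matrix, sample size
means <- colMeans(ex_data)
covs <- cov(ex_data)
n <- nrow(ex_data)

# Model the first principal component score of y1, y2, and y3
# using only the summary statistics
model_prcomp(
  y1 + y2 + y3 ~ g1 + x1 + x2,
  comp = 1, n = n, means = means, covs = covs
)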
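To show why means and covariances are enough for this kind of model, here is a minimal base-R sketch. It is not the package's implementation. It assumes the component weights are the leading eigenvector of the phenotype covariance matrix and that the score is the uncentered linear combination of the phenotypes. The simulated data and all variable names are hypothetical.

```r
# Simulated individual-level data (hypothetical, for illustration only)
set.seed(1)
n <- 200
g1 <- rbinom(n, 2, 0.3)
x1 <- rnorm(n)
y1 <- 0.3 * g1 + 0.5 * x1 + rnorm(n)
y2 <- -0.2 * g1 + rnorm(n)
y3 <- 0.4 * x1 + rnorm(n)
dat <- data.frame(g1, x1, y1, y2, y3)

# Summary statistics only
means <- colMeans(dat)
covs <- cov(dat)
ys <- c("y1", "y2", "y3")
xs <- c("g1", "x1")

# Assumed weights: leading eigenvector of the phenotype covariance matrix
w <- eigen(covs[ys, ys], symmetric = TRUE)$vectors[, 1]

# Summary statistics of the score s = Y %*% w, with no individual-level data
mean_s <- sum(w * means[ys])
cov_xs <- covs[xs, ys] %*% w
var_s <- drop(t(w) %*% covs[ys, ys] %*% w)

# Least-squares estimates from the summary statistics
slopes <- solve(covs[xs, xs], cov_xs)
intercept <- mean_s - sum(means[xs] * slopes)
beta_pcss <- c(intercept, slopes)

# Residual standard error from SSE = SST - SSR (3 estimated coefficients)
ssr_over_df <- drop(t(slopes) %*% covs[xs, xs] %*% slopes)
sigma_pcss <- sqrt((n - 1) * (var_s - ssr_over_df) / (n - 3))

# Check against a fit on the individual-level score with the same weights
score <- drop(as.matrix(dat[ys]) %*% w)
fit <- lm(score ~ g1 + x1, data = dat)
beta_ipd <- unname(coef(fit))
sigma_ipd <- summary(fit)$sigma
```

Under these assumptions, beta_pcss and sigma_pcss should agree with beta_ipd and sigma_ipd up to floating-point error. The sign of an eigenvector is arbitrary, so the coefficients of a principal component model are only determined up to a common sign flip.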

[Package pcsstools version 0.1.2 Index]