snp.pca {ASRgenomics}    R Documentation

Performs a Principal Component Analysis (PCA) based on a molecular matrix M

Description

Generates a PCA and summary statistics from a given molecular matrix for assessing population structure. The matrix must be provided in full form (n x p), with n individuals and p markers. Individual and marker names are taken from the rownames and colnames, respectively. SNP data must be coded as 0, 1, 2 (integers or decimal numbers). Missing values are not accepted and must be imputed beforehand (see function qc.filtering() for implementing mean imputation). The output also includes plots and data frames that can be used in downstream analyses (such as GWAS).
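
The input requirements above can be checked before calling snp.pca(). The following is a minimal base R sketch, not the package's own code: the toy matrix M_toy and the helper impute_marker_mean() are hypothetical names introduced for illustration, and qc.filtering() remains the documented route for mean imputation.

```r
# Toy genotype matrix: 4 individuals (rows) x 3 markers (columns), coded 0/1/2.
# Names and values are made up for illustration only.
M_toy <- matrix(c(0, 1,  2,
                  1, NA, 2,
                  2, 1,  0,
                  1, 1,  NA),
                nrow = 4, byrow = TRUE,
                dimnames = list(paste0("ind", 1:4), paste0("snp", 1:3)))

# Hypothetical helper: replace each missing call with the mean of the
# observed calls for the same marker (column).
impute_marker_mean <- function(M) {
  marker_means <- colMeans(M, na.rm = TRUE)
  missing <- which(is.na(M), arr.ind = TRUE)   # two-column (row, col) index
  M[missing] <- marker_means[missing[, "col"]]
  M
}

M_imp <- impute_marker_mean(M_toy)

# Basic checks on what the description asks for: a named matrix,
# no missing values, and values within the 0 to 2 coding range.
stopifnot(is.matrix(M_imp),
          !anyNA(M_imp),
          min(M_imp) >= 0, max(M_imp) <= 2,
          !is.null(rownames(M_imp)), !is.null(colnames(M_imp)))
```

Imputed values are generally decimal numbers, which is consistent with the statement that the 0, 1, 2 coding may be integer or decimal.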

Usage

snp.pca(M = NULL, label = FALSE, ncp = 10, groups = NULL, ellipses = FALSE)

Arguments

M

A matrix with SNP data of full form (n x p), with n individuals and p markers (default = NULL).

label

If TRUE, individual names are included in the output (default = FALSE).

ncp

The number of PC dimensions to show in the scree plot and to include in the output data frame (default = 10).

groups

A vector of class factor used to assign different colors to individuals in the PCA plot. It must be in the same order as the individuals in the molecular matrix M (default = NULL).

ellipses

If TRUE, ellipses will be drawn around each of the defined levels in groups (default = FALSE).
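
Because groups must follow the row order of M, it can help to align the grouping factor explicitly rather than rely on the phenotype file's ordering. The sketch below uses base R only; M_toy, pheno_toy and its columns ID and Family are hypothetical names chosen for illustration.

```r
# Hypothetical inputs: a marker matrix and a phenotype table whose rows
# are listed in a different order from the matrix.
M_toy <- matrix(c(0, 1, 2, 1, 2, 0), nrow = 3,
                dimnames = list(c("ind1", "ind2", "ind3"), c("snp1", "snp2")))
pheno_toy <- data.frame(ID = c("ind3", "ind1", "ind2"),
                        Family = c("F2", "F1", "F1"))

# Position of each individual of M_toy within the phenotype table.
idx <- match(rownames(M_toy), pheno_toy$ID)
stopifnot(!anyNA(idx))   # every individual needs a group

# Grouping factor in the same order as rownames(M_toy).
grp <- factor(pheno_toy$Family[idx])
```

The resulting grp could then be passed as, for example, snp.pca(M = M_toy, groups = grp, ellipses = TRUE), although a matrix this small is only meant to show the alignment step.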

Details

The function calls prcomp() to generate the PCA and uses the factoextra R package to extract and visualize the results. The methodology uses normalized allele frequencies, as proposed by Patterson et al. (2006).
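
To make the normalization concrete, the following base R sketch applies the Patterson et al. (2006) scaling to simulated genotypes and passes the result to prcomp(). It is an illustration under stated assumptions rather than the internal code of snp.pca(), which may differ in detail; the simulated data and all object names (M_sim, p_hat, Z) are hypothetical.

```r
set.seed(1)
n <- 20   # individuals
p <- 50   # markers

# Simulate 0/1/2 genotypes: one column per marker, drawn as Binomial(2, f).
freqs <- runif(p, 0.1, 0.9)
M_sim <- sapply(freqs, function(f) rbinom(n, size = 2, prob = f))
dimnames(M_sim) <- list(paste0("ind", 1:n), paste0("snp", 1:p))

# Estimated allele frequency per marker; drop monomorphic markers,
# which would give a zero denominator below.
p_hat <- colMeans(M_sim) / 2
keep  <- p_hat > 0 & p_hat < 1
M_sim <- M_sim[, keep, drop = FALSE]
p_hat <- p_hat[keep]

# Patterson normalization: (M_ij - 2 * p_j) / sqrt(p_j * (1 - p_j)).
Z <- sweep(M_sim, 2, 2 * p_hat, "-")
Z <- sweep(Z, 2, sqrt(p_hat * (1 - p_hat)), "/")

# Columns of Z are already centered, so prcomp() is run without
# further centering or scaling.
pca     <- prcomp(Z, center = FALSE, scale. = FALSE)
eig     <- pca$sdev^2               # eigenvalues
pct_var <- 100 * eig / sum(eig)     # percentage of variance per component
scores  <- pca$x[, 1:2]             # scores on the first two components
```

Here eig and scores play roles analogous to the eigenvalues and PCA scores inspected in the Examples section, computed without the plotting layer.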

Value

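A list with the following four elements:

eigenvalues

A data frame with the eigenvalues and the variance they explain.

pca.scores

A data frame with the scores of the individuals on the principal components.

plot.pca

A scatterplot of the individuals on the first two principal components.

plot.scree

A scree plot of the percentage of variance explained by the first ncp components.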

References

Patterson, N., Price, A.L., and Reich, D. 2006. Population structure and eigenanalysis. PLoS Genet 2(12):e190. doi:10.1371/journal.pgen.0020190

Examples

# Perform the PCA.
SNP_pca <- snp.pca(M = geno.apple, ncp = 10)
ls(SNP_pca)
SNP_pca$eigenvalues
head(SNP_pca$pca.scores)
SNP_pca$plot.pca
SNP_pca$plot.scree

# PCA plot by family (17 groups).
grp <- as.factor(pheno.apple$Family)
SNP_pca_grp <- snp.pca(M = geno.apple, groups = grp, label = FALSE)
SNP_pca_grp$plot.pca


[Package ASRgenomics version 1.1.4 Index]