msc.pca {rKOMICS}    R Documentation

Principal Component Analysis based on MSC

Description

The msc.pca function performs Principal Component Analysis (PCA) to summarize the variation of Minicircle Sequence Classes (MSCs) in all samples or in a subset of samples.

Usage

msc.pca(clustmatrix, samples, groups, n = 20, labels = TRUE, title = NULL)

Arguments

clustmatrix

a cluster matrix obtained from the msc.matrix function. The cluster matrix represents the presence or absence of MSCs in each sample, where rows represent MSCs and columns represent samples.

samples

a vector containing the names of the samples. This can include all samples or a subset of samples that you want to analyze.

groups

a vector specifying the groups (e.g., species) to which the samples belong.

n

the number of clusters (MSCs) with the highest contribution to the PCA to select. Defaults to 20.

labels

a logical indicating whether sample labels are displayed on the PCA plot. Defaults to TRUE.

title

the title of the PCA plot. Defaults to NULL.
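The shapes expected by these arguments can be illustrated with a small toy example. This is only a sketch: the object names (toy_matrix, toy_samples, toy_groups) and values are hypothetical and are not part of rKOMICS or its example data.

```r
# Hypothetical presence/absence matrix: rows are MSCs, columns are samples
toy_matrix <- matrix(
  c(1, 1, 0, 0,
    1, 0, 0, 1,
    0, 0, 1, 1,
    0, 1, 1, 0,
    1, 1, 1, 1,
    1, 0, 1, 0),
  nrow = 6, byrow = TRUE,
  dimnames = list(paste0("MSC", 1:6), paste0("sample", 1:4))
)

# One sample name and one group label per sample, in the same order
toy_samples <- colnames(toy_matrix)
toy_groups  <- c("speciesA", "speciesA", "speciesB", "hybrid")

# Basic consistency checks before running a PCA
stopifnot(all(toy_matrix %in% c(0, 1)))
stopifnot(all(toy_samples %in% colnames(toy_matrix)))
stopifnot(length(toy_groups) == length(toy_samples))

# Subsetting with a single index keeps samples and groups aligned
keep        <- toy_groups != "hybrid"
sub_samples <- toy_samples[keep]
sub_groups  <- toy_groups[keep]
sub_matrix  <- toy_matrix[, sub_samples, drop = FALSE]
```

Using one logical index for both the sample names and the group labels avoids mismatches between the two vectors when only a subset of samples is analyzed.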

Value

plot

a PCA plot that visualizes the clustering of samples based on the presence/absence of MSCs. The plot helps identify clusters and patterns of similarity or dissimilarity between samples.

eigenvalues

a barplot showing the percentage of variance explained by each principal component, that is, how much each component contributes to the overall variation in the data.

clustnames

a list of cluster names with the highest contribution to PCA. This list helps identify the MSC clusters that have the most influence on the PCA results.
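To make the meaning of these values concrete, the following sketch computes comparable quantities with base R's prcomp on a hypothetical toy matrix. It is not the internal implementation of msc.pca. The contribution measure used here (squared loadings weighted by the variance of the first two components) is one common definition and is an assumption, as are all object names.

```r
# Hypothetical toy data: 6 MSCs (rows) x 5 samples (columns), presence/absence
toy_matrix <- matrix(
  c(1, 1, 0, 0, 1,
    1, 1, 0, 0, 0,
    0, 0, 1, 1, 1,
    0, 1, 1, 1, 0,
    1, 1, 1, 1, 1,
    1, 0, 0, 1, 0),
  nrow = 6, byrow = TRUE,
  dimnames = list(paste0("MSC", 1:6), paste0("sample", 1:5))
)

# prcomp expects observations in rows, so samples become rows after t()
pca <- prcomp(t(toy_matrix), center = TRUE, scale. = FALSE)

# Percentage of variance explained by each principal component
eig       <- pca$sdev^2
explained <- 100 * eig / sum(eig)

# Contribution (in %) of each MSC to the first two components:
# squared loadings weighted by the variance of each component
load_sq <- pca$rotation[, 1:2]^2
contrib <- 100 * as.vector(load_sq %*% eig[1:2]) / sum(eig[1:2])
names(contrib) <- rownames(pca$rotation)

# Names of the n MSCs with the highest contribution
n <- 3
top_clusters <- names(sort(contrib, decreasing = TRUE))[seq_len(n)]
```

The names in top_clusters can then be used to index the rows of the cluster matrix, in the same way that clustnames is used to select rows for msc.heatmap in the Examples.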

Examples

data(matrices)
data(exData)

### run function with all samples
res.pca <- lapply(matrices, function(x) msc.pca(x, samples = exData$samples, 
                  groups = exData$species, n=30, labels=FALSE, title=NULL))

res.pca$id95$eigenvalues
res.pca$id95$plot

### use clusters with highest contribution to visualize in a heatmap
msc.heatmap(matrices[["id95"]][res.pca$id95$clustnames,], samples = exData$samples,
            groups = exData$species)

### run function with a subset of samples
### you will be asked to confirm
table(exData$species)
hybrid <- which(exData$species=="hybrid")
# pca.subset <- msc.pca(clustmatrix = matrices[["id97"]], 
#                       samples = exData$samples[hybrid], 
#                       groups = exData$species[hybrid], labels = TRUE, 
#                       title = "PCA only with hybrids")


[Package rKOMICS version 1.3 Index]