extractFAScales {protr}    R Documentation

Scales-Based Descriptors Derived by Factor Analysis

Description

This function calculates scales-based descriptors derived by Factor Analysis (FA). Users can provide customized amino acid property matrices.

Usage

extractFAScales(
  x,
  propmat,
  factors,
  scores = "regression",
  lag,
  scale = TRUE,
  silent = TRUE
)

Arguments

x

A character vector containing the input protein sequence.

propmat

A matrix containing the properties for the amino acids. Each row represents one amino acid type, and each column represents one property. The rows must be named with one-letter amino acid codes, because these names are used to look up the properties for each AA type.

factors

Integer. The number of factors to be fitted. Must be no greater than the number of AA properties provided.

scores

Type of scores to produce. The default is "regression", which gives Thomson's scores; "Bartlett" gives Bartlett's weighted least-squares scores.

lag

The lag parameter. Must be less than the number of amino acids in the protein sequence.

scale

Logical. Whether to auto-scale the property matrix (propmat) before performing factor analysis. Default is TRUE.

silent

Logical. If FALSE, the SS loadings, proportion of variance, and cumulative proportion of the selected factors are printed. Default is TRUE (nothing is printed).
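
The constraints on propmat, factors and lag described above can be checked before calling the function. The following is a minimal sketch in base R, not part of protr: the helper check_fa_inputs(), the toy sequence, and the random property values are all hypothetical and serve only to show the expected shape of a custom property matrix. It also assumes that x is a single character string.

```r
# The 20 standard amino acids as one-letter codes, used as row names.
aa <- c("A", "R", "N", "D", "C", "E", "Q", "G", "H", "I",
        "L", "K", "M", "F", "P", "S", "T", "W", "Y", "V")

# Placeholder property values: random numbers, NOT real physicochemical data.
set.seed(42)
propmat <- matrix(
  rnorm(length(aa) * 6),
  nrow = length(aa),
  dimnames = list(aa, paste0("prop", 1:6))
)

# Hypothetical helper (not a protr function) that checks the documented
# constraints and stops with an error if any of them is violated.
check_fa_inputs <- function(x, propmat, factors, lag) {
  stopifnot(
    is.character(x), length(x) == 1,       # assumed: one sequence string
    is.matrix(propmat),
    !is.null(rownames(propmat)),
    all(nchar(rownames(propmat)) == 1),    # one-letter row names
    factors <= ncol(propmat),              # no more factors than properties
    lag < nchar(x)                         # lag shorter than the sequence
  )
  invisible(TRUE)
}

x <- "MKTAYIAKQRQISFVKSHFSRQ"  # hypothetical toy sequence, 22 residues
check_fa_inputs(x, propmat, factors = 2, lag = 7)
```

A matrix that passes these checks has the layout the propmat argument expects. Whether the factor analysis itself converges on a given matrix is a separate matter that this sketch does not address.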

Value

A named vector of length lag * p^2, where p is the number of scales (factors) selected.
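
As a quick illustration of the length formula, the sketch below computes the expected size of the result for given settings. The helper descriptor_length() is hypothetical and not part of protr; it only restates lag * p^2 in code.

```r
# Hypothetical helper (not a protr function): expected length of the
# returned vector, lag * p^2, where p is the number of factors.
descriptor_length <- function(factors, lag) {
  lag * factors^2
}

# With factors = 5 and lag = 7 this is 7 * 5^2 = 175.
n_expected <- descriptor_length(factors = 5, lag = 7)
print(n_expected)
```

Comparing this value with length() of the actual return value can serve as a simple sanity check.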

Author(s)

Nan Xiao <https://nanx.me>

References

Atchley, W. R., Zhao, J., Fernandes, A. D., & Drüke, T. (2005). Solving the protein sequence metric problem. Proceedings of the National Academy of Sciences of the United States of America, 102(18), 6395-6400.

Examples

x <- readFASTA(system.file("protseq/P00750.fasta", package = "protr"))[[1]]
data(AATopo)
tprops <- AATopo[, c(37:41, 43:47)] # select a set of topological descriptors
fa <- extractFAScales(x, propmat = tprops, factors = 5, lag = 7, silent = FALSE)

[Package protr version 1.7-2 Index]