PCADSC {PCADSC} | R Documentation |
Compute the elements used for PCADSC
Description
Principal Component Analysis-based Data Structure Comparison tools that
prepare a dataset for various diagnostic plots for comparing data structures. More
specifically, PCADSC performs PCA on two subsets of a dataset in order to
compare the structures of these datasets, e.g. to assess whether they can be analyzed pooled
or not. The results of the PCAs are then manipulated in various
ways and stored for easy plotting using the three PCADSC plotting tools: the CEPlot,
the anglePlot and the chromaPlot.
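As a rough illustration of what comparing the PCA structure of two subsets can look like, the following base R sketch runs a standardized PCA on each of two groups of the iris data and compares their cumulative shares of explained variance. The helper name cum_share is hypothetical, and this is not the computation PCADSC itself performs; it only shows the general idea under those assumptions.

```r
# Sketch only: base R illustration, not the PCADSC package's own code.
data(iris)
vars <- names(iris)[1:4]
is_setosa <- iris$Species == "setosa"

# Hypothetical helper: standardized PCA on one subset, returning the
# cumulative share of variance explained by the components
cum_share <- function(d) {
  pca <- prcomp(d, center = TRUE, scale. = TRUE)
  cumsum(pca$sdev^2) / sum(pca$sdev^2)
}

ce_setosa <- cum_share(iris[is_setosa, vars])
ce_other  <- cum_share(iris[!is_setosa, vars])

# Side-by-side comparison of the two cumulative variance profiles
round(rbind(setosa = ce_setosa, non_setosa = ce_other), 3)
```

If the two rows differ markedly, the subsets distribute their variance differently across components, which is the kind of discrepancy the PCADSC plots are meant to reveal.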
Usage
PCADSC(data, splitBy, vars = NULL, doCE = TRUE, doAngle = TRUE,
doChroma = TRUE, B = 10000)
Arguments
data |
A dataset, either a |
splitBy |
The name of a grouping variable with two levels defining the two groups within the dataset whose data structures we wish to compare. |
vars |
The variable names in |
doCE |
Logical. Should the cumulative eigenvalue plot information be computed? |
doAngle |
Logical. Should the angle plot information be computed? |
doChroma |
Logical. Should the chroma plot information be computed? |
B |
A positive integer. The number of resampling steps performed in the cumulative eigenvalue step, if relevant. |
Details
PCADSC presents a suite of non-parametric, visual tools for comparing the structures of
two subsets of a dataset. These tools are all based on PCA (principal component analysis) and
thus they can be interpreted as comparisons of the covariance matrices of the two (sub)datasets.
PCADSC performs PCA using singular value decomposition for increased numerical precision.
Before performing PCA on the full dataset and the two subsets, all variables within each such
dataset are standardized.
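The standardization and SVD steps described above can be sketched in base R as follows. This is a minimal illustration under the assumption that standardization means centering each variable and scaling it to unit standard deviation; it is not taken from the package source. It also checks the link to covariance structure: the eigenvalues obtained from the SVD of the standardized data match those of the correlation matrix.

```r
# Sketch only: base R illustration, not the PCADSC package's own code.
data(iris)
X <- as.matrix(iris[, 1:4])

# Standardize each variable (mean 0, standard deviation 1)
Z <- scale(X, center = TRUE, scale = TRUE)

# PCA via singular value decomposition of the standardized data
sv <- svd(Z)
eigenvalues <- sv$d^2 / (nrow(Z) - 1)  # variances of the principal components
loadings <- sv$v                       # columns are the principal axes

# For comparison: eigendecomposition of the correlation matrix
eig <- eigen(cor(X), symmetric = TRUE)
all.equal(eigenvalues, eig$values)
```

Working from the SVD of the data matrix avoids explicitly forming the cross-product matrix, which is the usual reason it is preferred numerically.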
Value
An object of class PCADSC, which is a named list with the following entries:
- pcaRes
The results of the PCAs performed on the first subset, the second subset and the full dataset, and also information about the data splitting.
- CEInfo
The information needed for making a cumulative eigenvalue plot (see CEPlot).
- angleInfo
The information needed for making an angle plot (see anglePlot).
- chromaInfo
The information needed for making a chroma plot (see chromaPlot).
- data
The original (full) dataset.
- splitBy
The name of the variable that splits the dataset in two.
- vars
The names of the variables in the dataset that should be used for PCA.
- B
The number of resamplings performed for the CEInfo.
See Also
doCE, doAngle, doChroma, CEPlot, anglePlot, chromaPlot
Examples
#load iris data
data(iris)
#Define grouping variable, grouping the observations by whether their species is
#Setosa or not
iris$group <- "setosa"
iris$group[iris$Species != "setosa"] <- "non-setosa"
iris$Species <- NULL
## Not run:
#Make a full PCADSC object, splitting the data by "group"
irisPCADSC <- PCADSC(iris, "group")
#The three plotting functions can now be called on irisPCADSC:
CEPlot(irisPCADSC)
anglePlot(irisPCADSC)
chromaPlot(irisPCADSC)
#Make a partial PCADSC object with no angle plot information and add
#angle plot information afterwards:
irisPCADSC2 <- PCADSC(iris, "group", doAngle = FALSE)
irisPCADSC2 <- doAngle(irisPCADSC2)
## End(Not run)
#Make a partial PCADSC object with no plotting (angle/CE/chroma)
#information:
irisPCADSC_minimal <- PCADSC(iris, "group", doAngle = FALSE,
doCE = FALSE, doChroma = FALSE)