factor_scores {EMMIXmfa} | R Documentation |
Computes Factor Scores
Description
This function computes factor scores for observations.
Using factor scores, we can represent the original data point y_j
in a q-dimensional reduced space. This is only meaningful
for mcfa or mctfa models,
as the factor scores for mfa and mtfa are white noise.
The (estimated conditional expectation of the) unobservable factors U_{ij},
given y_j and the component membership, can be expressed as
\hat{u}_{ij} = E_{\hat{\Psi}}\{U_{ij} \mid y_j, z_{ij} = 1\}.
The estimated mean of U_{ij} (over the component membership of y_j) is given as
\hat{u}_{j} = \sum_{i=1}^g \tau_i(y_j; \hat{\Psi}) \hat{u}_{ij},
where \tau_i(y_j; \hat{\Psi}) is the estimated posterior probability of y_j
belonging to the ith component.
An alternative estimate of u_j, the posterior expectation
of the factor corresponding to the jth observation y_j, is
defined by replacing \tau_i(y_j; \hat{\Psi}) by \hat{z}_{ij},
where \hat{z}_{ij} = 1 if
\tau_i(y_j; \hat{\Psi}) \geq \tau_h(y_j; \hat{\Psi})
(h = 1, \dots, g; h \neq i), and \hat{z}_{ij} = 0 otherwise:
\hat{u}_{j}^C = \sum_{i=1}^g \hat{z}_{ij} \hat{u}_{ij}.
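The two summaries above can be sketched in base R on made-up numbers. This is only an illustration of the formulas, not the package's implementation: the object names (u_hat, tau, z_hat, u_mean, u_clust) are hypothetical, and breaking ties in tau in favour of the first component is an assumption.

```r
# Toy inputs (hypothetical): n = 3 observations, q = 2 factors, g = 2 components.
# u_hat[j, , i] holds the component-specific score u_ij for observation j.
u_hat <- array(0, dim = c(3, 2, 2))
u_hat[, , 1] <- matrix(c(1, 2, 3, 0, 0, 0), nrow = 3)
u_hat[, , 2] <- matrix(c(-1, -2, -3, 1, 1, 1), nrow = 3)

# tau[j, i]: posterior probability that observation j belongs to component i.
tau <- matrix(c(0.9, 0.1,
                0.5, 0.5,
                0.2, 0.8), nrow = 3, byrow = TRUE)
g <- ncol(tau)

# Posterior-weighted mean: u_j = sum_i tau_ij * u_ij.
# tau[, i] is recycled down the columns, so row j is scaled by tau[j, i].
u_mean <- Reduce(`+`, lapply(seq_len(g), function(i) tau[, i] * u_hat[, , i]))

# Hard assignment: z_ij = 1 for the component with the largest tau_ij
# (assumption: ties go to the first such component).
clust <- max.col(tau, ties.method = "first")
z_hat <- diag(g)[clust, , drop = FALSE]

# Cluster-based score: u_j^C = sum_i z_ij * u_ij.
u_clust <- Reduce(`+`, lapply(seq_len(g), function(i) z_hat[, i] * u_hat[, , i]))
```

Each row of u_clust equals the score under that observation's most probable component, whereas u_mean blends the component scores whenever the posterior probabilities are not close to 0 or 1.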
For MFA, we have
\hat{u}_{ij} = \hat{\beta}_i^T (y_j - \hat{\mu}_i),
and
\hat{u}_{j} = \sum_{i=1}^g \tau_i(y_j; \hat{\Psi}) \hat{\beta}_i^T (y_j - \hat{\mu}_i)
for j = 1, \dots, n, where
\hat{\beta}_i = (B_iB_i^T + D_i)^{-1} B_i.
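As a rough check of the MFA expressions, the sketch below evaluates \beta_i and the component-specific scores for a single component in base R. All parameter values are invented for illustration, and the names B_i, D_i, mu_i, beta_i and U_i are hypothetical stand-ins rather than fields of a fitted model object.

```r
# Hypothetical parameters for one component: p = 3 variables, q = 1 factor.
B_i  <- matrix(c(1, 1, 0), nrow = 3)   # p x q factor loadings
D_i  <- diag(c(1, 1, 2))               # p x p diagonal error covariance
mu_i <- c(0, 0, 0)                     # component mean

# Two observations, one per row.
Y <- rbind(c(3, 0, 5),
           c(0, 0, 0))

# beta_i = (B_i B_i^T + D_i)^{-1} B_i, solved as a linear system
# instead of forming the inverse explicitly.
beta_i <- solve(B_i %*% t(B_i) + D_i, B_i)

# u_ij = beta_i^T (y_j - mu_i), computed for all rows of Y at once.
U_i <- sweep(Y, 2, mu_i) %*% beta_i
```

Using solve(S, B) rather than solve(S) %*% B avoids an explicit matrix inverse, which is generally the more stable choice when p is large.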
For MCFA,
\hat{u}_{ij} = \hat{\xi}_i + \hat{\gamma}_i^T (y_j - \hat{A}\hat{\xi}_i),
\hat{u}_{j} = \sum_{i=1}^g \tau_i(y_j; \hat{\Psi}) \{\hat{\xi}_i + \hat{\gamma}_i^T (y_j - \hat{A}\hat{\xi}_i)\},
where \gamma_i = (A \Omega_i A^T + D)^{-1} A \Omega_i.
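The MCFA score for one component can be sketched the same way in base R, taking the p x p component covariance to be A \Omega_i A^T + D so that the dimensions conform. The numbers and the names A, Omega_i, D, xi_i, gamma_i and U_i are hypothetical and chosen only to keep the arithmetic easy to follow; this is not the package's own code.

```r
# Hypothetical parameters: p = 2 variables, q = 1 factor.
A       <- matrix(c(1, 0), nrow = 2)   # p x q common loading matrix
Omega_i <- matrix(1, nrow = 1)         # q x q factor covariance, component i
D       <- diag(2)                     # p x p diagonal error covariance
xi_i    <- 2                           # q-dimensional factor mean, component i

# Two observations, one per row.
Y <- rbind(c(4, 7),
           c(2, 0))

# gamma_i = (A Omega_i A^T + D)^{-1} A Omega_i
gamma_i <- solve(A %*% Omega_i %*% t(A) + D, A %*% Omega_i)

# u_ij = xi_i + gamma_i^T (y_j - A xi_i), for every row of Y.
centred <- sweep(Y, 2, as.vector(A %*% xi_i))
U_i <- sweep(centred %*% gamma_i, 2, xi_i, FUN = "+")
```

In this toy setup the second variable carries no loading, so it does not affect the score; only the deviation of the first variable from A xi_i moves the score away from xi_i.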
With MtFA and MCtFA, the distributions of \hat{u}_{ij} and \hat{u}_{j}
have the same form as those of MFA and MCFA, respectively.
Usage
factor_scores(model, Y, ...)
## S3 method for class 'mcfa'
factor_scores(model, Y, tau = NULL, clust = NULL, ...)
## S3 method for class 'mctfa'
factor_scores(model, Y, tau = NULL, clust = NULL, ...)
## S3 method for class 'emmix'
plot(x, ...)
Arguments
model |
An object of class |
x |
An object of class |
Y |
Data matrix with variables in columns in the same order as used in model estimation. |
tau |
Optional. Posterior probabilities of belonging to the components
in the mixture model. If not provided, they will be computed based on
the |
clust |
Optional. Indicators of belonging to the components.
If not provided, will be estimated using |
... |
Not used. |
Details
Factor scores can be used in visualization of the data in the factor space.
Value
Uscores |
Estimated conditional expected component scores of the
unobservable factors given the data and the component membership
( |
Umean |
Means of the estimated conditional expected factor scores over
estimated posterior distributions ( |
Uclust |
Alternative estimate of |
Author(s)
Geoff McLachlan, Suren Rathnayake, Jungsun Baek
References
McLachlan GJ, Baek J, and Rathnayake SI (2011). Mixtures of factor analyzers for the analysis of high-dimensional data. In Mixture Estimation and Applications, KL Mengersen, CP Robert, and DM Titterington (Eds). Hoboken, New Jersey: Wiley, pp. 171–191.
McLachlan GJ, and Peel D (2000). Finite Mixture Models. New York: Wiley.
Examples
library(EMMIXmfa)
# Fit an MCFA model to a random subset of 50 iris observations
set.seed(1)
samp_size <- dim(iris)[1]
sel_subset <- sample(1 : samp_size, 50)
model <- mcfa(iris[sel_subset, -5], g = 3, q = 2,
              nkmeans = 1, nrandom = 0, itmax = 100)
# Plot the fitted data points in the factor space
plot(model)
# Allocating new samples to the clusters
Y <- iris[-c(sel_subset), -5]
Y <- as.matrix(Y)
clust <- predict(model, Y)
fa_scores <- factor_scores(model, Y)
# Visualizing new data in factor space
plot_factors(fa_scores, type = "Umean", clust = clust)