factor_scores {EMMIXmfa}    R Documentation

Computes Factor Scores


This function computes factor scores for observations. Using factor scores, we can represent the original data point $y_j$ in a $q$-dimensional reduced space. This is only meaningful in the case of mcfa or mctfa models, as the factor scores for mfa and mtfa are white noise.

The (estimated conditional expectation of the) unobservable factors $U_{ij}$, given $y_j$ and the component membership, can be expressed as

$$\hat{u}_{ij} = E_{\hat{\Psi}}\{U_{ij} \mid y_j, z_{ij} = 1\}.$$

The estimated mean of $U_{ij}$ (over the component membership of $y_j$) is given as

$$\hat{u}_{j} = \sum_{i=1}^g \tau_i(y_j; \hat{\Psi}) \hat{u}_{ij},$$

where $\tau_i(y_j; \hat{\Psi})$ is the estimated posterior probability of $y_j$ belonging to the $i$th component.

An alternative estimate of $u_j$, the posterior expectation of the factor corresponding to the $j$th observation $y_j$, is defined by replacing $\tau_i(y_j; \hat{\Psi})$ by $\hat{z}_{ij}$, where $\hat{z}_{ij} = 1$ if $\tau_i(y_j; \hat{\Psi}) \geq \tau_h(y_j; \hat{\Psi})$ for all $h = 1, \dots, g$ with $h \neq i$, and $\hat{z}_{ij} = 0$ otherwise:

$$\hat{u}_{j}^C = \sum_{i=1}^g \hat{z}_{ij} \hat{u}_{ij}.$$
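The two summaries above can be sketched in base R. This is only an illustration on made-up inputs, not the package's internal code: the array `u_hat`, the matrix `tau`, and the names `u_mean` and `u_clust` are hypothetical stand-ins for $\hat{u}_{ij}$, $\tau_i(y_j; \hat{\Psi})$, $\hat{u}_j$ and $\hat{u}_j^C$.

```r
# Toy inputs (made up for illustration): n = 4 observations,
# q = 2 factors, g = 3 components.
set.seed(1)
n <- 4; q <- 2; g <- 3

# u_hat[j, , i] plays the role of \hat{u}_{ij}.
u_hat <- array(rnorm(n * q * g), dim = c(n, q, g))

# tau[j, i] plays the role of \tau_i(y_j; \hat{\Psi}); each row sums to 1.
tau <- matrix(runif(n * g), nrow = n)
tau <- tau / rowSums(tau)

# Posterior-weighted mean: \hat{u}_j = sum_i tau_i * \hat{u}_{ij}.
# Multiplying an n x q matrix by a length-n vector scales row j by tau[j, i].
u_mean <- matrix(0, n, q)
for (i in seq_len(g)) {
  u_mean <- u_mean + tau[, i] * u_hat[, , i]
}

# Hard assignment: \hat{z}_{ij} = 1 for the component with the largest tau.
z_hat <- matrix(0, n, g)
z_hat[cbind(seq_len(n), max.col(tau, ties.method = "first"))] <- 1

# Hard-assignment scores: \hat{u}_j^C = sum_i z_ij * \hat{u}_{ij}.
u_clust <- matrix(0, n, q)
for (i in seq_len(g)) {
  u_clust <- u_clust + z_hat[, i] * u_hat[, , i]
}
```

Each row of `u_clust` is simply the score from the single most probable component, whereas each row of `u_mean` blends the scores of all components in proportion to their posterior probabilities.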

For MFA, we have

$$\hat{u}_{ij} = \hat{\beta}_i^T (y_j - \hat{\mu}_i),$$

$$\hat{u}_{j} = \sum_{i=1}^g \tau_i(y_j; \hat{\Psi}) \hat{\beta}_i^T (y_j - \hat{\mu}_i)$$

for $j = 1, \dots, n$, where $\hat{\beta}_i = (B_i B_i^T + D_i)^{-1} B_i$.
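As a rough illustration of the MFA expression, the following base R sketch evaluates $\hat{\beta}_i$ and $\hat{u}_{ij}$ for one component and one observation. All inputs are randomly generated placeholders (the names `B`, `D`, `mu`, `y` are chosen here to mirror the symbols and are not objects returned by the package).

```r
# Made-up parameters for a single component: p = 5 variables, q = 2 factors.
set.seed(2)
p <- 5; q <- 2

B  <- matrix(rnorm(p * q), p, q)   # p x q loadings, standing in for B_i
D  <- diag(runif(p, 0.5, 1.5))     # p x p diagonal error covariance, D_i
mu <- rnorm(p)                     # component mean, mu_i
y  <- rnorm(p)                     # one observation, y_j

# beta_i = (B_i B_i^T + D_i)^{-1} B_i, a p x q matrix.
# solve(M, B) avoids forming the explicit inverse of M.
beta <- solve(B %*% t(B) + D, B)

# u_ij = beta_i^T (y_j - mu_i), a vector of length q.
u_ij <- drop(crossprod(beta, y - mu))
```

The result has one entry per factor, so a $p$-dimensional observation is mapped to a $q$-dimensional score.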


For MCFA, we have

$$\hat{u}_{ij} = \hat{\xi}_i + \hat{\gamma}_i^T (y_j - \hat{A}\hat{\xi}_i),$$

$$\hat{u}_{j} = \sum_{i=1}^g \tau_i(y_j; \hat{\Psi}) \{\hat{\xi}_i + \hat{\gamma}_i^T (y_j - \hat{A}\hat{\xi}_i)\},$$

where $\gamma_i = (A \Omega_i A^T + D)^{-1} A \Omega_i$.
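The MCFA expression can be sketched the same way in base R. This is an illustration under assumptions rather than the package's implementation: the inputs are random placeholders, the names `A`, `Omega`, `D`, `xi`, `y` are chosen here to mirror the symbols, and the $p \times p$ term is read as $A \Omega_i A^T + D$ so that the dimensions agree.

```r
# Made-up parameters: p = 5 variables, q = 2 factors, one component.
set.seed(3)
p <- 5; q <- 2

A     <- matrix(rnorm(p * q), p, q)                       # p x q common loadings
Omega <- crossprod(matrix(rnorm(q * q), q, q)) + diag(q)  # q x q, positive definite
D     <- diag(runif(p, 0.5, 1.5))                         # p x p diagonal error covariance
xi    <- rnorm(q)                                         # factor mean xi_i
y     <- rnorm(p)                                         # one observation y_j

# gamma_i = (A Omega_i A^T + D)^{-1} A Omega_i, a p x q matrix.
gamma <- solve(A %*% Omega %*% t(A) + D, A %*% Omega)

# u_ij = xi_i + gamma_i^T (y_j - A xi_i), a vector of length q.
u_ij <- xi + drop(crossprod(gamma, y - A %*% xi))
```

Unlike the MFA case, the score is shifted by the component-specific factor mean `xi`, which is why the components can occupy different regions of the factor space.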

With MtFA and MCtFA, the distributions of $\hat{u}_{ij}$ and of $\hat{u}_{j}$ have the same form as those of MFA and MCFA, respectively.


factor_scores(model, Y, ...)
## S3 method for class 'mcfa'
factor_scores(model, Y, tau = NULL, clust= NULL, ...)
## S3 method for class 'mctfa'
factor_scores(model, Y, tau = NULL, clust= NULL, ...)
## S3 method for class 'emmix'
plot(x, ...)



model: An object of class mfa, mcfa, mtfa or mctfa.

x: An object of class mfa, mcfa, mtfa or mctfa.

Y: Data matrix with variables in columns, in the same order as used in model estimation.

tau: Optional. Posterior probabilities of belonging to the components in the mixture model. If not provided, they will be computed based on the model parameters.

clust: Optional. Indicators of belonging to the components. If not provided, they will be estimated using tau.

...: Not used.


Factor scores can be used in visualization of the data in the factor space.



Estimated conditional expected component scores of the unobservable factors given the data and the component membership ($\hat{u}_{ij}$). Size is $n \times q \times g$, where $n$ is the number of samples, $q$ is the number of factors and $g$ is the number of components.


Means of the estimated conditional expected factor scores over the estimated posterior distributions ($\hat{u}_{j}$). Size is $n \times q$.


Alternative estimate of Umean in which the posterior probabilities for each sample are replaced by component indicator vectors, which contain a one in the element corresponding to the highest posterior probability and zeros elsewhere ($\hat{u}_{j}^C$). Size is $n \times q$.


Geoff McLachlan, Suren Rathnayake, Jungsun Baek


McLachlan GJ, Baek J, and Rathnayake SI (2011). Mixtures of factor analyzers for the analysis of high-dimensional data. In Mixture Estimation and Applications, KL Mengersen, CP Robert, and DM Titterington (Eds). Hoboken, New Jersey: Wiley, pp. 171–191.

McLachlan GJ, and Peel D (2000). Finite Mixture Models. New York: Wiley.


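# Load the package that provides mcfa(), factor_scores() and plot_factors()
library(EMMIXmfa)

# Fix the seed so that the random subset is reproducible
set.seed(1)

# Fit a MCFA model to a random subset of 50 observations
samp_size <- nrow(iris)
sel_subset <- sample(seq_len(samp_size), 50)
model <- mcfa(iris[sel_subset, -5], g = 3, q = 2,
              nkmeans = 1, nrandom = 0, itmax = 100)

# Plot the data points in the factor space
plot(model)

# Allocate the held-out samples to the clusters
Y <- as.matrix(iris[-sel_subset, -5])
clust <- predict(model, Y)

# Compute factor scores for the new data and visualize them in the factor space
fa_scores <- factor_scores(model, Y)
plot_factors(fa_scores, type = "Umean", clust = clust)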

[Package EMMIXmfa version 2.0.14 Index]