rrmix {rrMixture} | R Documentation |
Reduced-Rank Mixture Models in Multivariate Regression
Description
‘rrmix’ estimates the parameters of reduced-rank mixture models in multivariate linear regression using the full-ranked, rank-penalized, and adaptive nuclear norm penalized estimators proposed by Kang et al. (2022+).
Usage
rrmix(K = 2, X, Y, est = c("FR", "RP", "ANNP"),
lambda = 0, gamma = 2, ind0 = NULL, para0 = NULL, seed = NULL,
kmscale = FALSE, km.nstart = 20, n.init = 100, commonvar = FALSE,
maxiter = 1000, maxiter.int = 100, thres = 1e-05, thres.int = 1e-05,
visible = FALSE, para.true = NULL, ind.true = NULL)
Arguments
K |
number of mixture components. |
X |
n by p design matrix where n is the number of observations and p is the number of predictors. |
Y |
n by q response matrix where n is the number of observations and q is the number of responses. |
est |
character, specifying the estimation method. ‘FR’, ‘RP’, and ‘ANNP’ refer to the full-ranked, rank-penalized, and adaptive nuclear norm penalized methods, respectively. |
lambda |
numerical value, specifying the tuning parameter. Only used by the ‘RP’ and ‘ANNP’ estimation methods. If 0, all estimation methods (‘FR’, ‘RP’, and ‘ANNP’) give the same estimation results. |
gamma |
numerical value, specifying additional tuning parameter, only used in the estimation method of ‘ANNP’. It must be nonnegative. |
ind0 |
vector of length n, specifying the initial mixture membership of the n observations, for use when there is prior information on the membership. If ‘NULL’, K-means clustering is used to assign the initial membership of the n observations. Default is ‘NULL’. |
para0 |
array of length K. It consists of K lists, each of which contains initial values of membership probability, coefficient matrix, and variance-covariance matrix. |
seed |
seed number for the reproducibility of initialization results in the EM algorithm. Default is ‘NULL’. |
kmscale |
logical value, indicating whether Y is scaled prior to K-means clustering for initialization. Default is ‘FALSE’. |
km.nstart |
number of random sets considered to perform K-means clustering for initialization. Default is 20. |
n.init |
number of initializations to try. Two methods for initial clustering are used: K-means and random clustering. |
commonvar |
logical value, indicating the homogeneity assumption of variance-covariance matrices across K mixture components. Default is ‘FALSE’. |
maxiter |
maximum number of iterations for the external EM algorithm, used in all estimation methods. |
maxiter.int |
maximum number of iterations for internal iterative algorithm, only used in the estimation method of ‘ANNP’. |
thres |
threshold value for external EM algorithm, used in all estimation methods. It controls the termination of the EM algorithm. |
thres.int |
threshold value for internal iterative algorithm, only used in the estimation method of ‘ANNP’. It controls the termination of the internal algorithm. |
visible |
logical value, indicating whether the outputs from each iteration are printed. Useful when the whole algorithm takes long. Default is ‘FALSE’. |
para.true |
array of length K. It consists of K lists, each of which contains a coefficient matrix and its true rank. Only used when true models are known, e.g., in a simulation study. |
ind.true |
vector of length n, specifying the true mixture membership for n observations. Only used when true models are known, e.g., in a simulation study. |
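As a rough illustration of the ‘ind0’ argument, the sketch below builds an initial membership vector by hand with ‘stats::kmeans’ on a toy response matrix. The toy data, the object names, and the way ‘kmscale’ and ‘km.nstart’ are mirrored are assumptions made for this example; the initialization that ‘rrmix’ performs internally may differ in detail.

```r
# Toy data (hypothetical): n observations, q responses, two shifted groups
set.seed(1)
n <- 60; q <- 3; K <- 2
Y <- rbind(matrix(rnorm(n / 2 * q), ncol = q),
           matrix(rnorm(n / 2 * q, mean = 4), ncol = q))

# Optionally scale Y first, in the spirit of kmscale = TRUE
kmscale <- FALSE
Ykm <- if (kmscale) scale(Y) else Y

# K-means with several random starts, in the spirit of km.nstart = 20
km <- kmeans(Ykm, centers = K, nstart = 20)

# Integer vector of length n with labels 1..K
# (assumed here to be the coding that 'ind0' expects)
ind0 <- km$cluster
table(ind0)

# The vector could then be passed on, e.g. (not run here):
# fit <- rrmix(K = K, X = X, Y = Y, est = "RP", lambda = 1, ind0 = ind0)
```

Supplying ‘ind0’ this way is mainly useful when prior knowledge suggests a better starting partition than the default.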
Value
An object of class rrmix containing the fitted model, including:
call |
original function call. |
seed |
seed number set for the initialization. |
n.est |
vector of length K, specifying the estimated number of observations in each mixture component. |
para |
array of length K. It consists of K lists, each of which contains final estimates of membership probability, coefficient matrix, and variance-covariance matrix. |
est.rank |
vector of length K, specifying the estimated ranks of coefficient matrices. |
npar |
number of parameters in the model, used to estimate the BIC. |
n.iter |
number of iterations (external EM algorithm). |
lambda |
tuning parameter for the estimation method of 'RP' or 'ANNP'. |
gamma |
tuning parameter for the estimation method of 'ANNP'. |
ind |
vector of length n, specifying the estimated mixture membership for n observations. |
ind.true |
vector of length n, specifying the true mixture membership for n observations. Only returned when the true models are known. |
loglik |
log-likelihood of the final model. |
penloglik |
penalized log-likelihood of the final model. |
penalty |
penalty in the penalized log-likelihood of the final model. |
bic |
BIC of the final model. |
avg.nn.iter |
average number of iterations for internal iterative algorithm, only returned for the estimation method of 'ANNP'. |
resmat |
matrix containing the information for each iteration of the EM algorithm, e.g., iteration number, log-likelihood, penalized log-likelihood, difference between penalized log-likelihood values from two consecutive iterations, and computing time. |
class.err |
soft and hard classification errors for mixture membership. Only returned when the true models are known. |
est.err |
estimation error from the comparison between the estimated and true coefficient matrices. Only returned when the true models are known. |
pred.err |
prediction error. Only returned when the true models are known. |
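The ‘bic’ and ‘est.rank’ components can be used to compare fits across values of ‘lambda’. The following is a minimal sketch of such a comparison, not a recommended tuning procedure: the grid of ‘lambda’ values, the reduced ‘n.init’, the helper object names, and the convention that a smaller BIC is preferred are all assumptions made for illustration.

```r
# Sketch only: requires the rrMixture package; skipped if it is not installed
if (requireNamespace("rrMixture", quietly = TRUE)) {
  library(rrMixture)

  # Simulated data, same call as in the Examples section
  sim <- rrmix.sim.norm(K = 2, n = 100, p = 5, q = 5, rho = .5,
                        b = 1, shift = 1, r.star = c(1, 3), sigma = c(1, 1),
                        pr = c(.5, .5), seed = 1215)

  lambdas <- c(0.5, 1, 2)  # hypothetical grid, chosen only for illustration

  # Fit one model per lambda; a failed fit is recorded as NULL
  fits <- lapply(lambdas, function(l)
    tryCatch(rrmix(K = 2, X = sim$X, Y = sim$Y, est = "RP", lambda = l,
                   seed = 17, n.init = 20),
             error = function(e) NULL))

  # Collect the 'bic' component (NA for failed fits)
  bic <- vapply(fits, function(f) if (is.null(f)) NA_real_ else f$bic,
                numeric(1))
  print(data.frame(lambda = lambdas, bic = bic))

  # Assumption: smaller BIC is better
  if (any(!is.na(bic))) {
    best <- fits[[which.min(bic)]]
    print(best$est.rank)  # estimated ranks under the selected lambda
  }
}
```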
Author(s)
Suyeon Kang, University of California, Riverside, skang062@ucr.edu; Weixin Yao, University of California, Riverside, weixin.yao@ucr.edu; Kun Chen, University of Connecticut, kun.chen@uconn.edu.
References
Kang, S., Chen, K., and Yao, W. (2022+). "Reduced rank estimation in mixtures of multivariate linear regression".
See Also
rrmix.sim.norm, initialize.para
Examples
library(rrMixture)
#-----------------------------------------------------------#
# Real Data Example: Tuna Data
#-----------------------------------------------------------#
require(bayesm)
data(tuna)
tunaY <- log(tuna[, c("MOVE1", "MOVE2", "MOVE3", "MOVE4",
                      "MOVE5", "MOVE6", "MOVE7")])
tunaX <- tuna[, c("NSALE1", "NSALE2", "NSALE3", "NSALE4",
                  "NSALE5", "NSALE6", "NSALE7",
                  "LPRICE1", "LPRICE2", "LPRICE3", "LPRICE4",
                  "LPRICE5", "LPRICE6", "LPRICE7")]
tunaX <- cbind(intercept = 1, tunaX)
# Rank-penalized estimation
tuna.rp <- rrmix(K = 2, X = tunaX, Y = tunaY, lambda = 3, est = "RP",
                 seed = 100, n.init = 100)
summary(tuna.rp)
plot(tuna.rp)
# Adaptive nuclear norm penalized estimation
tuna.annp <- rrmix(K = 2, X = tunaX, Y = tunaY, lambda = 3, gamma = 2,
                   est = "ANNP", seed = 100, n.init = 100)
summary(tuna.annp)
plot(tuna.annp)
#-----------------------------------------------------------#
# Simulation: Two Components Case
#-----------------------------------------------------------#
# Simulation Data
K2mod <- rrmix.sim.norm(K = 2, n = 100, p = 5, q = 5, rho = .5,
                        b = 1, shift = 1, r.star = c(1, 3), sigma = c(1, 1),
                        pr = c(.5, .5), seed = 1215)
# Rank-penalized estimation
K2.rp <- rrmix(K = 2, X = K2mod$X, Y = K2mod$Y, lambda = 1,
               seed = 17, est = "RP", ind.true = K2mod$ind.true,
               para.true = K2mod$para.true, n.init = 100)
summary(K2.rp)
plot(K2.rp)
# Adaptive nuclear norm penalized estimation
K2.annp <- rrmix(K = 2, X = K2mod$X, Y = K2mod$Y, lambda = 1,
                 seed = 17, est = "ANNP", ind.true = K2mod$ind.true,
                 para.true = K2mod$para.true, n.init = 100)
summary(K2.annp)
plot(K2.annp)
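#-----------------------------------------------------------#
# Sketch: Inspecting a Fitted Object (illustrative only)
#-----------------------------------------------------------#
# A minimal sketch, assuming the returned object is a list whose
# components (listed under 'Value') can be read with `$`.
# It reuses K2.rp and K2mod from the simulation example above.
K2.rp$est.rank   # estimated ranks of the K coefficient matrices
K2.rp$bic        # BIC of the final model
K2.rp$n.iter     # number of external EM iterations
# Cross-tabulate estimated against true membership. Component labels
# are arbitrary, so a good fit may appear with the labels swapped.
table(estimated = K2.rp$ind, true = K2mod$ind.true)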