mixregRM2 {MixSemiRob} | R Documentation |
Robust Mixture Regression with Thresholding-Embedded EM Algorithm for Penalized Estimation
Description
A robust mixture regression model that simultaneously conducts outlier detection and robust parameter estimation. It uses a sparse, case-specific, and scale-dependent mean-shift mixture model parameterization (Yu et al., 2017):
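$f(y_i \mid \boldsymbol{x}_i, \boldsymbol{\theta}, \boldsymbol{\gamma}_i) = \sum_{j=1}^{C} \pi_j \phi(y_i; \boldsymbol{x}_i^{\top}\boldsymbol{\beta}_j + \gamma_{ij}\sigma_j, \sigma_j^2)$, $i = 1, \ldots, n$, where
$C$ is the number of components in the model,
$\phi(\cdot; \mu, \sigma^2)$ is the normal density with mean $\mu$ and variance $\sigma^2$,
$\boldsymbol{\theta} = (\pi_1, \boldsymbol{\beta}_1, \sigma_1, \ldots, \pi_C, \boldsymbol{\beta}_C, \sigma_C)^{\top}$ is the parameter to estimate,
and
$\boldsymbol{\gamma}_i = (\gamma_{i1}, \ldots, \gamma_{iC})^{\top}$ is a vector of mean-shift parameters for the ith observation.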
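As an illustration of the parameterization, the following minimal sketch evaluates a mean-shift mixture density for a single observation. It assumes the density has the form sum over j of pi_j * N(y; x'beta_j + gamma_j * sigma_j, sigma_j^2), as in Yu et al. (2017). The function name meanshift_density, the argument name prop, and all numeric values are hypothetical and are not part of the package.

```r
# Illustrative sketch only: not the package's internal code.
# Assumed density: sum_j prop_j * N(y; x' beta_j + gamma_j * sigma_j, sigma_j^2)
meanshift_density <- function(y, x, prop, beta, sigma, gamma) {
  xi <- c(1, x)                                  # prepend the intercept term
  mu <- as.vector(beta %*% xi) + gamma * sigma   # component means plus scaled shift
  sum(prop * dnorm(y, mean = mu, sd = sigma))    # mix the component densities
}

# Made-up values for C = 2 components and p = 1 explanatory variable
prop  <- c(0.5, 0.5)
beta  <- rbind(c(0, 1), c(2, -1))   # C by (p + 1) coefficient matrix
sigma <- c(1, 0.5)

# gamma = 0 gives the ordinary mixture regression density
d0 <- meanshift_density(y = 1, x = 1, prop, beta, sigma, gamma = c(0, 0))
# a nonzero gamma moves the first component mean by gamma_1 * sigma_1
d1 <- meanshift_density(y = 1, x = 1, prop, beta, sigma, gamma = c(2, 0))
```

Because the shift is multiplied by sigma_j, the same value of gamma represents the same number of standard deviations in every component, which is what makes the parameterization scale-dependent.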
Usage
mixregRM2(x, y, C = 2, ini = NULL, nstart = 20, tol = 1e-02, maxiter = 50,
method = c("HARD", "SOFT"), sigma.const = 0.001, lambda = 0.001)
Arguments
x |
an n by p data matrix where n is the number of observations and p is the number of explanatory variables. The intercept term will automatically be added to the data. |
y |
an n-dimensional vector of response variable values. |
C |
number of mixture components. Default is 2. |
ini |
initial values for the parameters. Default is NULL, which obtains the initial values
using the |
nstart |
number of initializations to try. Default is 20. |
tol |
stopping criterion (threshold value) for the EM algorithm. Default is 1e-02. |
maxiter |
maximum number of iterations for the EM algorithm. Default is 50. |
method |
character, determining which threshold method to use: "HARD" or "SOFT". See Details. |
sigma.const |
constraint on the ratio of minimum and maximum values of sigma. Default is 0.001. |
lambda |
tuning parameter in the penalty term. Default is 0.001. It can be selected using BIC. See Yu et al. (2017) for more details. |
Details
The parameters are estimated by maximizing the corresponding penalized log-likelihood function using an EM algorithm.
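The thresholding rule involves the estimation of $\gamma_{ij}$ and depends on the penalty:
Soft threshold: $\hat{\gamma}_{ij} = \mathrm{sgn}(\epsilon_{ij})(|\epsilon_{ij}| - \lambda_{ij}^*)_{+}$, corresponding to the $\ell_1$ penalty.
Hard threshold: $\hat{\gamma}_{ij} = \epsilon_{ij} I(|\epsilon_{ij}| > \lambda_{ij}^*)$, corresponding to the $\ell_0$ penalty.
Here, $\epsilon_{ij} = (y_i - \boldsymbol{x}_i^{\top}\boldsymbol{\beta}_j)/\sigma_j$ and $(\cdot)_{+} = \max(\cdot, 0)$. Also, $\lambda_{ij}^*$ is taken as $\lambda / p_{ij}^{(k+1)}$ for soft threshold and $\lambda / \sqrt{p_{ij}^{(k+1)}}$ for hard threshold, where $p_{ij}^{(k+1)}$ is the posterior probability that observation i belongs to component j at iteration k + 1.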
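The two thresholding operators can be sketched in R as below. This is a minimal illustration of the standard soft and hard thresholding functions, not the package's internal code. The function names soft_threshold and hard_threshold, the residuals in eps, and the threshold value 1 are hypothetical.

```r
# Soft thresholding: shrink each value toward zero by lambda_star,
# setting it to zero when its magnitude does not exceed lambda_star
soft_threshold <- function(eps, lambda_star) {
  sign(eps) * pmax(abs(eps) - lambda_star, 0)
}

# Hard thresholding: keep a value unchanged when its magnitude exceeds
# lambda_star, otherwise set it to zero
hard_threshold <- function(eps, lambda_star) {
  eps * (abs(eps) > lambda_star)
}

# Made-up standardized residuals and a common threshold of 1
eps <- c(-3, -0.5, 0.2, 1.5, 4)
soft_est <- soft_threshold(eps, 1)   # -2, 0, 0, 0.5, 3
hard_est <- hard_threshold(eps, 1)   # -3, 0, 0, 1.5, 4
```

Both rules return exactly zero for small residuals, so only observations with large standardized residuals receive a nonzero mean shift. The soft rule also shrinks the surviving values, while the hard rule leaves them unchanged.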
Value
A list containing the following elements:
pi |
C-dimensional vector of estimated mixing proportions. |
beta |
C by (p + 1) matrix of estimated regression coefficients. |
sigma |
C-dimensional vector of estimated standard deviations. |
gamma |
n-dimensional vector of estimated mean shift values. |
posterior |
n by C matrix of posterior probabilities of each observation belonging to each component. |
run |
total number of iterations after convergence. |
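The returned list might be inspected as sketched below. The object est is a hand-made stand-in that only reuses the documented element names, and all of its values are invented. The sketch also assumes that a nonzero entry of gamma marks an observation as an outlier, which is how the sparse mean-shift formulation is usually read.

```r
# Mock result with the documented element names; the values are made up
est <- list(
  pi        = c(0.6, 0.4),
  gamma     = c(0, 0, 3.2, 0, -4.1),
  posterior = rbind(c(0.9, 0.1), c(0.2, 0.8), c(0.5, 0.5),
                    c(0.7, 0.3), c(0.1, 0.9))
)

# Observations with a nonzero estimated mean shift (assumed to be outliers)
outliers <- which(est$gamma != 0)              # 3, 5

# Most likely component for each observation (ties go to the first component)
labels <- apply(est$posterior, 1, which.max)   # 1, 2, 1, 1, 2
```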
References
Yu, C., Yao, W., and Chen, K. (2017). A new method for robust mixture regression. Canadian Journal of Statistics, 45(1), 77-94.
See Also
mixreg for initial value calculation.
Examples
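# Load the tone data and extract the response and the predictor
data(tone)
y = tone$tuned
x = tone$stretchratio
# Set observations 151 to 160 to the point (x, y) = (0, 5) so they act as outliers
k = 160
x[151:k] = 0
y[151:k] = 5
# Fit the robust mixture regression (C = 2 by default) with lambda = 1
est_RM2 = mixregRM2(x, y, lambda = 1)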