semimrOne {MixSemiRob}    R Documentation

Semiparametric Mixture Regression Models with Single-index and One-step Backfitting

Description

Assume that $\boldsymbol{x} = (\boldsymbol{x}_1,\cdots,\boldsymbol{x}_n)$ is an n by p matrix and $Y = (Y_1,\cdots,Y_n)$ is an n-dimensional vector of responses. The conditional distribution of $Y$ given $\boldsymbol{x}$ can be written as:

$$f(y|\boldsymbol{x},\boldsymbol{\alpha},\pi,m,\sigma^2) = \sum_{j=1}^C\pi_j(\boldsymbol{\alpha}^{\top}\boldsymbol{x}) \phi(y|m_j(\boldsymbol{\alpha}^{\top}\boldsymbol{x}),\sigma_j^2(\boldsymbol{\alpha}^{\top}\boldsymbol{x})).$$

‘semimrOne’ is used to estimate the mixture of single-index models described above, where $\phi(y|m_j(\boldsymbol{\alpha}^{\top}\boldsymbol{x}),\sigma_j^2(\boldsymbol{\alpha}^{\top}\boldsymbol{x}))$ represents the normal density with mean $m_j(\boldsymbol{\alpha}^{\top}\boldsymbol{x})$ and variance $\sigma_j^2(\boldsymbol{\alpha}^{\top}\boldsymbol{x})$, and $\pi_j(\cdot), m_j(\cdot), \sigma_j^2(\cdot)$ are unknown smooth single-index functions capable of handling high-dimensional nonparametric problems. This function employs kernel regression and a one-step estimation procedure (Xiang and Yao, 2020).
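As an illustration of the mixture density above, the following base R sketch evaluates $f(y|\boldsymbol{x},\cdot)$ at a single observation. The functions `pi_fun`, `m_fun`, and `var_fun`, the helper `mix_density`, and all numeric values are hypothetical choices made only for this example (with C = 2); they are not part of MixSemiRob and are not what the package estimates internally.

```r
# Hypothetical two-component (C = 2) single-index functions, for illustration only
pi_fun  <- function(u) { p1 <- plogis(u); c(p1, 1 - p1) }  # mixing proportions, sum to 1
m_fun   <- function(u) c(u, -u)                            # component means
var_fun <- function(u) c(1, 0.25)                          # component variances

# Mixture density: sum_j pi_j(u) * phi(y | m_j(u), sigma_j^2(u)), with u = alpha' x
mix_density <- function(y, x, alpha, pi_fun, m_fun, var_fun) {
  u <- sum(alpha * x)  # the single index alpha' x
  sum(pi_fun(u) * dnorm(y, mean = m_fun(u), sd = sqrt(var_fun(u))))
}

alpha <- c(0.6, 0.8, 0)   # assumed index direction (unit length)
x_i   <- c(1, 0.5, 2)     # one row of x
dens  <- mix_density(0.3, x_i, alpha, pi_fun, m_fun, var_fun)
```

Because the mixing proportions sum to one at every value of the index, the result is a proper density in y for each fixed $\boldsymbol{x}$.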

Usage

semimrOne(x, y, h, coef = NULL, ini = NULL, grid = NULL)

Arguments

x

an n by p matrix of observations where n is the number of observations and p is the number of explanatory variables.

y

a vector of response values.

h

bandwidth for the kernel regression. If NULL, the bandwidth is computed in the function by cross-validation.

coef

initial value of $\boldsymbol{\alpha}^{\top}$ in the model, which plays the role of the regression coefficients in a regression model. Default is NULL, in which case the value is computed in the function by sliced inverse regression (Li, 1991).

ini

initial values for the parameters. Default is NULL, in which case the initial values are obtained by assuming a linear mixture model. If specified, it can be a list of the form list(pi, mu, var), where pi is a vector of mixing proportions, mu is a vector of component means, and var is a vector of component variances.

grid

grid points at which nonparametric functions are estimated. Default is NULL, which uses the estimated mixing proportions, component means, and component variances as the grid points after the algorithm converges.
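To make the roles of `h` and `grid` concrete, here is a minimal base R sketch of a Gaussian-kernel local constant (Nadaraya-Watson) estimate evaluated at a set of grid points. It is a generic kernel regression on simulated data, not the estimator implemented inside `semimrOne`; the helper `nw_estimate`, the simulated data, and the bandwidth values are assumptions made for illustration.

```r
# Local constant kernel estimate of E[y | z] at each grid point, with bandwidth h
nw_estimate <- function(z, y, grid, h) {
  sapply(grid, function(u) {
    w <- dnorm((z - u) / h)  # Gaussian kernel weights centered at u
    sum(w * y) / sum(w)      # weighted average of the responses
  })
}

set.seed(1)
z <- runif(200, -1, 1)                   # plays the role of the index alpha' x
y <- sin(pi * z) + rnorm(200, sd = 0.1)  # smooth signal plus noise

grid <- seq(-0.8, 0.8, by = 0.4)         # points at which the curve is estimated
fit_small <- nw_estimate(z, y, grid, h = 0.1)  # small bandwidth: follows the curve
fit_large <- nw_estimate(z, y, grid, h = 1)    # large bandwidth: oversmooths
```

A small bandwidth tracks local structure at the cost of more variance, while a large one averages over most of the sample, which is why the bandwidth is typically chosen by a data-driven criterion such as cross-validation.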

Value

A list containing the following elements:

pi

estimated mixing proportions.

mu

estimated component means.

var

estimated component variances.

coef

estimated regression coefficients.

References

Xiang, S. and Yao, W. (2020). Semiparametric mixtures of regressions with single-index for model based clustering. Advances in Data Analysis and Classification, 14(2), 261-292.

Li, K. C. (1991). Sliced inverse regression for dimension reduction. Journal of the American Statistical Association, 86(414), 316-327.

See Also

semimrFull, sinvreg for initial value calculation of $\boldsymbol{\alpha}^{\top}$.
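For readers unfamiliar with sliced inverse regression, the idea behind the initial value of the index direction can be sketched in base R as follows. This is a simplified illustration of the method in Li (1991), not the source code of `sinvreg`; the function name `sir_direction`, the argument `n_slices`, and the simulated data are hypothetical.

```r
# Simplified sliced inverse regression: returns one unit-length direction
sir_direction <- function(x, y, n_slices = 5) {
  x <- as.matrix(x)
  n <- nrow(x)
  p <- ncol(x)

  # Standardize the predictors: z = (x - mean) %*% Sigma^(-1/2)
  eig <- eigen(cov(x), symmetric = TRUE)
  sigma_inv_sqrt <- eig$vectors %*% diag(1 / sqrt(eig$values), nrow = p) %*% t(eig$vectors)
  z <- sweep(x, 2, colMeans(x)) %*% sigma_inv_sqrt

  # Slice the observations by the order of y
  slice_id <- cut(rank(y, ties.method = "first"), breaks = n_slices, labels = FALSE)

  # Weighted covariance matrix of the slice means of z
  m <- matrix(0, p, p)
  for (s in unique(slice_id)) {
    idx <- slice_id == s
    zbar <- colMeans(z[idx, , drop = FALSE])
    m <- m + (sum(idx) / n) * tcrossprod(zbar)
  }

  # Leading eigenvector, mapped back to the original predictor scale
  eta <- eigen(m, symmetric = TRUE)$vectors[, 1]
  beta <- sigma_inv_sqrt %*% eta
  as.numeric(beta / sqrt(sum(beta^2)))
}

set.seed(42)
x_sim <- matrix(rnorm(500 * 3), ncol = 3)
a_true <- c(1, 2, 0) / sqrt(5)                            # assumed true direction
y_sim <- as.numeric(x_sim %*% a_true) + rnorm(500, sd = 0.1)
a_hat <- sir_direction(x_sim, y_sim)
```

The sign of an eigenvector is arbitrary, so the estimated direction should be compared with the true one only up to sign, for example through the absolute value of their inner product.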

Examples

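library(MixSemiRob)

# predictors (columns 1, 2, 4) and response (column 3) from the NBA data
xx = NBA[, c(1, 2, 4)]
yy = NBA[, 3]

# scale each predictor and the response by its standard deviation
x = xx/t(matrix(rep(sqrt(diag(var(xx))), length(yy)), nrow = 3))
y = yy/sd(yy)

# initial index direction from sliced inverse regression
ini_bs = sinvreg(x, y)
ini_b = ini_bs$direction[, 1]

# used a smaller sample for a quicker demonstration of the function
set.seed(123)
est_onestep = semimrOne(x[1:50, ], y[1:50], h = 0.3442, coef = ini_b)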

[Package MixSemiRob version 1.1.0 Index]