R: Semiparametric Mixture Regression Models with Single-index...

semimrFull {MixSemiRob}

R Documentation

Semiparametric Mixture Regression Models with Single-index Proportion and Fully Iterative Backfitting

Description

Assume that \boldsymbol{x} = (\boldsymbol{x}_1,\cdots,\boldsymbol{x}_n) is an n by p matrix of explanatory variables and Y = (Y_1,\cdots,Y_n) is an n-dimensional vector of responses. The conditional distribution of Y given \boldsymbol{x} can be written as:

f(y|\boldsymbol{x},\boldsymbol{\alpha},\pi,m,\sigma^2) = \sum_{j=1}^C\pi_j(\boldsymbol{\alpha}^{\top}\boldsymbol{x}) \phi(y|m_j(\boldsymbol{\alpha}^{\top}\boldsymbol{x}),\sigma_j^2(\boldsymbol{\alpha}^{\top}\boldsymbol{x})).

‘semimrFull’ is used to estimate the mixture of single-index models described above, where \phi(y|m_j(\boldsymbol{\alpha}^{\top}\boldsymbol{x}),\sigma_j^2(\boldsymbol{\alpha}^{\top}\boldsymbol{x})) represents the normal density with mean m_j(\boldsymbol{\alpha}^{\top}\boldsymbol{x}) and variance \sigma_j^2(\boldsymbol{\alpha}^{\top}\boldsymbol{x}), and \pi_j(\cdot), m_j(\cdot), \sigma_j^2(\cdot) are unknown smooth single-index functions. Because each function depends on \boldsymbol{x} only through the one-dimensional index \boldsymbol{\alpha}^{\top}\boldsymbol{x}, the model can handle high-dimensional nonparametric problems. This function employs kernel regression and a fully iterative backfitting (FIB) estimation procedure (Xiang and Yao, 2020).
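
To make the model formula concrete, the following base R sketch evaluates the mixture density for C = 2 components. The direction `alpha` and the functional forms chosen for \pi_j, m_j, and \sigma_j^2 are hypothetical and picked only for illustration; in practice these are the unknown quantities that ‘semimrFull’ estimates, and the helper `mix_density` is not part of MixSemiRob.

```r
# Hypothetical single-index functions for C = 2 components (illustration only).
alpha <- c(1, 1, 1) / sqrt(3)                      # index direction, unit norm
pi1   <- function(z) plogis(z)                     # mixing proportion of component 1
m     <- list(function(z) 2 + sin(z),              # component mean functions
              function(z) -2 + z)
s2    <- list(function(z) 0.5 + 0 * z,             # component variance functions
              function(z) 1 + 0.25 * z^2)

# Mixture density f(y | x), which depends on x only through z = alpha' x.
mix_density <- function(y, x) {
  z <- sum(alpha * x)
  p <- c(pi1(z), 1 - pi1(z))                       # proportions sum to one
  sum(sapply(1:2, function(j) p[j] * dnorm(y, m[[j]](z), sqrt(s2[[j]](z)))))
}

x0 <- c(0.2, -0.1, 0.4)
dens <- mix_density(1.5, x0)

# Sanity check: for a fixed x the density integrates to one over y.
total <- integrate(function(y) sapply(y, mix_density, x = x0), -Inf, Inf)$value
```

Whatever the dimension p of \boldsymbol{x}, every function in the sketch takes a scalar argument, which is the sense in which the single-index structure reduces the nonparametric problem to one dimension.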

Usage

semimrFull(x, y, h = NULL, coef = NULL, ini = NULL, grid = NULL, maxiter = 100)

Arguments

`x`	an n by p matrix of observations where n is the number of observations and p is the number of explanatory variables.
`y`	an n-dimensional vector of response values.
`h`	bandwidth for the kernel regression. Default is NULL, and the bandwidth is computed in the function by cross-validation.
`coef`	initial value of `\boldsymbol{\alpha}^{\top}` in the model, which plays the role of the regression coefficient in a regression model. Default is NULL, in which case the value is computed in the function by sliced inverse regression (Li, 1991).
`ini`	initial values for the parameters. Default is NULL, in which case the initial values are obtained by assuming a linear mixture model. If specified, it must be a list of the form `list(pi, mu, var)`, where `pi` is a vector of mixing proportions, `mu` is a vector of component means, and `var` is a vector of component variances.
`grid`	grid points at which nonparametric functions are estimated. Default is NULL, which uses the estimated mixing proportions, component means, and component variances as the grid points after the algorithm converges.
`maxiter`	maximum number of iterations. Default is 100.
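
As an illustration of what the bandwidth `h` and the `grid` argument control, here is a minimal Nadaraya-Watson kernel smoother in base R, fitted to simulated data. It is a generic sketch of kernel regression, not the estimator implemented inside ‘semimrFull’; the helper name `nw_smooth`, the Gaussian kernel, and the simulated data are all assumptions made for the example.

```r
# Nadaraya-Watson estimator with a Gaussian kernel
# (hypothetical helper, not part of MixSemiRob).
nw_smooth <- function(z, y, grid, h) {
  sapply(grid, function(g) {
    w <- dnorm((z - g) / h)      # kernel weights centred at grid point g
    sum(w * y) / sum(w)          # locally weighted average of the responses
  })
}

set.seed(1)
z <- sort(runif(200, -2, 2))                 # plays the role of the index alpha' x
y <- sin(z) + rnorm(200, sd = 0.2)           # noisy observations of a smooth curve
grid <- seq(-1.5, 1.5, length.out = 7)       # points at which the curve is estimated

fit_small <- nw_smooth(z, y, grid, h = 0.1)  # small bandwidth: follows the data closely
fit_large <- nw_smooth(z, y, grid, h = 1.5)  # large bandwidth: heavily smoothed

cbind(grid, truth = sin(grid), fit_small, fit_large)
```

A bandwidth that is too large flattens the estimated curve, while one that is too small makes it noisy, which is why the default selects `h` by cross-validation rather than using a fixed value.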

Value

A list containing the following elements:

`pi`	matrix of estimated mixing proportions.
`mu`	estimated component means.
`var`	estimated component variances.
`coef`	estimated regression coefficients.
`run`	total number of iterations performed by the algorithm.

References

Xiang, S. and Yao, W. (2020). Semiparametric mixtures of regressions with single-index for model based clustering. Advances in Data Analysis and Classification, 14(2), 261-292.

Li, K. C. (1991). Sliced inverse regression for dimension reduction. Journal of the American Statistical Association, 86(414), 316-327.

Examples

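library(MixSemiRob)  # package documented on this page

# Predictors (columns 1, 2, 4) and response (column 3) of the NBA data
xx = NBA[, c(1, 2, 4)]
yy = NBA[, 3]
# Scale each predictor column and the response to unit standard deviation
x = xx/t(matrix(rep(sqrt(diag(var(xx))), length(yy)), nrow = 3))
y = yy/sd(yy)
# Initial index direction from sliced inverse regression
ini_bs = sinvreg(x, y)
ini_b = ini_bs$direction[, 1]
# Fit on the first 50 observations with a fixed bandwidth
est = semimrFull(x[1:50, ], y[1:50], h = 0.3442, coef = ini_b)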
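
The line that constructs `x` divides every column of `xx` by its standard deviation. The sketch below checks this on simulated data and compares it with two more common base R idioms; the simulated matrix is only a stand-in for the NBA columns, and the names `x_doc`, `x_sweep`, and `x_scale` are introduced here for illustration.

```r
set.seed(42)
xx <- matrix(rnorm(30, mean = 5, sd = 2), nrow = 10, ncol = 3)  # simulated stand-in data
yy <- rnorm(10)

# Idiom from the example: build an n x 3 matrix whose rows repeat the
# column standard deviations, then divide elementwise.
x_doc <- xx / t(matrix(rep(sqrt(diag(var(xx))), length(yy)), nrow = 3))

# Equivalent base R alternatives.
x_sweep <- sweep(xx, 2, apply(xx, 2, sd), "/")
x_scale <- scale(xx, center = FALSE, scale = apply(xx, 2, sd))

all.equal(x_doc, x_sweep)                            # same values
all.equal(x_doc, x_scale, check.attributes = FALSE)  # scale() adds attributes
apply(x_doc, 2, sd)                                  # each column now has unit sd
```

The columns are scaled but not centred, so only the spread of each predictor is put on a common footing.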

[Package MixSemiRob version 1.1.0 Index]