kdeem.h {MixSemiRob}

R Documentation

Kernel Density-based EM-type algorithm for Semiparametric Mixture Regression with Unspecified Homogeneous Error Distributions

Description

‘kdeem.h’ fits a semiparametric mixture regression model using a kernel density-based expectation-maximization (EM)-type algorithm in which the error distribution is left unspecified but assumed homogeneous, that is, identical across components (Hunter and Young, 2012).

Usage

kdeem.h(x, y, C = 2, ini = NULL, maxiter = 200)

Arguments

`x`	an n by p data matrix where n is the number of observations and p is the number of explanatory variables (including the intercept).
`y`	an n-dimensional vector of response values.
`C`	number of mixture components. Default is 2.
`ini`	initial values for the parameters. Default is NULL, which obtains the initial values using the `kdeem.lse` function. If specified, it can be a list with the form of `list(beta, prop, tau, pi, h)`, where `beta` is a p by C matrix for regression coefficients of C components, `prop` is an n by C matrix for probabilities of each observation belonging to each component, calculated based on the initial `beta` and `h`, `tau` is a vector of C precision parameters (inverse of standard deviation), `pi` is a vector of C mixing proportions, and `h` is the bandwidth for kernel estimation.
`maxiter`	maximum number of iterations for the algorithm. Default is 200.
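
As an illustration of the shapes described for `ini`, the following minimal sketch builds such a list in base R from simulated data. The simulated data, the perturbed starting `beta`, and the Gaussian stand-in used to compute `prop` are all assumptions made for this example; they are not the initialization performed by `kdeem.lse`.

```r
set.seed(1)
n <- 200; C <- 2
x <- cbind(1, runif(n))                       # n by p design matrix with intercept column
p <- ncol(x)
z <- sample(1:C, n, replace = TRUE)           # latent labels (known only because we simulate)
beta_true <- matrix(c(-3, 3, 3, -3), p, C)    # hypothetical true coefficients
y <- rowSums(x * t(beta_true[, z])) + rnorm(n, sd = 0.5)

# Hypothetical starting values with the dimensions listed under `ini`
beta0 <- beta_true + matrix(rnorm(p * C, sd = 0.3), p, C)  # p by C
pi0   <- rep(1 / C, C)                                     # C mixing proportions
tau0  <- rep(1, C)                                         # C precisions (1 / sd)

res <- as.vector(y) - x %*% beta0             # n by C matrix of residuals
nearest <- apply(abs(res), 1, which.min)      # component with the smallest residual
h0 <- bw.SJ(res[cbind(seq_len(n), nearest)])  # bandwidth from those residuals

# Assumption: Gaussian stand-in for the component densities when forming `prop`
dens  <- sapply(1:C, function(j) pi0[j] * dnorm(res[, j], sd = 1 / tau0[j]))
prop0 <- dens / rowSums(dens)                 # n by C, rows sum to 1

ini <- list(beta = beta0, prop = prop0, tau = tau0, pi = pi0, h = h0)

# With MixSemiRob installed, the list could then be passed as:
# fit <- MixSemiRob::kdeem.h(x, y, C = C, ini = ini, maxiter = 200)
```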

Details

‘kdeem.h’ estimates the parameters of a mixture-of-regressions model with independent and identically distributed (i.i.d.) errors. The model is defined as follows:

f_{Y|\boldsymbol{X}}(y,\boldsymbol{x},\boldsymbol{\theta},g) = \sum_{j=1}^C\pi_jg(y-\boldsymbol{x}^{\top}\boldsymbol{\beta}_j).

Here, \boldsymbol{\theta}=(\pi_1,\ldots,\pi_{C-1},\boldsymbol{\beta}_1^{\top},\ldots,\boldsymbol{\beta}_C^{\top}), and g(\cdot) is an unspecified error density shared by all C components. The bandwidth of the kernel density estimate is chosen adaptively using the bw.SJ function from the ‘stats’ package, which implements the method of Sheather & Jones (1991) for bandwidth selection based on pilot estimation of derivatives.
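
To make the model concrete, the sketch below simulates data with non-Gaussian errors, selects a bandwidth with `bw.SJ`, forms a Gaussian-kernel estimate of g, and evaluates the mixture density at one point. The simulated data, the helper names `g_hat` and `f_mix`, and the use of the true labels to obtain residuals are assumptions for illustration only; inside the algorithm the residuals would be weighted by posterior probabilities instead.

```r
set.seed(2)
n <- 300
x <- cbind(1, runif(n))                    # design matrix with intercept
beta <- matrix(c(-3, 3, 3, -3), 2, 2)      # hypothetical coefficients, p by C
pi_mix <- c(0.4, 0.6)                      # hypothetical mixing proportions
z <- sample(1:2, n, replace = TRUE, prob = pi_mix)
eps <- rexp(n) - 1                         # skewed, mean-zero errors shared by both components
y <- rowSums(x * t(beta[, z])) + eps

# Residuals under the true labels (possible only because the data are simulated)
r <- y - rowSums(x * t(beta[, z]))
h <- bw.SJ(r)                              # Sheather-Jones bandwidth from 'stats'

# Gaussian-kernel density estimate of the common error density g
g_hat <- function(u) sapply(u, function(ui) mean(dnorm((ui - r) / h)) / h)

# Mixture density f(y0 | x0) = sum_j pi_j * g(y0 - x0' beta_j)
f_mix <- function(y0, x0) sum(pi_mix * g_hat(y0 - drop(x0 %*% beta)))

f_mix(0, c(1, 0.5))
```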

In the M-step, \boldsymbol{\beta} is updated numerically with the general-purpose unconstrained optimizer ucminf from the ‘ucminf’ package.
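
A rough sketch of what such a numerical update of the coefficients might look like is given below. It assumes the objective is the posterior-weighted log of a kernel density estimate evaluated at the residuals, with the density estimate held fixed from the previous iteration, and it uses `optim` from ‘stats’ as a stand-in so that the example runs without extra packages. The objective, the starting state, and all variable names are assumptions, not the package's internal code.

```r
set.seed(3)
n <- 150; C <- 2
x <- cbind(1, runif(n))
beta_true <- matrix(c(-3, 3, 3, -3), 2, C)       # hypothetical true coefficients
z <- sample(1:C, n, replace = TRUE)
y <- rowSums(x * t(beta_true[, z])) + rnorm(n, sd = 0.5)

# Hypothetical current state of the algorithm, held fixed during this M-step
beta_old <- beta_true + 0.2
post  <- cbind(z == 1, z == 2) * 0.9 + 0.05      # soft posteriors, rows sum to 1
r_old <- as.vector(as.vector(y) - x %*% beta_old) # all n * C residuals, column-major
w_old <- as.vector(post)                         # matching posterior weights
h <- bw.SJ(r_old[w_old > 0.5])                   # bandwidth from high-weight residuals

# Negative weighted log-likelihood with a weighted Gaussian-kernel estimate of g
neg_Q <- function(b) {
  B <- matrix(b, ncol = C)
  res <- as.vector(as.vector(y) - x %*% B)
  g <- drop(dnorm(outer(res, r_old, "-") / h) %*% w_old) / (n * h)
  -sum(w_old * log(g + 1e-300))                  # small constant guards against log(0)
}

# Stand-in optimizer: optim/BFGS instead of ucminf
fit <- optim(as.vector(beta_old), neg_Q, method = "BFGS")
beta_new <- matrix(fit$par, ncol = C)
```

Because `neg_Q` takes a single parameter vector and returns a scalar, the same function could be handed to another unconstrained optimizer in place of `optim`.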

Value

A list containing the following elements:

`posterior`	posterior probabilities of each observation belonging to each component.
`beta`	estimated regression coefficients.
`pi`	estimated mixing proportions.
`h`	bandwidth used for the kernel estimation.
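
The returned elements can be combined for downstream tasks such as hard classification. The sketch below uses a small hand-made list as a hypothetical stand-in for a fitted object, assuming `posterior` is an n by C matrix and `beta` is a p by C matrix as described for `ini`; the numbers are invented for illustration.

```r
# Hypothetical stand-in for the list returned by kdeem.h (values invented)
fit <- list(
  posterior = matrix(c(0.9, 0.2, 0.6,
                       0.1, 0.8, 0.4), nrow = 3),  # n = 3 observations, C = 2
  beta = matrix(c(-3, 3, 3, -3), nrow = 2),        # p = 2 coefficients per component
  pi = c(0.55, 0.45),
  h = 0.2
)
x_new <- cbind(1, c(0.1, 0.5, 0.9))                # design rows for the 3 observations

# Assign each observation to its most probable component
cluster <- apply(fit$posterior, 1, which.max)

# Component-wise fitted values, then pick the one for the assigned component
pred  <- x_new %*% fit$beta                        # n by C
y_hat <- pred[cbind(seq_along(cluster), cluster)]
```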

References

Hunter, D. R., & Young, D. S. (2012). Semiparametric mixtures of regressions. Journal of Nonparametric Statistics, 24(1), 19-38.

Ma, Y., Wang, S., Xu, L., & Yao, W. (2021). Semiparametric mixture regression with unspecified error distributions. Test, 30, 429-444.

Examples

# See examples for the `kdeem` function.

[Package MixSemiRob version 1.1.0 Index]