itdr {itdr}    R Documentation

Integral Transformation Methods of Estimating SDR Subspaces in Regression.

Description

The “itdr()” function estimates a basis of a sufficient dimension reduction (SDR) subspace in regression.

Usage

itdr(y, x, d, m = 50, wx = 0.1, wy = 1, wh = 1.5, space = "mean",
     xdensity = "normal", method = "FM", x.scale = TRUE)

Arguments

y

The n-dimensional response vector.

x

The design matrix of the predictors with dimension n-by-p.

d

An integer specifying the dimension of the sufficient dimension reduction subspace.

m

An integer specifying the number of omega values used in the “invFM” method.

wx

(default 0.1). Tuning parameter for predictor variables.

wy

(default 1). Tuning parameter for response variable.

wh

(default 1.5). Bandwidth of the kernel density estimation function.

space

(default “mean”). Specifies whether to estimate the central mean subspace (“mean”) or the central subspace (“pdf”).

xdensity

(default “normal”). Density function of the predictor variables. Options are “normal” for the multivariate normal distribution, “elliptic” for an elliptically contoured distribution, or “kernel” for an unknown distribution estimated by a kernel smoothing method.

method

(default “FM”). Integral transformation method: “FM” for the Fourier transformation method (Zhu and Zeng, 2006), “CM” for the convolution transformation method (Zeng and Zhu, 2010), “iht” for the iterative Hessian transformation method (Cook and Li, 2002), and “invFM” for the Fourier transformation approach to the inverse dimension reduction method (Weng and Yin, 2018).

x.scale

(default TRUE). If TRUE, the predictor variables are scaled.

Details

Let m(\textbf{x})=E[y|\textbf{X}=\textbf{x}]. The “itdr()” function computes the integral transformation of the gradient of the mean function m(\textbf{x}), which is defined as

\boldsymbol\psi(\boldsymbol\omega) =\int \frac{\partial}{\partial \textbf{x}}m(\textbf{x}) W(\textbf{x},\boldsymbol\omega)f(\textbf{x})d\textbf{x},

where W(\textbf{x},\boldsymbol\omega) is a nondegenerate, absolutely integrable kernel function. For the Fourier transformation (FM) method, W(\textbf{x},\boldsymbol\omega)=\exp(i\boldsymbol\omega^T\textbf{x}); for the convolution transformation (CM) method, W(\textbf{x},\boldsymbol\omega)=H(\textbf{x}-\boldsymbol\omega)=(2\pi\sigma_w^2)^{-p/2}\exp(-(\textbf{x}-\boldsymbol{\omega})^T(\textbf{x}-\boldsymbol\omega)/(2\sigma_w^2)), where \sigma_w^2 is the tuning parameter for the predictor variables. The candidate matrix for estimating the central mean subspace (CMS) is

\textbf{M}_{CMS}=\int \boldsymbol\psi(\boldsymbol\omega) \boldsymbol\psi(\boldsymbol\omega)^T K(\boldsymbol\omega)d\boldsymbol\omega,

where K(\boldsymbol{\omega})=(2\pi \sigma_w^2)^{-p/2}\exp(-||\boldsymbol{\omega}||^2/(2\sigma_w^2)) under “FM”, and K(\boldsymbol{\omega})=1 under “CM”. Here, \sigma_w^2 is a tuning parameter, referred to as the "tuning parameter for the predictor variables" and denoted by “wx” in all functions.
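For intuition, under “FM” the weight K(\boldsymbol{\omega}) above is the N(\textbf{0},\sigma_w^2\textbf{I}_p) density, so \textbf{M}_{CMS} is an expectation over \boldsymbol{\omega} and can be approximated by Monte Carlo. The sketch below is purely illustrative and is not the package's internal code: toy_psi, B, and sigma_w2 are made-up names, and a real-valued \boldsymbol{\psi} is used for simplicity.

set.seed(1)
p <- 3
sigma_w2 <- 0.1                                            # tuning parameter (wx)
toy_psi <- function(w) c(sum(w), w[1] - w[2], sin(w[3]))   # hypothetical psi(omega)
B <- 5000                                                  # number of Monte Carlo draws
M <- matrix(0, p, p)
for (b in 1:B) {
  w <- rnorm(p, mean = 0, sd = sqrt(sigma_w2))             # omega ~ N(0, sigma_w^2 I_p)
  M <- M + tcrossprod(toy_psi(w)) / B                      # average psi(omega) psi(omega)^T
}
eigen(M)$vectors[, 1:2]                                    # d leading eigenvectors span the estimate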

Let \{T_v(y)=H(y,v),~ \textrm{for}~ y,v\in \mathcal{R}\} be a family of transformations of the response variable. That is, for each v \in \mathcal{R}, the mean response of T_v(y) is m(\textbf{x},v)=E[H(y,v)\vert \textbf{X}=\textbf{x}]. The integral transformation of the gradient of m(\textbf{x},v) is then defined as

\boldsymbol{\psi}(\boldsymbol{\omega},v)=\int \frac{\partial}{\partial \textbf{x}}m(\textbf{x},v) W(\textbf{x},\boldsymbol{\omega})f(\textbf{x})d\textbf{x},

where W(\textbf{x},\boldsymbol{\omega}) is defined as above. Then, the candidate matrix for estimating the central subspace (CS) is defined as

\textbf{M}_{CS}=\int H(y_1,v)H(y_2,v)dv \int \boldsymbol{\psi}(\boldsymbol{\omega},v) \bar{\boldsymbol{\psi}}(\boldsymbol{\omega},v)^T K(\boldsymbol{\omega})d\boldsymbol{\omega},

where K(\boldsymbol{\omega}) is the same as above, and H(y,v)=(2\pi \sigma_t^2)^{-1/2}\exp(-v^2/(2\sigma_t^2)) under “FM”, and H(y,v)=(2\pi \sigma_t^2)^{-1/2}\exp(-(y-v)^2/(2\sigma_t^2)) under “CM”. Here, \sigma_t^2 is a tuning parameter, referred to as the "tuning parameter for the response variable" and denoted by “wy” in all functions.

Remark: There is only one tuning parameter in the candidate matrix for estimating the CMS, and there are two tuning parameters in the candidate matrix for estimating the CS.
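For example, assuming the response y and a scaled predictor matrix xt have already been prepared (as in the Examples section below), the two targets could be requested as follows; the tuning values shown are arbitrary.

fit.cms <- itdr(y, xt, d = 2, wx = 0.1, space = "mean", method = "FM")          # CMS: one tuning parameter (wx)
fit.cs  <- itdr(y, xt, d = 2, wx = 0.1, wy = 1, space = "pdf", method = "FM")   # CS: two tuning parameters (wx, wy)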

“invFM” method:

Let (\textbf{y}_i,\textbf{x}_i), i =1,\cdots,n, be a random sample, and assume that the dimension of S_{E(\textbf{Z} | \textbf{Y})} is known to be d. Then, for a random finite sequence of vectors \boldsymbol{\omega}_j, j=1,\cdots,t, with the same dimension as the response, compute \widehat{\boldsymbol{\psi}}(\boldsymbol{\omega}_j) as follows (Weng and Yin, 2018):

\widehat{\boldsymbol{\psi}}(\boldsymbol{\omega}_j)=n^{-1}\sum_{k=1}^n \exp( i \boldsymbol{\omega}_j^T\textbf{y}_k)\widehat{\textbf{Z}}_k, j=1,\cdots,t,

where \widehat{\textbf{Z}}_k=\widehat{\boldsymbol{\Sigma}}_{x}^{-1/2}(\textbf{x}_k-\overline{\textbf{x}}). Now, let \textbf{a}(\boldsymbol{\omega}_j)=\mathrm{Re}(\widehat{\boldsymbol{\psi}}(\boldsymbol{\omega}_j)) and \textbf{b}(\boldsymbol{\omega}_j)=\mathrm{Im}(\widehat{\boldsymbol{\psi}}(\boldsymbol{\omega}_j)) be its real and imaginary parts. Then, \widehat{\boldsymbol{\Psi}}= (\textbf{a}(\boldsymbol{\omega}_1),\textbf{b}(\boldsymbol{\omega}_1),\cdots,\textbf{a}(\boldsymbol{\omega}_t),\textbf{b}(\boldsymbol{\omega}_t)), for some t > 0, and the estimated kernel matrix is \widehat{\textbf{V}} = \widehat{\boldsymbol{\Psi}}\widehat{\boldsymbol{\Psi}}^T. Finally, the d leading eigenvectors of \widehat{\textbf{V}} are used as an estimate of a basis of the central subspace.
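The construction above can be sketched directly in R on simulated data. The code below is an illustration of the steps rather than the package's implementation; the object names (Z, Psi, V) mirror the notation, and a univariate response is used so that each \boldsymbol{\omega}_j is a scalar.

set.seed(2)
n <- 200; p <- 5; d <- 1; tt <- 20                            # tt plays the role of t
x <- matrix(rnorm(n * p), n, p)
y <- x[, 1] + 0.5 * rnorm(n)                                  # toy single-index model
eS <- eigen(cov(x))
Sx_inv_sqrt <- eS$vectors %*% diag(1 / sqrt(eS$values)) %*% t(eS$vectors)
Z <- scale(x, center = TRUE, scale = FALSE) %*% Sx_inv_sqrt   # rows are Z_k
omega <- rnorm(tt)                                            # omega_1, ..., omega_t
Psi <- matrix(0, p, 2 * tt)
for (j in 1:tt) {
  psi_j <- colMeans(exp(1i * omega[j] * y) * Z)               # psi_hat(omega_j)
  Psi[, 2 * j - 1] <- Re(psi_j)                               # a(omega_j)
  Psi[, 2 * j]     <- Im(psi_j)                               # b(omega_j)
}
V <- tcrossprod(Psi)                                          # V_hat = Psi Psi^T
eta_hat <- eigen(V)$vectors[, 1:d, drop = FALSE]              # d leading eigenvectors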

Remark: In the “itdr()” function, the vectors \boldsymbol{\omega}_1,\cdots,\boldsymbol{\omega}_t are generated internally; their number t is specified by the argument “m”.
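Assuming, as above, that y and xt are available, the inverse Fourier transform approach could then be requested with, for instance:

fit.inv <- itdr(y, xt, d = 2, m = 50, method = "invFM")
fit.inv$eigenvalues
round(fit.inv$eta_hat, 2)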

Value

The output is a list whose main components are a p-by-d matrix and a p-by-p matrix; for the “invFM” method, the eigenvalues and the matrix \widehat{\boldsymbol{\Psi}} are also returned. The components are defined as follows.

eta_hat

The estimated p-by-d matrix, whose columns form a basis of the CMS/CS.

M

The estimated p-by-p candidate matrix.

eigenvalues

Eigenvalues of \widehat{\textbf{V}} from the “invFM” method.

psi

The estimated matrix \widehat{\boldsymbol{\Psi}} from the “invFM” method.

References

Cook, R. D. and Li, B. (2002). Dimension Reduction for Conditional Mean in Regression. The Annals of Statistics, 30, 455-474.

Weng, J. and Yin, X. (2018). Fourier Transform Approach for Inverse Dimension Reduction Method. Journal of Nonparametric Statistics, 30(4).

Zeng, P. and Zhu, Y. (2010). An Integral Transform Method for Estimating the Central Mean and Central Subspaces. Journal of Multivariate Analysis, 101(1), 271-290.

Zhu, Y. and Zeng, P. (2006). Fourier Methods for Estimating the Central Subspace and Central Mean Subspace in Regression. Journal of the American Statistical Association, 101(476), 1638-1651.

Examples

data(automobile)
head(automobile)
# Keep the response (column 26) and 13 predictor columns, dropping rows with
# missing values.
df <- automobile[, c(26, 10, 11, 12, 13, 14, 17, 19, 20, 21, 22, 23, 24, 25)]
dff <- as.matrix(df)
automobi <- dff[complete.cases(dff), ]
y <- automobi[, 1]
x <- automobi[, 2:14]
xt <- scale(x)
# Tuning parameters and the dimension of the central subspace.
wx <- 0.14
wy <- 0.9
wh <- 1.5
d <- 2
fit.F_CMS <- itdr(y, xt, d, wx = wx, wy = wy, wh = wh,
                  space = "pdf", xdensity = "normal", method = "FM")
round(fit.F_CMS$eta_hat, 2)
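## Illustrative extension (assuming the objects created above): estimate the
## central mean subspace for the same data; 'fit.F_CM' is a hypothetical name.
fit.F_CM <- itdr(y, xt, d, wx = wx, space = "mean", xdensity = "normal", method = "FM")
round(fit.F_CM$eta_hat, 2)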


[Package itdr version 2.0.1]