
plrm.est {PLRModels}

R Documentation

Semiparametric estimates for the unknown components of the regression function in PLR models

Description

This routine computes estimates for \beta and m(newt_j) (j=1,...,J) from a sample {(Y_i, X_{i1}, ..., X_{ip}, t_i): i=1,...,n}, where \beta = (\beta_1,...,\beta_p) is an unknown vector parameter, m(.) is a smooth but unknown function and

Y_i = X_{i1}*\beta_1 + ... + X_{ip}*\beta_p + m(t_i) + \epsilon_i.

The random errors, \epsilon_i, are allowed to be time series. Kernel smoothing, combined with ordinary least squares estimation, is used.
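
The combination of kernel smoothing and ordinary least squares can be illustrated with a minimal base-R sketch. It assumes a Speckman-type construction (smooth `Y` and `X` on `t`, regress the partial residuals by OLS, then smooth what remains), Nadaraya-Watson weights and the Epanechnikov kernel. The helpers `epan`, `nw_weights` and `plr_sketch` are hypothetical names introduced here; they are not part of PLRModels and are not claimed to reproduce the package's internal code.

```r
# Epanechnikov ("quadratic") kernel
epan <- function(u) 0.75 * (1 - u^2) * (abs(u) <= 1)

# Matrix of Nadaraya-Watson weights: row j holds the weights used to
# smooth at t_eval[j] with bandwidth bw (each row sums to 1)
nw_weights <- function(t_eval, t_obs, bw) {
  K <- epan(outer(t_eval, t_obs, "-") / bw)
  K / rowSums(K)
}

# Hypothetical helper (not a PLRModels function): Speckman-type estimates
plr_sketch <- function(y, X, t, b, h = b) {
  X <- as.matrix(X)
  Wb <- nw_weights(t, t, b)
  X_tilde <- X - Wb %*% X            # X with its smooth in t removed
  y_tilde <- y - drop(Wb %*% y)      # Y with its smooth in t removed
  # OLS of the partial residuals gives the estimate of beta
  beta <- drop(solve(crossprod(X_tilde), crossprod(X_tilde, y_tilde)))
  # Smoothing Y - X beta on t gives the estimate of m at the design points
  m_t <- drop(nw_weights(t, t, h) %*% (y - drop(X %*% beta)))
  list(beta = beta, m.t = m_t)
}

# Simulated data in the spirit of the Examples section
set.seed(1234)
n <- 200
t <- ((1:n) - 0.5) / n
X <- matrix(rnorm(2 * n), nrow = n)
beta_true <- c(0.05, 0.01)
y <- drop(X %*% beta_true) + 0.25 * t * (1 - t) + rnorm(n, 0, 0.01)

fit <- plr_sketch(y, X, t, b = 0.1)
fit$beta   # compare with beta_true
```

The bandwidth `b = 0.1` is an arbitrary illustrative choice; in practice `plrm.est` selects the bandwidth by cross-validation when none is supplied.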

Usage

plrm.est(data = data, b = NULL, h = NULL, newt = NULL, estimator = "NW", 
kernel = "quadratic")

Arguments

`data`	`data[, 1]` contains the values of the response variable, `Y`; `data[, 2:(p+1)]` contains the values of the "linear" explanatory variables, `X_1, ..., X_p`; `data[, p+2]` contains the values of the "nonparametric" explanatory variable, `t`.
`b`	bandwidth for estimating the parametric part of the model. If both `b` and `h` are `NULL` (the default), a common bandwidth `b=h` is selected by cross-validation; if `b` is `NULL` but `h` is not, `b=h` is used.
`h`	`(b,h)` is the pair of bandwidths for estimating the nonparametric part of the model. If both `b` and `h` are `NULL` (the default), a common bandwidth `b=h` is selected by cross-validation; if `b` is `NULL` but `h` is not, `b=h` is used; if `h` is `NULL` but `b` is not, `h=b` is used.
`newt`	values of the "nonparametric" explanatory variable at which the estimator of `m` is evaluated. If `NULL` (the default), `m` is evaluated at the observed values `data[, p+2]`.
`estimator`	smoother to use: `"NW"` (Nadaraya-Watson) or `"LLP"` (Local Linear Polynomial). The default is `"NW"`.
`kernel`	kernel function to use: `"gaussian"`, `"quadratic"` (Epanechnikov kernel), `"triweight"` or `"uniform"`. The default is `"quadratic"`.
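
The defaulting rules for `b` and `h` can be summarised in a few lines of R. The helper `resolve_bandwidths` below is hypothetical and only mirrors the rules stated in the argument descriptions; it is not a PLRModels function and performs no cross-validation itself.

```r
# Hypothetical helper mirroring the documented defaults for b and h
resolve_bandwidths <- function(b = NULL, h = NULL) {
  if (is.null(b) && is.null(h)) {
    # Both missing: a common bandwidth b = h is left to cross-validation
    return(list(b = NULL, h = NULL, select = "cross-validation with b = h"))
  }
  if (is.null(b)) b <- h   # only h supplied: b = h
  if (is.null(h)) h <- b   # only b supplied: h = b
  list(b = b, h = h, select = "user-supplied")
}

resolve_bandwidths(h = 0.2)            # b is set equal to h
resolve_bandwidths(b = 0.1)            # h is set equal to b
resolve_bandwidths(b = 0.1, h = 0.2)   # both kept as given
resolve_bandwidths()                   # both left for cross-validation
```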

Details

Expressions for the estimators of \beta and m are given on page 52 of Aneiros-Perez et al. (2004).
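
For orientation, estimators of this kind, in the tradition of Speckman (1988), are usually written as below. This is a sketch under assumed notation: the weights w_{n,b}, the smoother matrix W_b and the tilde quantities are introduced here for illustration, and the exact expressions should be checked against the cited page.

```latex
\begin{align*}
(W_b)_{ij} &= w_{n,b}(t_i, t_j), \qquad
\tilde{X}_b = (I - W_b) X, \qquad
\tilde{Y}_b = (I - W_b) Y, \\
\hat{\beta}_b &= \left( \tilde{X}_b^{T} \tilde{X}_b \right)^{-1} \tilde{X}_b^{T} \tilde{Y}_b, \\
\hat{m}_{b,h}(t) &= \sum_{i=1}^{n} w_{n,h}(t, t_i) \left( Y_i - X_i^{T} \hat{\beta}_b \right).
\end{align*}
```

Written this way, the estimate of \beta depends only on `b`, while the estimate of m depends on the pair `(b,h)`, which is consistent with how the two bandwidth arguments are described.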

Value

A list containing:

`beta`	a vector containing the estimate of `\beta`.
`m.t`	a vector containing the estimator of the non-parametric part, `m`, evaluated at the design points.
`m.newt`	a vector containing the estimator of the non-parametric part, `m`, evaluated at `newt`.
`residuals`	a vector containing the residuals: `Y - X*beta - m.t`.
`fitted.values`	the values obtained from the expression: `X*beta + m.t`
`b`	the considered bandwidth for estimating `\beta`.
`h`	`(b,h)` is the pair of bandwidths considered for estimating `m`.
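
The components of the returned list can be inspected as in the following sketch, which also uses the `newt` argument. It assumes the PLRModels package is installed (the code is skipped otherwise); the simulated data, the bandwidth value 0.1 and the evaluation grid are illustrative choices, not recommendations.

```r
if (requireNamespace("PLRModels", quietly = TRUE)) {
  library(PLRModels)

  # Simulated data laid out as the data argument expects: Y, X_1, X_2, t
  set.seed(1234)
  n <- 100
  t <- ((1:n) - 0.5) / n
  x <- matrix(rnorm(2 * n), nrow = n)
  y <- drop(x %*% c(0.05, 0.01)) + 0.25 * t * (1 - t) + rnorm(n, 0, 0.01)
  dat <- cbind(y, x, t)

  # Illustrative points at which to evaluate the estimate of m
  grid <- seq(0.1, 0.9, by = 0.1)
  fit <- plrm.est(data = dat, b = 0.1, h = 0.1, newt = grid)

  names(fit)                                # components of the returned list
  fit$beta                                  # estimate of beta
  cbind(newt = grid, m.newt = fit$m.newt)   # estimate of m on the grid

  # By the definitions above, residuals should equal Y minus fitted.values
  all.equal(as.numeric(fit$residuals),
            as.numeric(dat[, 1] - fit$fitted.values))
}
```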

Author(s)

German Aneiros Perez ganeiros@udc.es

Ana Lopez Cheda ana.lopez.cheda@udc.es

References

Aneiros-Perez, G., Gonzalez-Manteiga, W. and Vieu, P. (2004) Estimation and testing in a partial linear regression under long-memory dependence. Bernoulli 10, 49-78.

Hardle, W., Liang, H. and Gao, J. (2000) Partially Linear Models. Physica-Verlag.

Speckman, P. (1988) Kernel smoothing in partial linear models. J. R. Statist. Soc. B 50, 413-436.

Examples

# EXAMPLE 1: REAL DATA
data(barnacles1)
data <- as.matrix(barnacles1)
data <- diff(data, 12)
data <- cbind(data,1:nrow(data))

b.h <- plrm.gcv(data)$bh.opt
ajuste <- plrm.est(data=data, b=b.h[1], h=b.h[2])
ajuste$beta
plot(data[,4], ajuste$m.t, type="l", xlab="t", ylab="m(t)")

plot(data[,1], ajuste$fitted.values, xlab="y", ylab="y.hat", main="y.hat vs y")
abline(0,1)

# Mean squared residual relative to the variance of the response
mean(ajuste$residuals^2)/var(data[,1])



# EXAMPLE 2: SIMULATED DATA
## Example 2a: independent data

set.seed(1234)
# We generate the data
n <- 100
t <- ((1:n)-0.5)/n
beta <- c(0.05, 0.01)
m <- function(t) {0.25*t*(1-t)}
f <- m(t)

x <- matrix(rnorm(2*n,0,1), nrow=n)
sum <- x%*%beta
epsilon <- rnorm(n, 0, 0.01)
y <-  sum + f + epsilon
data_ind <- matrix(c(y,x,t),nrow=n)

# We estimate the components of the PLR model
# (CV bandwidth)
a <- plrm.est(data_ind)

a$beta

est <- a$m.t
plot(t, est, type="l", lty=2, ylab="")
points(t, 0.25*t*(1-t), type="l")
legend(x="topleft", legend = c("m", "m hat"), col=c("black", "black"), lty=c(1,2))


## Example 2b: dependent data
# We generate the data
x <- matrix(rnorm(2*n,0,1), nrow=n)
sum <- x%*%beta
epsilon <- arima.sim(list(order = c(1,0,0), ar=0.7), sd = 0.01, n = n)
y <-  sum + f + epsilon
data_dep <- matrix(c(y,x,t),nrow=n)

# We estimate the components of the PLR model
# (CV bandwidth)
h <- plrm.cv(data_dep, ln.0=2)$bh.opt[3,1]
a <- plrm.est(data_dep, h=h)

a$beta

est <- a$m.t
plot(t, est, type="l", lty=2, ylab="")
points(t, 0.25*t*(1-t), type="l")
legend(x="topleft", legend = c("m", "m hat"), col=c("black", "black"), lty=c(1,2))

[Package PLRModels version 1.4 Index]