tobit2 {cenGAM} | R Documentation |
Tobit II family for censored GAM
Description
This function implements the Tobit II family for the mgcv package.
Usage
tobit2(link=list("identity","identity","log","logit2" ),
censoring = FALSE, rho=NULL, eps = 1e-3)
Arguments
link |
The link functions, one for each of mu1, mu2, sigma and rho, in that order. |
censoring |
Logical vector flagging censored observations. TRUE marks a censored value. |
rho |
Value of rho. If NULL, rho is estimated. |
eps |
Parameter to perturb rho in estimation if very close to -1 or 1. |
Details
Under the Tobit II model, given a scale sigma, a conditional mean mu1 and a censoring parameter mu2, the response is censored if mu2 + epsilon2 < 0 and is observed as mu1 + sigma*epsilon1 otherwise.
Here epsilon1 and epsilon2 are each distributed Normal(0, 1), with correlation rho.
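The censoring mechanism described above can be simulated in base R. The following is a minimal sketch; the parameter values are assumptions chosen only for illustration, and storing censored responses as 0 follows the convention used in the Examples.

```r
# Minimal base-R simulation of the Tobit II mechanism (illustrative values only).
set.seed(42)
n     <- 10000
mu1   <- 1.0   # conditional mean of the response (assumed value)
mu2   <- 0.5   # censoring parameter (assumed value)
sigma <- 2.0   # scale of the response error (assumed value)
rho   <- 0.6   # correlation between epsilon1 and epsilon2 (assumed value)

# Two standard normal errors with correlation rho.
epsilon1 <- rnorm(n)
epsilon2 <- rho * epsilon1 + sqrt(1 - rho^2) * rnorm(n)

censored <- (mu2 + epsilon2) < 0          # censoring rule from the text
latent   <- mu1 + sigma * epsilon1        # value seen when not censored
y        <- replace(latent, censored, 0)  # censored responses stored as 0

mean(censored)           # should be close to pnorm(-mu2)
cor(epsilon1, epsilon2)  # should be close to rho
```

Because epsilon1 and epsilon2 are correlated, the uncensored responses are not a random sample of the latent values, which is why a naive fit to the observed data can be biased.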
This function allows a non-linear relationship between mu1, mu2, sigma, rho and the covariates to be estimated by restricted maximum likelihood, using the general smooth model framework of Wood et al. (2016). Note that this allows for heteroskedastic errors.
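For readers who want to see the quantity being maximised, the sketch below writes out the per-observation log-likelihood implied by the model definition (the standard sample-selection form). The helper tobit2_loglik is hypothetical and is not part of cenGAM or mgcv; the package's internal implementation may differ.

```r
# Hypothetical helper: per-observation Tobit II log-likelihood.
tobit2_loglik <- function(y, censored, mu1, mu2, sigma, rho) {
  censored <- rep_len(censored, length(y))  # recycle a scalar flag
  z <- (y - mu1) / sigma                    # standardised residual
  # Censored: P(mu2 + epsilon2 < 0) = pnorm(-mu2)
  ll_cens <- pnorm(-mu2, log.p = TRUE)
  # Observed: density of y times P(epsilon2 > -mu2 | epsilon1 = z)
  ll_obs <- dnorm(z, log = TRUE) - log(sigma) +
    pnorm((mu2 + rho * z) / sqrt(1 - rho^2), log.p = TRUE)
  ifelse(censored, ll_cens, ll_obs)
}

# Sanity check with assumed parameter values: the censoring probability
# plus the integrated density of observed responses should be close to 1.
mu1 <- 1; mu2 <- 0.5; sigma <- 2; rho <- 0.6
p_cens <- exp(tobit2_loglik(0, TRUE, mu1, mu2, sigma, rho))
p_obs  <- integrate(
  function(y) exp(tobit2_loglik(y, FALSE, mu1, mu2, sigma, rho)),
  lower = -Inf, upper = Inf
)$value
p_cens + p_obs
```

With rho = 0 the observed-data term factorises into a normal density for the response and an independent probit term for censoring.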
Estimation of rho depends on the behaviour of the distribution near the censoring boundary and is therefore typically very inaccurate at typical sample sizes. In practice it is often better to supply a value of rho instead (for example, 0 to imply independent censoring). eps is used when estimating rho to avoid errors when rho is close to 1 or -1; smaller values may produce more accurate results.
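One way to see why values of rho near the boundary are troublesome is that terms of the form 1/sqrt(1 - rho^2) appear in the likelihood of a bivariate normal selection model and become infinite at rho = 1 or -1. The sketch below keeps rho a distance eps inside the interval. The helper clamp_rho is hypothetical and only illustrates the idea; how cenGAM actually applies eps internally is not documented here and may differ.

```r
# Hypothetical helper: keep rho at least eps away from -1 and 1.
clamp_rho <- function(rho, eps = 1e-3) {
  pmin(pmax(rho, -1 + eps), 1 - eps)
}

rho <- c(-1, -0.5, 0, 0.9999, 1)

1 / sqrt(1 - rho^2)             # infinite at the boundaries
1 / sqrt(1 - clamp_rho(rho)^2)  # finite for every element
```

A smaller eps leaves more of the interval untouched, at the cost of allowing larger values of the scaling term.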
This method is still very *experimental* and is not recommended for important applications. Errors can occur if the default starting values cause problems; in that case, consider changing the start argument to gam.
Value
An object inheriting from class family for use with the mgcv package.
References
Wood, S.N., N. Pya and B. Saefken (2016), Smoothing parameter and model selection for general smooth models. Journal of the American Statistical Association. <URL: http://arxiv.org/abs/1511.03864>
Examples
library(mgcv)    # provides gam()
library(cenGAM)  # provides tobit2()
# Generate a small example
set.seed(1)
x <- matrix(2*rnorm(400), 200)
yn <- x[,1]^2 + x[,2]
y <- yn + rnorm(200)
censored <- (rnorm(200) + 2*x[,2]+1) < 0 #censored according to x[,2]
ycensored <- replace(y, censored, 0)
m <- gam(c(ycensored ~ s(x[,1]) + s(x[,2]) , ~x[,1]+x[,2], ~1,~1),
family = tobit2(censoring = censored)) #estimated rho
par(mfrow = c(3,2))
plot(gam(y ~ s(x[,1]) + s(x[,2]) ), ylim=c(-5, 5), main = "True")
plot(m, ylim = c(-5, 5), main = "Tobit II estimated rho")
summary(m)
m$fitted.values # fitted mu1, mu2, sigma, rho for each observation
m2 <- gam(c(ycensored ~ s(x[,1]) + s(x[,2]) , ~x[,1]+x[,2], ~1),
family = tobit2(censoring = censored, rho=0)) #non estimated rho
plot(m2, ylim = c(-5, 5), main = "Tobit II fixed rho")
## Not run:
#Larger example
set.seed(1)
x <- matrix(2*rnorm(1500), 500)
yn <- 2*x[,3] + 4*cos(x[,1]*2)
y <- yn + 3*rnorm(500)
censored <- (rnorm(500) + 2*x[,2]) < 0 #censored according to x[,2]
ycensored <- replace(y, censored, 0)
par(mfrow = c(3,3))
# True model
plot(gam(y ~ s(x[,1]) + s(x[,2]) + s(x[, 3])), ylim=c(-5, 5), main = "True")
# Naive estimation
plot(gam(ycensored ~ s(x[,1]) + s(x[,2]) + s(x[, 3])), ylim=c(-5, 5), main = "Naive")
# Tobit II estimation
m <- gam(c(ycensored ~ s(x[,1]) + s(x[,2]) + s(x[, 3]), ~x[,1]+x[,2]+x[,3], ~1,~1),
family = tobit2(censoring = censored))
plot(m, ylim = c(-5, 5), main = "Tobit II")
#fitting with non-estimated rho
m2 <- gam(c(ycensored ~ s(x[,1]) + s(x[,2]) + s(x[, 3]), ~x[,1]+x[,2]+x[,3],~1),
family = tobit2(censoring = censored, rho=0))
## End(Not run)