R: Tobit II family for censored GAM

tobit2 {cenGAM}

R Documentation

Tobit II family for censored GAM

Description

This function implements the Tobit II family for the mgcv package.

Usage

tobit2(link = list("identity", "identity", "log", "logit2"),
       censoring = FALSE, rho = NULL, eps = 1e-3)

Arguments

`link`	A list of link functions for `mu1`, `mu2`, `sigma` and `rho`, respectively.
`censoring`	Logical vector denoting censoring. TRUE marks a censored observation.
`rho`	Value of rho. If NULL, rho is estimated.
`eps`	Small value used to perturb rho during estimation when it is very close to -1 or 1.

Details

Under the Tobit II model, given a scale sigma, a conditional mean mu1 and a censoring parameter mu2, the response is censored if mu2 + epsilon2 < 0, and equals mu1 + sigma*epsilon1 otherwise.

Here epsilon1 and epsilon2 are each distributed Normal(0, 1), with correlation rho.
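In symbols, the model described above can be written as follows. The latent variables y* and d* and the observation index i are notation introduced here for illustration only; they do not appear in the package.

```latex
\begin{aligned}
y_i^{*} &= \mu_{1i} + \sigma_i\,\varepsilon_{1i},
\qquad
d_i^{*} = \mu_{2i} + \varepsilon_{2i},\\
\begin{pmatrix}\varepsilon_{1i}\\ \varepsilon_{2i}\end{pmatrix}
&\sim N\!\left(
\begin{pmatrix}0\\ 0\end{pmatrix},
\begin{pmatrix}1 & \rho_i\\ \rho_i & 1\end{pmatrix}
\right),\\
y_i &= y_i^{*} \ \text{is observed if } d_i^{*} \ge 0,
\quad \text{and the response is censored if } d_i^{*} < 0.
\end{aligned}
```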

This function allows a non-linear relationship between the covariates and each of mu1, mu2, sigma and rho to be estimated by restricted maximum likelihood, using the methods of Wood et al. (2016). Note that this allows for heteroskedastic errors.
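As an illustration of what heteroskedastic errors mean in this setting, the following base R sketch simulates data in which mu1, mu2 and sigma all vary with a covariate. The functional forms and parameter values are assumptions chosen for illustration; only the log link for sigma and the censoring rule come from this page.

```r
# Simulated Tobit II data with covariate-dependent mu1, mu2 and sigma.
# All functional forms below are illustrative assumptions.
set.seed(1)
n   <- 500
x   <- runif(n, -2, 2)
rho <- 0.3

mu1   <- sin(2 * x)            # non-linear conditional mean (identity link)
mu2   <- 0.5 + x               # censoring parameter (identity link)
sigma <- exp(-0.5 + 0.4 * x)   # log link: sigma is positive and varies with x

# Standard normal errors with correlation rho
epsilon2 <- rnorm(n)
epsilon1 <- rho * epsilon2 + sqrt(1 - rho^2) * rnorm(n)

# Censor when mu2 + epsilon2 < 0; censored responses are stored as 0
censored  <- (mu2 + epsilon2) < 0
ycensored <- ifelse(censored, 0, mu1 + sigma * epsilon1)
```

Here the spread of the uncensored responses grows with x, which is the kind of pattern a covariate-dependent sigma is meant to capture.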

Estimation of rho depends on the behaviour of the distribution near the censoring boundary, and is therefore usually very inaccurate at typical sample sizes. In practice it is often better to supply a value of rho instead (for example 0, which implies independent censoring). eps is used when estimating rho to avoid errors when rho is close to 1 or -1. Smaller values may produce more accurate results.
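To explore how much information a sample carries about rho, the log-likelihood implied by the model above can be written out and evaluated over a grid of rho values. The following is a minimal base R sketch: `tobit2_loglik` is a hypothetical helper written for illustration, derived from the model as stated on this page, and is not necessarily how cenGAM evaluates the likelihood. The simulated parameter values are also assumptions.

```r
# Log-likelihood of the Tobit II model as described on this page.
# `tobit2_loglik` is a hypothetical helper, not part of cenGAM.
tobit2_loglik <- function(y, censored, mu1, mu2, sigma, rho) {
  z <- (y - mu1) / sigma   # standardised residual of the response
  # Censored observation: P(mu2 + epsilon2 < 0) = pnorm(-mu2)
  ll_cens <- pnorm(-mu2, log.p = TRUE)
  # Observed: density of y times P(mu2 + epsilon2 >= 0 | epsilon1 = z),
  # where epsilon2 | epsilon1 = z is Normal(rho * z, 1 - rho^2)
  ll_obs <- dnorm(z, log = TRUE) - log(sigma) +
    pnorm((mu2 + rho * z) / sqrt(1 - rho^2), log.p = TRUE)
  sum(ifelse(censored, ll_cens, ll_obs))
}

# Simulate from the model with known (assumed) parameter values
set.seed(1)
n <- 2000
rho_true <- 0.5
mu1 <- 1; mu2 <- 0.5; sigma <- 2
epsilon2 <- rnorm(n)
epsilon1 <- rho_true * epsilon2 + sqrt(1 - rho_true^2) * rnorm(n)
censored <- (mu2 + epsilon2) < 0
y <- ifelse(censored, 0, mu1 + sigma * epsilon1)

# Profile the log-likelihood over rho, holding the other parameters fixed
rho_grid <- seq(-0.9, 0.9, by = 0.1)
profile  <- sapply(rho_grid, function(r)
  tobit2_loglik(y, censored, mu1, mu2, sigma, r))
```

Plotting `profile` against `rho_grid` shows how strongly the sample discriminates between values of rho. The `sqrt(1 - rho^2)` divisor also indicates why values of rho at or very near 1 or -1 are numerically awkward, which is the situation eps is intended to guard against.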

This method is still very *experimental* and is not recommended for important applications. Errors can occur if the default starting values cause problems; in that case, consider changing the start argument to gam.

Value

An object inheriting from class family for use with the mgcv package.

References

Wood, S.N., N. Pya and B. Saefken (2016), Smoothing parameter and model selection for general smooth models. Journal of the American Statistical Association. <URL: http://arxiv.org/abs/1511.03864>

Examples

library(mgcv)   # provides gam()
library(cenGAM) # provides tobit2()

# Generate a small example
set.seed(1)
x <- matrix(2*rnorm(400), 200)
yn <-  x[,1]^2 + x[,2]
y <- yn + rnorm(200)
censored <- (rnorm(200) + 2*x[,2]+1) < 0 #censored according to x[,2]
ycensored <- replace(y, censored, 0) 
m <- gam(c(ycensored ~ s(x[,1]) + s(x[,2]) , ~x[,1]+x[,2], ~1,~1), 
family = tobit2(censoring = censored)) #estimated rho
par(mfrow = c(3,2))
plot(gam(y ~ s(x[,1]) + s(x[,2]) ), ylim=c(-5, 5), main = "True")
plot(m, ylim = c(-5, 5), main = "Tobit II estimated rho") 

summary(m) 
m$fitted.values # fitted mu1, mu2, sigma and rho for each observation

m2 <- gam(c(ycensored ~ s(x[,1]) + s(x[,2]) , ~x[,1]+x[,2], ~1), 
family = tobit2(censoring = censored, rho=0)) #non estimated rho
plot(m2, ylim = c(-5, 5), main = "Tobit II fixed rho") 

## Not run: 
#Larger example
set.seed(1)
x <- matrix(2*rnorm(1500), 500)
yn <- 2*x[,3] + 4*cos(x[,1]*2)
y <- yn + 3*rnorm(500)
censored <- (rnorm(500) + 2*x[,2]) < 0 #censored according to x[,2]
ycensored <- replace(y, censored, 0)  

par(mfrow = c(3,3))

# True model
plot(gam(y ~ s(x[,1]) + s(x[,2]) + s(x[, 3])), ylim=c(-5, 5), main = "True")

# Naive estimation
plot(gam(ycensored ~ s(x[,1]) + s(x[,2]) + s(x[, 3])), ylim=c(-5, 5), main = "Naive")

# Tobit II estimation
m <- gam(c(ycensored ~ s(x[,1]) + s(x[,2]) + s(x[, 3]), ~x[,1]+x[,2]+x[,3], ~1,~1), 
family = tobit2(censoring = censored))
plot(m, ylim = c(-5, 5), main = "Tobit II") 

#fitting with non-estimated rho
m2 <- gam(c(ycensored ~ s(x[,1]) + s(x[,2]) + s(x[, 3]), ~x[,1]+x[,2]+x[,3],~1), 
family = tobit2(censoring = censored, rho=0))

## End(Not run)

[Package cenGAM version 0.5.3 Index]