tobit1 {cenGAM}    R Documentation

Tobit I family for censored GAM


Description

This function implements the Tobit I family for use with the mgcv package.


Usage

tobit1(link = "identity", left.threshold = -Inf, right.threshold = Inf,
       theta = NULL, initial.theta = 0)



Arguments

link
The link function: one of "log", "identity", "inverse", "sqrt", or a power link.


left.threshold
Left threshold: response values below this value are censored. Can be a scalar or a vector of the same length as the response.


right.threshold
Right threshold: response values above this value are censored. Can be a scalar or a vector of the same length as the response.


theta
The log of the standard deviation sigma, so that sigma = exp(theta). If left NULL, it is estimated.


initial.theta
Optional initial value for theta when it is being estimated. Change this value if the variance is very different from 1.
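
Because theta is on the log scale (sigma = exp(theta)), one way to choose a starting value is to take the log of a rough estimate of the standard deviation. The sketch below is only a heuristic and is not prescribed by cenGAM: it uses the spread of the uncensored observations, which tends to understate sigma but is usually adequate as a starting point. The variable names (sigma_guess, theta_start) are illustrative.

```r
# Illustrative data: a response on a large scale, left-censored at 0.
set.seed(1)
z <- rnorm(200, mean = 50, sd = 40)   # latent (uncensored) values
y <- pmax(z, 0)                       # observed response, censored below at 0

# Heuristic (an assumption, not part of cenGAM): use the standard deviation
# of the uncensored observations as a rough guess for sigma.
uncensored  <- y > 0
sigma_guess <- sd(y[uncensored])

# theta is log(sigma), so the starting value goes on the log scale.
theta_start <- log(sigma_guess)

# The value could then be supplied as, for example:
# tobit1(left.threshold = 0, initial.theta = theta_start)
```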


Details

Under the Tobit I model, given a conditional mean mu, a standard deviation sigma, and left and right threshold values lt and rt, the response y is defined as

    y = lt   if z < lt

    y = rt   if z > rt

    y = z    otherwise

where z is distributed as Normal(mu, sigma). (Note that the model has been extended to include right as well as left censoring.)
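
The definition above can be illustrated by simulating from it in base R. This is a minimal sketch with a constant mean and arbitrary illustrative values for mu, sigma, lt and rt; it does not use cenGAM and only shows how the observed response relates to the latent variable z.

```r
set.seed(42)
n     <- 10000
mu    <- 1      # conditional mean (held constant here for simplicity)
sigma <- 2      # standard deviation of the latent normal variable
lt    <- 0      # left threshold
rt    <- 4      # right threshold

z <- rnorm(n, mean = mu, sd = sigma)  # latent variable
y <- pmin(pmax(z, lt), rt)            # observed response, clamped to [lt, rt]

# Observed share of responses sitting at each threshold ...
p_left_obs  <- mean(y == lt)
p_right_obs <- mean(y == rt)

# ... compared with the probabilities implied by the normal distribution.
p_left_th   <- pnorm(lt, mean = mu, sd = sigma)
p_right_th  <- pnorm(rt, mean = mu, sd = sigma, lower.tail = FALSE)
```

The point masses at lt and rt are what a naive fit ignores and what the Tobit I likelihood accounts for.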

This function allows a non-linear relationship between mu and the covariates to be estimated by restricted maximum likelihood, using the methods of Wood et al. (2016). Different levels of censoring across the dataset are supported by allowing the left and right thresholds to differ between data points.
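
To make the role of per-observation thresholds concrete, the following base R sketch writes out a Tobit I log-likelihood in which lt and rt may be vectors. It is not the cenGAM implementation: the function name tobit1_loglik, the convention that y <= lt marks a left-censored value, and the constant-mean fit with optim() are all assumptions made for illustration. The package itself estimates smooth terms through mgcv.

```r
# Tobit I log-likelihood with scalar or per-observation thresholds.
# y: observed response; mu: mean; sigma: standard deviation;
# lt, rt: left/right thresholds (scalars or vectors as long as y).
tobit1_loglik <- function(y, mu, sigma, lt = -Inf, rt = Inf) {
  left  <- y <= lt                     # treated as left-censored
  right <- y >= rt                     # treated as right-censored
  ll_obs   <- dnorm(y, mean = mu, sd = sigma, log = TRUE)
  ll_left  <- pnorm(lt, mean = mu, sd = sigma, log.p = TRUE)
  ll_right <- pnorm(rt, mean = mu, sd = sigma, lower.tail = FALSE, log.p = TRUE)
  sum(ifelse(left, ll_left, ifelse(right, ll_right, ll_obs)))
}

# Simulated data with two different left thresholds (detection limits).
set.seed(7)
n  <- 500
lt <- ifelse(seq_len(n) <= n / 2, 0, 0.5)
z  <- rnorm(n, mean = 1, sd = 2)
y  <- pmax(z, lt)

# Maximise over (mu, theta) with theta = log(sigma), constant-mean model.
fit <- optim(c(0, 0), function(p) -tobit1_loglik(y, p[1], exp(p[2]), lt = lt))
mu_hat    <- fit$par[1]
sigma_hat <- exp(fit$par[2])
```

Parameterising the optimisation in theta = log(sigma) keeps sigma positive without constraints, which mirrors the way theta is defined for this family.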

See the examples for details of how to fit the model in practice.


Value

An object inheriting from class family for use with the mgcv package.


References

Wood, S.N., N. Pya and B. Saefken (2016), Smoothing parameter and model selection for general smooth models. Journal of the American Statistical Association.

See Also



Examples

library(mgcv)    # provides gam()
library(cenGAM)  # provides tobit1()

# Generate random data
x <- matrix(2*rnorm(300), 100)
yn <- 2*x[,3] + 4*cos(x[,1]*2)
y <- yn + rnorm(100)
ycensored <- pmax(y, 0) # data left-censored at 0
ycensored <- pmin(ycensored, 4) # data right-censored at 4

par(mfrow = c(3,3))

# Model fitted to the uncensored data, for reference
plot(gam(y ~ s(x[,1]) + s(x[,2]) + s(x[,3])), ylim = c(-5, 5), main = "True")

# Naive estimation, ignoring the censoring
plot(gam(ycensored ~ s(x[,1]) + s(x[,2]) + s(x[,3])), ylim = c(-5, 5), main = "Naive")

# Tobit I estimation, supplying both thresholds used to censor the data
m <- gam(ycensored ~ s(x[,1]) + s(x[,2]) + s(x[,3]),
         family = tobit1(left.threshold = 0, right.threshold = 4))
summary(m) # note x[,2] is not significant
m$family$getTheta(FALSE) # estimate of theta
m$family$getTheta(TRUE) # estimate of sigma = exp(theta)
plot(m, ylim = c(-5, 5), main = "Tobit I")

# A more realistic dataset requires some data processing.
# In this dataset, all values > 10 are right-censored. At location D values are
# left-censored at 0; at all other locations they are left-censored at 1.
# as.character() guards against y being stored as a factor. Censored entries such
# as "<1" become NA at this step (hence suppressWarnings) and are recoded below.
nut <- data.frame(day = nutrient$day, location = factor(nutrient$location),
                  y = suppressWarnings(as.numeric(as.character(nutrient$y))))
nut$upper <- 10
nut$lower <- ifelse(nut$location == "D", 0, 1)
# Recode censored observations as -Inf (left-censored) or Inf (right-censored)
nut$y[nutrient$y %in% c("<1", "<0")] <- -Inf
nut$y[nutrient$y %in% c(">10")] <- Inf
# Remove missing values now, as they can cause confusion later
nut <- na.omit(nut)

# Fit including a random effect for location
m <- gam(y ~ s(day) + s(location, bs = "re"), data = nut,
         family = tobit1(left.threshold = nut$lower, right.threshold = nut$upper))


[Package cenGAM version 0.5.3 Index]