tobit {VGAM} | R Documentation |
Tobit Regression
Description
Fits a Tobit regression model.
Usage
tobit(Lower = 0, Upper = Inf, lmu = "identitylink",
lsd = "loglink", imu = NULL, isd = NULL,
type.fitted = c("uncensored", "censored", "mean.obs"),
byrow.arg = FALSE, imethod = 1, zero = "sd")
Arguments
Lower |
Numeric. It is the value |
Upper |
Numeric. It is the value |
lmu , lsd |
Parameter link functions for the mean and
standard deviation parameters.
See |
imu , isd , byrow.arg |
See |
type.fitted |
Type of fitted value returned.
The first choice is default and is the ordinary uncensored or
unbounded linear model.
If |
imethod |
Initialization method. Either 1 or 2 or 3, this specifies
some methods for obtaining initial values for the parameters.
See |
zero |
A vector, e.g., containing the value 1 or 2. If so,
the mean or standard deviation respectively are modelled
as an intercept-only.
Setting |
Details
The Tobit model can be written
y_i^* = x_i^T \beta + \varepsilon_i
where the e_i \sim N(0,\sigma^2)
independently and i=1,\ldots,n
.
However, we measure y_i = y_i^*
only if
y_i^* > L
and
y_i^* < U
for some
cutpoints L
and U
.
Otherwise we let y_i=L
or
y_i=U
, whatever is closer.
The Tobit model is thus a multiple linear regression
but with censored
responses if it is below or above certain cutpoints.
The defaults for Lower
and Upper
and
lmu
correspond to the standard Tobit model.
Fisher scoring is used for the standard and nonstandard
models.
By default, the mean x_i^T \beta
is
the first linear/additive predictor, and the log of
the standard deviation is the second linear/additive
predictor. The Fisher information matrix for uncensored
data is diagonal. The fitted values are the estimates
of x_i^T \beta
.
Value
An object of class "vglmff"
(see vglmff-class
).
The object is used by modelling functions
such as vglm
,
and vgam
.
Warning
If values of the response and Lower
and/or Upper
are not integers then there is the danger that the value is
wrongly interpreted as uncensored.
For example, if the first 10 values of the response were
runif(10)
and Lower
was assigned these value then
testing y[1:10] == Lower[1:10]
is numerically fraught.
Currently, if any y < Lower
or y > Upper
then
a warning is issued.
The function round2
may be useful.
Note
The response can be a matrix.
If so, then Lower
and Upper
are recycled into a matrix with the number of columns equal
to the number of responses,
and the recycling is done row-wise if
byrow.arg = TRUE
.
The default order is as matrix
, which
is byrow.arg = FALSE
.
For example, these are returned in fit4@misc$Lower
and
fit4@misc$Upper
below.
If there is no censoring then
uninormal
is recommended instead.
Any value of the
response less than Lower
or greater
than Upper
will
be assigned the value Lower
and Upper
respectively,
and a warning will be issued.
The fitted object has components censoredL
and censoredU
in the extra
slot which specifies whether
observations
are censored in that direction.
The function cens.normal
is an alternative
to tobit()
.
When obtaining initial values, if the algorithm would otherwise want to fit an underdetermined system of equations, then it uses the entire data set instead. This might result in rather poor quality initial values, and consequently, monitoring convergence is advised.
Author(s)
Thomas W. Yee
References
Tobin, J. (1958). Estimation of relationships for limited dependent variables. Econometrica 26, 24–36.
See Also
rtobit
,
cens.normal
,
uninormal
,
double.cens.normal
,
posnormal
,
CommonVGAMffArguments
,
round2
,
mills.ratio
,
margeff
,
rnorm
.
Examples
# Here, fit1 is a standard Tobit model and fit2 is nonstandard
tdata <- data.frame(x2 = seq(-1, 1, length = (nn <- 100)))
set.seed(1)
Lower <- 1; Upper <- 4 # For the nonstandard Tobit model
tdata <- transform(tdata,
Lower.vec = rnorm(nn, Lower, 0.5),
Upper.vec = rnorm(nn, Upper, 0.5))
meanfun1 <- function(x) 0 + 2*x
meanfun2 <- function(x) 2 + 2*x
meanfun3 <- function(x) 3 + 2*x
tdata <- transform(tdata,
y1 = rtobit(nn, mean = meanfun1(x2)), # Standard Tobit model
y2 = rtobit(nn, mean = meanfun2(x2), Lower = Lower, Upper = Upper),
y3 = rtobit(nn, mean = meanfun3(x2), Lower = Lower.vec,
Upper = Upper.vec),
y4 = rtobit(nn, mean = meanfun3(x2), Lower = Lower.vec,
Upper = Upper.vec))
with(tdata, table(y1 == 0)) # How many censored values?
with(tdata, table(y2 == Lower | y2 == Upper)) # Ditto
with(tdata, table(attr(y2, "cenL")))
with(tdata, table(attr(y2, "cenU")))
fit1 <- vglm(y1 ~ x2, tobit, data = tdata, trace = TRUE)
coef(fit1, matrix = TRUE)
summary(fit1)
fit2 <- vglm(y2 ~ x2,
tobit(Lower = Lower, Upper = Upper, type.f = "cens"),
data = tdata, trace = TRUE)
table(fit2@extra$censoredL)
table(fit2@extra$censoredU)
coef(fit2, matrix = TRUE)
fit3 <- vglm(y3 ~ x2, tobit(Lower = with(tdata, Lower.vec),
Upper = with(tdata, Upper.vec),
type.f = "cens"),
data = tdata, trace = TRUE)
table(fit3@extra$censoredL)
table(fit3@extra$censoredU)
coef(fit3, matrix = TRUE)
# fit4 is fit3 but with type.fitted = "uncen".
fit4 <- vglm(cbind(y3, y4) ~ x2,
tobit(Lower = rep(with(tdata, Lower.vec), each = 2),
Upper = rep(with(tdata, Upper.vec), each = 2),
byrow.arg = TRUE),
data = tdata, crit = "coeff", trace = TRUE)
head(fit4@extra$censoredL) # A matrix
head(fit4@extra$censoredU) # A matrix
head(fit4@misc$Lower) # A matrix
head(fit4@misc$Upper) # A matrix
coef(fit4, matrix = TRUE)
## Not run: # Plot fit1--fit4
par(mfrow = c(2, 2))
plot(y1 ~ x2, tdata, las = 1, main = "Standard Tobit model",
col = as.numeric(attr(y1, "cenL")) + 3,
pch = as.numeric(attr(y1, "cenL")) + 1)
legend(x = "topleft", leg = c("censored", "uncensored"),
pch = c(2, 1), col = c("blue", "green"))
legend(-1.0, 2.5, c("Truth", "Estimate", "Naive"), lwd = 2,
col = c("purple", "orange", "black"), lty = c(1, 2, 2))
lines(meanfun1(x2) ~ x2, tdata, col = "purple", lwd = 2)
lines(fitted(fit1) ~ x2, tdata, col = "orange", lwd = 2, lty = 2)
lines(fitted(lm(y1 ~ x2, tdata)) ~ x2, tdata, col = "black",
lty = 2, lwd = 2) # This is simplest but wrong!
plot(y2 ~ x2, data = tdata, las = 1, main = "Tobit model",
col = as.numeric(attr(y2, "cenL")) + 3 +
as.numeric(attr(y2, "cenU")),
pch = as.numeric(attr(y2, "cenL")) + 1 +
as.numeric(attr(y2, "cenU")))
legend(x = "topleft", leg = c("censored", "uncensored"),
pch = c(2, 1), col = c("blue", "green"))
legend(-1.0, 3.5, c("Truth", "Estimate", "Naive"), lwd = 2,
col = c("purple", "orange", "black"), lty = c(1, 2, 2))
lines(meanfun2(x2) ~ x2, tdata, col = "purple", lwd = 2)
lines(fitted(fit2) ~ x2, tdata, col = "orange", lwd = 2, lty = 2)
lines(fitted(lm(y2 ~ x2, tdata)) ~ x2, tdata, col = "black",
lty = 2, lwd = 2) # This is simplest but wrong!
plot(y3 ~ x2, data = tdata, las = 1,
main = "Tobit model with nonconstant censor levels",
col = as.numeric(attr(y3, "cenL")) + 2 +
as.numeric(attr(y3, "cenU") * 2),
pch = as.numeric(attr(y3, "cenL")) + 1 +
as.numeric(attr(y3, "cenU") * 2))
legend(x = "topleft", pch = c(2, 3, 1), col = c(3, 4, 2),
leg = c("censoredL", "censoredU", "uncensored"))
legend(-1.0, 3.5, c("Truth", "Estimate", "Naive"), lwd = 2,
col = c("purple", "orange", "black"), lty = c(1, 2, 2))
lines(meanfun3(x2) ~ x2, tdata, col = "purple", lwd = 2)
lines(fitted(fit3) ~ x2, tdata, col = "orange", lwd = 2, lty = 2)
lines(fitted(lm(y3 ~ x2, tdata)) ~ x2, tdata, col = "black",
lty = 2, lwd = 2) # This is simplest but wrong!
plot(y3 ~ x2, data = tdata, las = 1,
main = "Tobit model with nonconstant censor levels",
col = as.numeric(attr(y3, "cenL")) + 2 +
as.numeric(attr(y3, "cenU") * 2),
pch = as.numeric(attr(y3, "cenL")) + 1 +
as.numeric(attr(y3, "cenU") * 2))
legend(x = "topleft", pch = c(2, 3, 1), col = c(3, 4, 2),
leg = c("censoredL", "censoredU", "uncensored"))
legend(-1.0, 3.5, c("Truth", "Estimate", "Naive"), lwd = 2,
col = c("purple", "orange", "black"), lty = c(1, 2, 2))
lines(meanfun3(x2) ~ x2, data = tdata, col = "purple", lwd = 2)
lines(fitted(fit4)[, 1] ~ x2, tdata, col="orange", lwd = 2, lty = 2)
lines(fitted(lm(y3 ~ x2, tdata)) ~ x2, data = tdata, col = "black",
lty = 2, lwd = 2) # This is simplest but wrong!
## End(Not run)