ARCensReg {ARCensReg}  R Documentation

Censored linear regression model with autoregressive errors

Description

Fits a univariate left-, right-, or interval-censored linear regression model with autoregressive errors under the normal distribution, using the SAEM algorithm. It provides estimates and standard errors of the parameters and supports missing values in the dependent variable.

Usage

ARCensReg(cc, lcl = NULL, ucl = NULL, y, x, p = 1, M = 10, 
  perc = 0.25, MaxIter = 400, pc = 0.18, tol = 1e-04, 
  show_se = TRUE, quiet = FALSE)

Arguments

cc

Vector of censoring indicators of length n, where n is the total number of observations. For each observation: 0 if non-censored, 1 if censored/missing.

lcl, ucl

Vectors of length n that represent the lower and upper bounds of the interval that contains the true value of each censored observation. Default=NULL, indicating non-censored data. See Details for more information.

y

Vector of responses of length n.

x

Matrix of covariates of dimension n \times l, where l is the number of fixed effects including the intercept, if considered (in models which include an intercept, x should contain a column of ones).

p

Order of the autoregressive process. It must be a positive integer. Default=1.

M

Size of the Monte Carlo sample generated in each step of the SAEM algorithm. Default=10.

perc

Percentage of burn-in on the Monte Carlo sample, expressed as a proportion. Default=0.25.

MaxIter

The maximum number of iterations of the SAEM algorithm. Default=400.

pc

Percentage of initial iterations of the SAEM algorithm with no memory, expressed as a proportion. It is recommended that 50 < MaxIter*pc < 100. Default=0.18.

tol

Maximum error permitted for convergence. Default=1e-04.

show_se

TRUE or FALSE. Indicates if the standard errors should be estimated. Default=TRUE.

quiet

TRUE or FALSE. Indicates if printing information should be suppressed. Default=FALSE.
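The recommendation given for pc (50 < MaxIter*pc < 100) is easy to check before fitting. The following is a minimal sketch in base R; the variable names n_no_memory and in_range are illustrative only and are not part of the package. With the defaults shown in Usage, the product is 72.

```r
# Number of initial SAEM iterations run with no memory: MaxIter * pc.
MaxIter <- 400   # default from the Usage section
pc <- 0.18       # default from the Usage section

n_no_memory <- MaxIter * pc   # illustrative name, not a package object
n_no_memory

# Check the recommended range 50 < MaxIter * pc < 100.
in_range <- n_no_memory > 50 && n_no_memory < 100
in_range
```

Note that Example 1 below uses pc=.12 with the default MaxIter, which gives 48 and so falls slightly under the recommended range; that example is set up for quick convergence.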

Details

The linear regression model with autocorrelated errors, defined as a discrete-time autoregressive (AR) process of order p, at time t is given by

Y_t = x_t^T \beta + \xi_t,

\xi_t = \phi_1 \xi_{t-1} + ... + \phi_p \xi_{t-p} + \eta_t, t=1, ..., n,

where Y_t is the response variable, \beta = (\beta_1, ..., \beta_l)^T is a vector of regression parameters of dimension l, and x_t = (x_{t1}, ..., x_{tl})^T is a vector of non-stochastic regressor values; \xi_t is the AR error with Gaussian disturbance \eta_t (a white noise process with variance \sigma^2), \phi = (\phi_1, ..., \phi_p)^T is the vector of AR coefficients, and n is the sample size.
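To make the model concrete, the following sketch simulates a fully observed series from it using only base R. The parameter values are borrowed from Example 2; the use of stats::arima.sim is a choice made here for illustration and is not necessarily how the package's own generator, rARCens, works.

```r
set.seed(1)
n      <- 200
beta   <- c(2, 1)        # regression parameters (l = 2, with intercept)
phi    <- c(0.48, -0.2)  # AR coefficients (p = 2)
sigma2 <- 0.5            # variance of the Gaussian disturbance eta_t

# Design matrix with a column of ones for the intercept.
x <- cbind(1, runif(n))

# AR(2) errors: xi_t = phi_1 * xi_{t-1} + phi_2 * xi_{t-2} + eta_t.
xi <- as.numeric(arima.sim(model = list(ar = phi), n = n, sd = sqrt(sigma2)))

# Response: Y_t = x_t' beta + xi_t.
y <- as.numeric(x %*% beta + xi)
head(y)
```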

It is assumed that Y_t is not fully observed for all t. For left-censored observations, we have lcl=-Inf and ucl=V_t, such that the true value Y_t \leq V_t. For right censoring, lcl=V_t and ucl=Inf, such that Y_t \geq V_t. For interval censoring, lcl=V_{1t} and ucl=V_{2t} must be finite values, such that V_{1t} \leq Y_t \leq V_{2t}. Missing data can be defined by setting lcl=-Inf and ucl=Inf.
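The four cases above can be summarized with a small helper. This is a sketch only: censor_bounds is a hypothetical function written for this illustration, not part of the package, and the limits 1.5, 1, and 2 are made-up values.

```r
# Hypothetical helper: returns the (lcl, ucl) pair for one censored
# observation, following the rules stated in the Details section.
censor_bounds <- function(type, v1 = NA_real_, v2 = NA_real_) {
  switch(type,
    left     = c(lcl = -Inf, ucl = v1),   # Y_t <= V_t
    right    = c(lcl = v1,   ucl = Inf),  # Y_t >= V_t
    interval = c(lcl = v1,   ucl = v2),   # V_1t <= Y_t <= V_2t
    missing  = c(lcl = -Inf, ucl = Inf),  # no information on Y_t
    stop("unknown censoring type: ", type)
  )
}

# One row per censoring type, with made-up limits.
bounds <- rbind(
  left     = censor_bounds("left", 1.5),
  right    = censor_bounds("right", 1.5),
  interval = censor_bounds("interval", 1, 2),
  missing  = censor_bounds("missing")
)
bounds
```

In every case the corresponding entry of cc is 1, since censored and missing observations share the same indicator.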

The initial values are obtained by ignoring censoring and applying maximum likelihood estimation with the censored data replaced by their censoring limits. To fit a regression model with autoregressive errors to non-censored data, set cc to a vector of zeros.
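The idea behind the initial values can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's internal routine: it uses stats::arima as a stand-in maximum likelihood fitter, a left-censoring limit V chosen as the 10% sample quantile, and made-up parameter values.

```r
set.seed(2)
n <- 150
x <- cbind(b0 = 1, b1 = runif(n))   # named columns so coefficients are easy to pick out

# Simulated AR(1) regression (made-up parameter values).
xi     <- as.numeric(arima.sim(model = list(ar = 0.5), n = n, sd = sqrt(0.3)))
y_true <- as.numeric(x %*% c(2, 1) + xi)

# Left censoring at a hypothetical limit V.
V  <- as.numeric(quantile(y_true, 0.10))
cc <- as.integer(y_true <= V)
y0 <- ifelse(cc == 1, V, y_true)    # censored values replaced by their limit

# ML fit that ignores censoring (stand-in for the package's initialization).
fit_init <- arima(y0, order = c(1, 0, 0), xreg = x,
                  include.mean = FALSE, method = "ML")
cf <- coef(fit_init)
init <- list(beta   = unname(cf[c("b0", "b1")]),
             phi    = unname(cf["ar1"]),
             sigma2 = fit_init$sigma2)
init
```

Because the censored values are replaced by their limits, estimates obtained this way are generally biased; they serve only as a starting point for the SAEM iterations.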

Value

An object of class "ARpCRM", representing the AR(p) censored regression normal fit. Generic functions such as print and summary have methods to show the results of the fit. The function plot provides convergence graphics for the parameters when at least one censored observation exists.

Specifically, the following components are returned:

beta

Estimate of the regression parameters.

sigma2

Estimated variance of the white noise process.

phi

Estimate of the autoregressive parameters.

pi1

Estimate of the first p partial autocorrelations.

theta

Vector of parameter estimates (\beta, \sigma^2, \phi).

SE

Vector of the standard errors of (\beta, \sigma^2, \phi).

loglik

Log-likelihood value.

AIC

Akaike information criterion.

BIC

Bayesian information criterion.

AICcorr

Corrected Akaike information criterion.

yest

Augmented response variable based on the fitted model.

yyest

Final estimate of E(Y%*%t(Y)).

x

Matrix of covariates of dimension n \times l.

iter

Number of iterations until convergence.

criteria

Attained criteria value.

call

The ARCensReg call that produced the object.

tab

Table of estimates.

critFin

Selection criteria.

cens

"left", "right", or "interval" for left, right, or interval censoring, respectively.

nmiss

Number of missing observations.

ncens

Number of censored observations.

converge

Logical indicating convergence of the estimation algorithm.

MaxIter

The maximum number of iterations used for the SAEM algorithm.

M

Size of the Monte Carlo sample generated in each step of the SAEM algorithm.

pc

Percentage of initial iterations of the SAEM algorithm with no memory.

time

Time elapsed in processing.

plot

A list containing convergence information.
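The components phi and pi1 describe the same AR(p) process in two parameterizations. As a hedged illustration of that relationship, the partial autocorrelations implied by a given vector of AR coefficients can be computed in base R with stats::ARMAacf. The coefficients below are the true values used in Example 2, not fitted output, and this sketch does not claim to reproduce the package's internal computation.

```r
# AR coefficients (true values from Example 2, used here for illustration).
phi <- c(0.48, -0.2)

# First p partial autocorrelations implied by phi.
pacf_from_phi <- ARMAacf(ar = phi, lag.max = length(phi), pacf = TRUE)
pacf_from_phi

# For an AR(2) process, the lag-1 value equals phi_1 / (1 - phi_2)
# and the lag-2 value equals phi_2.
c(phi[1] / (1 - phi[2]), phi[2])
```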

Author(s)

Fernanda L. Schumacher, Katherine L. Valeriano, Victor H. Lachos, Christian E. Galarza, and Larissa A. Matos

References

Delyon B, Lavielle M, Moulines E (1999). “Convergence of a stochastic approximation version of the EM algorithm.” The Annals of Statistics, 27(1), 94–128.

Schumacher FL, Lachos VH, Dey DK (2017). “Censored regression models with autoregressive errors: A likelihood-based perspective.” Canadian Journal of Statistics, 45(4), 375–392.

See Also

arima, ARtCensReg, InfDiag

Examples

## Example 1: (p = l = 1)
# Generating a sample
set.seed(23451)
n = 50
x = rep(1, n)
dat = rARCens(n=n, beta=2, phi=.5, sig2=.3, x=x, cens='left', pcens=.1)

# Fitting the model (quick convergence)
fit0 = ARCensReg(dat$data$cc, dat$data$lcl, dat$data$ucl, dat$data$y, x,
                 M=5, pc=.12, tol=0.001, show_se=FALSE)
fit0

## Example 2: (p = l = 2)
# Generating a sample
n = 100
x = cbind(1, runif(n))
dat = rARCens(n=n, beta=c(2,1), phi=c(.48,-.2), sig2=.5, x=x, cens='left', 
              pcens=.05)

# Fitting the model
fit1 = ARCensReg(dat$data$cc, dat$data$lcl, dat$data$ucl, dat$data$y, x,
                 p=2, tol=0.0001)
summary(fit1)
plot(fit1)

# Plotting the augmented variable
library(ggplot2)
data.plot = data.frame(yobs=dat$data$y, yest=fit1$yest)
ggplot(data.plot) + theme_bw() +
  geom_line(aes(x=1:nrow(data.plot), y=yest), color=4, linetype="dashed") +
  geom_line(aes(x=1:nrow(data.plot), y=yobs)) + labs(x="Time", y="y")

## Example 3: Simulating missing values
miss = sample(1:n, 3)
yMISS = dat$data$y
yMISS[miss] = NA
cc = dat$data$cc
cc[miss] = 1
lcl = dat$data$lcl
lcl[miss] = -Inf  # missing values use the interval (-Inf, Inf), see Details
ucl = dat$data$ucl
ucl[miss] = Inf

fit2 = ARCensReg(cc, lcl, ucl, yMISS, x, p=2)
plot(fit2)

# Imputed missing values
data.frame(yobs=dat$data$y[miss], yest=fit2$yest[miss])

[Package ARCensReg version 3.0.1 Index]