R: Skew-t quantile regression for censored and missing data

cens.lqr {lqr}

R Documentation

Skew-t quantile regression for censored and missing data

Description

Fits a linear quantile regression model in which the error term follows the SKT skew-t distribution proposed by Wichitaksorn et al. (2014). The model can handle missing and interval-censored data at the same time. The degrees of freedom can be either estimated or supplied by the user. The function returns estimates and full inference, together with envelope plots and likelihood-based criteria for assessing the fit, as well as fitted and imputed values.

Usage

cens.lqr(y,x,cc,LL,UL,p=0.5,nu=NULL,precision=1e-06,envelope=FALSE)

Arguments

`y`	the response vector of dimension `n`, where `n` is the total number of observations. It may contain both missing and censored values, represented by `NaN`s.
`x`	design matrix for the fixed effects of dimension `n x d`, where `d` is the number of fixed effects, including the intercept if one is considered.
`cc`	vector of censoring/missing indicators. For each observation it takes 0 if the response is observed and 1 if it is censored or missing.
`LL`	the vector of lower limits of dimension `n`x`1`. See the Details section.
`UL`	the vector of upper limits of dimension `n`x`1`. See the Details section.
`p`	a single quantile of interest (0 < `p` < 1) at which to fit the quantile regression.
`nu`	the degrees of freedom of the skew-t distribution. When it is not provided (`NULL`), it is estimated by maximum likelihood.
`precision`	the maximum error permitted for convergence. The default is 10^-6.
`envelope`	if `TRUE`, a confidence envelope for a curve based on bootstrap replicates is shown. It is `FALSE` by default.

Details

Missing or censored values in the response can be represented as NaNs, since the algorithm only uses the information provided in the lower and upper limits LL and UL. The indicator vector cc must take the value 1 for these observations.
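The requirements above can be checked before fitting. The following is a minimal sketch in base R; `check_cens_inputs` is a hypothetical helper, not part of lqr, and the rule that `LL` must not exceed `UL` is an assumption rather than something this page states.

```r
# Hypothetical helper (not part of lqr): basic consistency checks on the
# inputs described above, run before calling cens.lqr().
check_cens_inputs <- function(y, cc, LL, UL) {
  n <- length(y)
  stopifnot(length(cc) == n, length(LL) == n, length(UL) == n)
  stopifnot(all(cc %in% c(0, 1)))
  flagged <- cc == 1
  # Every NA/NaN response must be flagged as censored/missing
  if (any(is.na(y) & !flagged)) stop("NA/NaN responses must have cc = 1")
  # Flagged observations need non-NA limits
  if (anyNA(LL[flagged]) || anyNA(UL[flagged])) stop("limits are required where cc = 1")
  # Assumption: a lower limit should never exceed its upper limit
  if (any(LL[flagged] > UL[flagged])) stop("LL must not exceed UL")
  invisible(TRUE)
}

# Illustrative (made-up) values: observations 2 and 4 are censored/missing
y  <- c(21.3, NaN, 24.8, NaN)
cc <- c(0, 1, 0, 1)
LL <- c(NA, 18, NA, -Inf)
UL <- c(NA, 25, NA, Inf)
ok <- check_cens_inputs(y, cc, LL, UL)
```

The limits of observed responses (`cc = 0`) are left unchecked here, since only the limits of flagged observations carry information.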

*Censored and missing data*

If all lower limits are -Inf, the data are left-censored. Likewise, if all upper limits are Inf, the data are right-censored. Interval censoring occurs when both limits are finite. If an observation is missing, there is no information about it at all, so both limits must be infinite.

Combinations of all cases above are permitted, that is, we may have left-censored, right-censored, interval-censored and missing data at the same time.
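As an illustration of how the limits encode each case, the sketch below labels every observation from `cc`, `LL` and `UL`. The function `censoring_type` is a hypothetical helper written for this example, not part of lqr, and the values are made up.

```r
# Hypothetical helper (not part of lqr): label each observation according
# to the rules above, using only cc and the limits LL and UL.
censoring_type <- function(cc, LL, UL) {
  type <- rep("observed", length(cc))
  flagged <- cc == 1
  # %in% returns FALSE for NA, so unset limits of observed values are safe
  lower_open <- flagged & (LL %in% -Inf)
  upper_open <- flagged & (UL %in% Inf)
  type[lower_open & !upper_open] <- "left"
  type[flagged & !lower_open & upper_open] <- "right"
  type[flagged & !lower_open & !upper_open] <- "interval"
  type[lower_open & upper_open] <- "missing"
  type
}

# One observation of each kind (illustrative values)
cc <- c(0, 1, 1, 1, 1)
LL <- c(NA, -Inf, 20, 18, -Inf)
UL <- c(NA, 22, Inf, 25, Inf)
types <- censoring_type(cc, LL, UL)
print(types)
```

A mixed vector like this one corresponds to the combined case, where all kinds of censoring and missingness appear in the same data set.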

Value

`iter`	number of iterations.
`criteria`	attained criteria value.
`beta`	fixed effects estimates.
`sigma`	scale parameter estimate for the error term.
`nu`	Estimate of `nu` parameter detailed above.
`SE`	Standard Error estimates.
`table`	Table containing the inference for the fixed effects parameters.
`loglik`	Log-likelihood value.
`AIC`	Akaike information criterion.
`BIC`	Bayesian information criterion.
`HQ`	Hannan-Quinn information criterion.
`fitted.values`	vector containing the fitted values.
`imputed.values`	vector containing the imputed values for censored/missing observations.
`residuals`	vector containing the residuals.
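For orientation, the three information criteria can be related to the log-likelihood as in the sketch below. It assumes the standard textbook definitions with `k` free parameters and `n` observations; how lqr counts parameters, and whether it uses exactly these forms, is not stated on this page. `info_criteria` is a hypothetical helper and the input values are made up.

```r
# Hypothetical helper (not part of lqr): textbook definitions of the
# criteria, given a log-likelihood, k free parameters and n observations.
info_criteria <- function(loglik, k, n) {
  c(AIC = -2 * loglik + 2 * k,
    BIC = -2 * loglik + k * log(n),
    HQ  = -2 * loglik + 2 * k * log(log(n)))
}

# Illustrative values only, not taken from a fitted model
ic <- info_criteria(loglik = -250, k = 5, n = 200)
print(ic)
```

Smaller values indicate a better trade-off between fit and complexity, so the criteria are mainly useful for comparing fits to the same data, for example under different values of `nu`.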

Author(s)

Christian E. Galarza <chedgala@espol.edu.ec>, Marcelo Bourguignon <m.p.bourguignon@gmail.com> and Victor H. Lachos <hlachos@ime.unicamp.br>

Maintainer: Christian E. Galarza <chedgala@espol.edu.ec>

References

Galarza, C., Lachos, V. H., & Bourguignon, M. (2021). A skew-t quantile regression for censored and missing data. Stat. doi:10.1002/sta4.379.

Galarza, C., Lachos, V. H., Cabral, C. R. B., & Castro, C. L. (2017). Robust quantile regression using a generalized class of skewed distributions. Stat, 6(1), 113-130.

Wichitaksorn, N., Choy, S. B., & Gerlach, R. (2014). A generalized class of skew distributions and associated robust quantile regression models. Canadian Journal of Statistics, 42(4), 579-596.

Examples


##Load the package and the data
library(lqr)
data(ais)
attach(ais)

##Setting
y<-BMI
x<-cbind(1,LBM,Sex)

cc = rep(0,length(y))
LL = UL = rep(NA,length(y))

#Flagging roughly 5% of the observations as interval-censored
set.seed(1) #arbitrary seed, only so that the censored subset is reproducible
ind = sample(x = c(0,1),size = length(y),
             replace = TRUE,prob = c(0.95,0.05))
ind1 = (ind == 1)

cc[ind1] = 1
LL[ind1] = y[ind1] - 10
UL[ind1] = y[ind1] + 10
y[ind1] = NA #deleting data

#Fitting the model

# A median regression with unknown degrees of freedom
out = cens.lqr(y,x,cc,LL,UL,p=0.5,nu = NULL,precision = 1e-6,envelope = TRUE)

# A first quartile regression with 10 degrees of freedom
out = cens.lqr(y,x,cc,LL,UL,p=0.25,nu = 10,precision = 1e-6,envelope = TRUE)

[Package lqr version 5.2 Index]