modreg {dirttee} | R Documentation |
Mode-regression for right-censored data
Description
This function implements semiparametric kernel-based mode regression for right-censored or uncensored data.
Usage
modreg(
formula,
data = NULL,
bw = c("Pseudo", "Plugin"),
lambda = NULL,
KMweights = NULL,
control = NULL
)
Arguments
formula |
A formula object, with the response on the left of the ‘~’
operator, and the terms on the right. The response must be a
|
data |
A data set on which the regression should be performed.
It should consist of columns that have the names of the specific variables
defined in |
bw |
String, either " |
lambda |
Penalty term for penalized splines. Will be estimated if |
KMweights |
numerical vector, should be the same length as the response. Inverse probability of censoring weights can be provided here. They will be calculated if |
control |
A call to |
Details
Fits mode regression using an iteratively weighted least squares approach. A detailed description of
the approach and algorithm can be found in Seipp et al. (2022). In short, kernel-based mode regression reduces
to repeated minimization of weighted least squares if a normal kernel is assumed. We use gam for estimation in each iteration.
Mode regression is extended to right-censored time-to-event data with inverse probability of censoring weights.
Hyperparameters (bandwidth, penalty) are determined with a pseudo-likelihood approach for bw = "Pseudo".
For "Plugin", plug-in bandwidth selection is performed, as described in Yao and Li (2014). However, this is only justified for uncensored data
and mode regression with linear covariate trends or known transformations.
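The weighted least squares idea described above can be illustrated with a minimal base R sketch. It is not the code used by modreg: it assumes uncensored data, a single linear covariate, and a fixed bandwidth h chosen by hand, and it uses lm.wfit instead of gam. The function name mode_irls and the simulated data are hypothetical and serve only as an illustration.

```r
# Minimal sketch (assumptions: uncensored data, linear predictor, fixed
# bandwidth h). mode_irls() is a hypothetical helper, not part of dirttee.
set.seed(1)
n <- 500
x <- runif(n)
# Right-skewed errors: a gamma(shape = 2) error has mode 1 but mean 2,
# so the conditional mode lies below the conditional mean.
y <- 1 + 2 * x + rgamma(n, shape = 2)

mode_irls <- function(x, y, h, tol = 1e-6, max_iter = 1000) {
  X <- cbind(Intercept = 1, x = x)
  beta <- lm.fit(X, y)$coefficients      # start from ordinary least squares
  converged <- FALSE
  for (iter in seq_len(max_iter)) {
    res <- as.vector(y - X %*% beta)
    w <- dnorm(res / h)                  # normal kernel weights
    beta_new <- lm.wfit(X, y, w)$coefficients
    converged <- max(abs(beta_new - beta)) < tol
    beta <- beta_new
    if (converged) break
  }
  list(coefficients = beta, iterations = iter, converged = converged)
}

ols <- lm.fit(cbind(Intercept = 1, x = x), y)$coefficients
fit <- mode_irls(x, y, h = 0.5)
# The mode regression intercept should sit below the mean regression intercept
# for these right-skewed errors.
print(rbind(ols = ols, mode = fit$coefficients))
```

Each pass gives observations with small residuals a large weight and observations far from the current fit a weight close to zero, which is what pulls the fit towards the conditional mode rather than the mean.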
The event time has to be supplied using the Surv function. Positive event times with multiplicative relationships should be log-transformed
beforehand. Nonlinear trends can be estimated with P-splines, specified as s(covariate, bs = "ps"). This term is passed on to gam, which is why
the same notation is used. Other smooth terms have not been tested yet. The whole gam object is returned, but its standard errors and other information are not
valid. boot.modreg can be used to calculate standard errors and confidence intervals.
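To give an idea of what inverse probability of censoring weights look like, the following base R sketch computes them from a Kaplan-Meier estimate of the censoring distribution. It is only an illustration under stated assumptions (no tied times, weights defined as delta / G(t-)); the helper ipcw_weights is hypothetical, and the weights computed internally by modreg may be scaled or defined differently.

```r
# Minimal sketch (assumptions: no tied observation times; weight for each
# observation is delta / G(t-), with G the Kaplan-Meier estimate of the
# censoring survival function). ipcw_weights() is a hypothetical helper.
ipcw_weights <- function(time, delta) {
  n <- length(time)
  ord <- order(time)
  d_cens <- 1 - delta[ord]             # 1 where the ordered observation is censored
  at_risk <- n - seq_len(n) + 1        # size of the risk set at each ordered time
  G <- cumprod(1 - d_cens / at_risk)   # KM estimate of P(C > t) at ordered times
  G_minus <- c(1, G[-n])               # left limit G(t-)
  w <- numeric(n)
  w[ord] <- delta[ord] / G_minus       # censored observations get weight 0
  w
}

# Simulated right-censored data
set.seed(2)
n <- 200
event_time  <- rexp(n, rate = 1)
censor_time <- rexp(n, rate = 0.5)
time  <- pmin(event_time, censor_time)
delta <- as.numeric(event_time <= censor_time)

w <- ipcw_weights(time, delta)
summary(w[delta == 1])
```

Censored observations receive weight zero, while observed events are weighted up, and more so at late times where censoring has already removed many subjects from the sample.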
Value
This function returns a list with the following properties:
reg |
object of class gam. Should be interpreted with care. |
bw |
The bandwidth used. |
converged |
logical. Whether or not the iteratively weighted least squares algorithm converged. |
iterations |
the number of iterations of the final weighted least squares fit |
cova |
Covariance matrix. Only supplied in case of linear terms and plug-in bandwidth. |
KMweights |
double vector. Weights used. |
called |
list. The arguments that were provided. |
aic |
Pseudo AIC. |
pseudologlik |
Pseudo log-likelihood. |
edf |
Effective degrees of freedom |
delta |
vector indicating whether an event has occurred (1) or not (0) in the input data. |
response |
vector with response values |
hp_opt |
Summary of hyperparameter estimation. |
References
Seipp, A., Uslar, V., Weyhe, D., Timmer, A., & Otto-Sobotka, F. (2022). Flexible Semiparametric Mode Regression for Time-to-Event Data. Manuscript submitted for publication.
Yao, W., & Li, L. (2014). A new regression model: modal linear regression. Scandinavian Journal of Statistics, 41(3), 656-671.
Examples
data(colcancer)
colcancer80 <- colcancer[1:80, ]
# linear trend
regL <- modreg(Surv(logfollowup, death) ~ sex + age, data = colcancer80)
summary(regL)
# mode regression with P-splines. Convergence criteria are changed to speed up the function
reg <- modreg(Surv(logfollowup, death) ~ sex + s(age, bs = "ps"), data = colcancer80,
control = modreg.control(tol_opt = 10^-2, tol_opt2 = 10^-2, tol = 10^-3))
summary(reg)
plot(reg)
# with a fixed penalty
reg2 <- modreg(Surv(logfollowup, death) ~ sex + s(age, bs = "ps"), data = colcancer80, lambda = 0.1)
# for linear effects and uncensored data, we can use the plug-in bandwidth
regP <- modreg(age ~ sex, data = colcancer, bw = "Plugin")