R: Mode-regression for right-censored data

modreg {dirttee}

R Documentation

Description

This function implements semiparametric kernel-based mode regression for right-censored or full data.

Usage

modreg(
  formula,
  data = NULL,
  bw = c("Pseudo", "Plugin"),
  lambda = NULL,
  KMweights = NULL,
  control = NULL
)

Arguments

`formula`	A formula object, with the response on the left of the ‘~’ operator and the terms on the right. The response must be a `Surv` object as returned by the `Surv` function. Only right-censored data are allowed.
`data`	A data set on which the regression should be performed. Its column names should match the variables used in `formula`. If `NULL`, the function will look for the data in the environment given by the `formula` argument.
`bw`	Either one of the strings "`Pseudo`" or "`Plugin`", or a fixed numerical value. This determines how the bandwidth is estimated. "`Plugin`" is only recommended for uncensored linear mode regression.
`lambda`	Penalty term for penalized splines. Will be estimated if `NULL`.
`KMweights`	Numerical vector of the same length as the response. Inverse probability of censoring weights can be provided here. They will be calculated if `NULL`.
`control`	A call to `modreg.control`. Various control parameters can be supplied here.

Details

Fits mode regression with an iteratively weighted least squares approach. A detailed description of the approach and algorithm can be found in Seipp et al. (2022). In short, if a normal kernel is assumed, kernel-based mode regression leads to the minimization of a weighted least squares criterion. We use gam for estimation in each iteration. Mode regression is extended to right-censored time-to-event data with inverse probability of censoring weights. For bw = "Pseudo", the hyperparameters (bandwidth, penalty) are determined with a pseudo-likelihood approach. For bw = "Plugin", plug-in bandwidth selection is performed as described in Yao and Li (2014). However, this is only justified for uncensored data and mode regression with linear covariate trends or known transformations.
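The weighted least squares iteration can be illustrated with a minimal base R sketch. It assumes uncensored data, a linear predictor and a fixed bandwidth `h`. The function `mode_irls` and its arguments are hypothetical names chosen for illustration and are not part of dirttee, which additionally relies on gam, censoring weights and hyperparameter estimation.

```r
# Sketch only: linear mode regression with a normal kernel and fixed bandwidth h.
# mode_irls is a hypothetical helper, not a dirttee function.
mode_irls <- function(X, y, h, tol = 1e-6, max_iter = 1000) {
  beta <- unname(lm.fit(X, y)$coefficients)   # start from ordinary least squares
  converged <- FALSE
  for (iter in seq_len(max_iter)) {
    res <- drop(y - X %*% beta)               # current residuals
    w <- dnorm(res / h)                       # normal-kernel weights
    beta_new <- unname(lm.wfit(X, y, w)$coefficients)  # weighted least squares step
    converged <- max(abs(beta_new - beta)) < tol
    beta <- beta_new
    if (converged) break
  }
  list(coefficients = beta, iterations = iter, converged = converged)
}

# Simulated data with right-skewed errors: the error mode is 0, the error mean is 1
set.seed(1)
n <- 500
x <- runif(n)
e <- rgamma(n, shape = 2, rate = 1) - 1
y <- 1 + 2 * x + e
X <- cbind(1, x)

fit_mode <- mode_irls(X, y, h = 0.5)
fit_mean <- unname(lm.fit(X, y)$coefficients)
```

Because the simulated errors are right-skewed, the fitted mode line in `fit_mode$coefficients` should lie below the least squares line in `fit_mean`. Observations far from the current fit receive small kernel weights, which is what pulls the estimate towards the conditional mode.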

The event time has to be supplied using the Surv function. Positive event times with multiplicative relationships should be log-transformed beforehand. Nonlinear trends can be estimated with P-splines, indicated by using s(covariate, bs = "ps"). These terms are passed on to gam, which is why the same notation is used. Other smooth terms have not been tested yet. The whole gam object is returned, but its standard errors and other information are not valid. boot.modreg can be used to calculate standard errors and confidence intervals.
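To illustrate the inverse probability of censoring weights that can be supplied through `KMweights`, the following base R sketch shows the textbook construction: each observed event is weighted by the inverse of the Kaplan-Meier estimate of the censoring survival function just before its event time, and censored observations get weight 0. The helper `ipcw_weights` is a hypothetical name, the sketch assumes no tied times, and the internal calculation in dirttee may differ, for example in tie handling or normalization.

```r
# Sketch only: inverse probability of censoring weights, assuming no tied times.
# ipcw_weights is a hypothetical helper, not a dirttee function.
ipcw_weights <- function(time, delta) {
  n <- length(time)
  ord <- order(time)                  # process observations in time order
  cens <- 1 - delta[ord]              # censoring is the "event" for G
  at_risk <- n - seq_len(n) + 1       # number still at risk at each ordered time
  G <- cumprod(1 - cens / at_risk)    # Kaplan-Meier estimate of P(C > t)
  G_minus <- c(1, G[-n])              # left limit G(t-), used to weight events
  w <- numeric(n)
  w[ord] <- delta[ord] / G_minus      # censored observations get weight 0
  w
}

time  <- c(2, 5, 1, 4, 3)
delta <- c(0, 1, 1, 0, 1)             # 1 = event, 0 = censored
w <- ipcw_weights(time, delta)
```

Later events receive larger weights because they stand in for the censored observations that dropped out before them. When the largest observed time is an event, as in this toy example, the weights sum to the sample size.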

Value

This function returns a list with the following properties:

`reg`	object of class gam. Should be interpreted with care.
`bw`	The used bandwidth.
`converged`	logical. Whether or not the iteratively weighted least squares algorithm converged.
`iterations`	the number of iterations of the final weighted least squares fit
`cova`	Covariance matrix. Only supplied in case of linear terms and plug-in bandwidth.
`KMweights`	double vector. Weights used.
`called`	list. The arguments that were provided.
`aic`	Pseudo AIC.
`pseudologlik`	Pseudo log-likelihood.
`edf`	Effective degrees of freedom
`delta`	vector. Indicating whether an event has occurred (1) or not (0) in the input data.
`response`	vector with response values
`hp_opt`	Summary of hyperparameter estimation.

References

Seipp, A., Uslar, V., Weyhe, D., Timmer, A., & Otto-Sobotka, F. (2022). Flexible Semiparametric Mode Regression for Time-to-Event Data. Manuscript submitted for publication.
Yao, W., & Li, L. (2014). A new regression model: modal linear regression. Scandinavian Journal of Statistics, 41(3), 656-671.

Examples



# load the package that provides modreg()
library(dirttee)
data(colcancer)
colcancer80 <- colcancer[1:80, ]

# linear trend
regL <- modreg(Surv(logfollowup, death) ~ sex + age, data = colcancer80)
summary(regL)

# mode regression with P-splines
# convergence criteria are relaxed to speed up the function
reg <- modreg(Surv(logfollowup, death) ~ sex + s(age, bs = "ps"), data = colcancer80,
              control = modreg.control(tol_opt = 10^-2, tol_opt2 = 10^-2, tol = 10^-3))
summary(reg)
plot(reg)

# with a fixed penalty
reg2 <- modreg(Surv(logfollowup, death) ~ sex + s(age, bs = "ps"), data = colcancer80, lambda = 0.1)

# for linear effects and uncensored data, we can use the plug-in bandwidth
regP <- modreg(age ~ sex, data = colcancer, bw = "Plugin")

[Package dirttee version 1.0.2]