modreg {dirttee}    R Documentation

Mode-regression for right-censored data

Description

This function implements semiparametric kernel-based mode regression for right-censored or uncensored (fully observed) data.

Usage

modreg(
  formula,
  data = NULL,
  bw = c("Pseudo", "Plugin"),
  lambda = NULL,
  KMweights = NULL,
  control = NULL
)

Arguments

formula

A formula object, with the response on the left of the ‘~’ operator and the terms on the right. The response must be a Surv object as returned by the Surv function. Only right-censored data are allowed.

data

A data set on which the regression should be performed. Its columns should carry the names of the variables used in formula. If NULL, the function will look for the data in the environment given by the formula argument.

bw

Either one of the strings "Pseudo" or "Plugin", or a fixed numerical value. This determines how the bandwidth is chosen. "Plugin" is only recommended for uncensored linear mode regression.

lambda

Penalty term for penalized splines. Will be estimated if NULL.

KMweights

Numerical vector of the same length as the response. Inverse probability of censoring weights can be provided here. They will be calculated if NULL.

control

A call to modreg.control. Various control parameters can be supplied here.
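As a rough illustration of what could be passed to KMweights, the sketch below computes inverse probability of censoring weights from a Kaplan-Meier estimate of the censoring distribution. This is one common textbook construction and an assumption here: the help page does not state how modreg computes its default weights. The toy vectors time and delta and the helper G_minus are hypothetical names.

```r
library(survival)

# Toy data (hypothetical): observed times and event indicator
# (1 = event observed, 0 = right-censored)
time  <- c(2.1, 3.4, 0.9, 5.0, 4.2, 1.7, 6.3, 2.8)
delta <- c(1,   0,   1,   1,   0,   1,   0,   1)

# Kaplan-Meier estimate of the censoring survival function G(t) = P(C > t).
# Censoring plays the role of the "event" here, hence 1 - delta.
km_cens <- survfit(Surv(time, 1 - delta) ~ 1)

# Left-continuous step function, so that G_minus(t) returns G(t-)
G_minus <- stepfun(km_cens$time, c(1, km_cens$surv), right = TRUE)

# Inverse probability of censoring weights:
# zero for censored observations, 1 / G(t-) for observed events
ipcw <- delta / G_minus(time)
```

A vector such as ipcw has the same length as the response, which is the only requirement on KMweights stated above.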

Details

Fits mode regression using an iteratively weighted least squares approach. A detailed description of the approach and algorithm can be found in Seipp et al. (2022). In short, kernel-based mode regression leads to the minimization of weighted least squares if a normal kernel is assumed. We use gam for estimation in each iteration. Mode regression is extended to right-censored time-to-event data with inverse probability of censoring weights. For bw = "Pseudo", the hyperparameters (bandwidth, penalty) are determined with a pseudo-likelihood approach. For bw = "Plugin", plug-in bandwidth selection is performed as described in Yao and Li (2014). However, this is only justified for uncensored data and mode regression with linear covariate trends or known transformations.
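The link between the normal kernel and weighted least squares can be sketched as follows. The notation is an assumption made for illustration and is not taken from the package: y_i is the (possibly log-transformed) response, x_i the covariate vector of a purely linear predictor, w_i a fixed observation weight such as an inverse probability of censoring weight, h the bandwidth and \phi the standard normal density. The exact objective used by modreg is described in Seipp et al. (2022).

```latex
\begin{align*}
\hat\beta &= \arg\max_{\beta} \; \frac{1}{n}\sum_{i=1}^{n} w_i \,\frac{1}{h}\,
  \phi\!\left(\frac{y_i - x_i^\top\beta}{h}\right), \\
0 &= \sum_{i=1}^{n} w_i \,\phi\!\left(\frac{y_i - x_i^\top\beta}{h}\right)
  (y_i - x_i^\top\beta)\,x_i
  \qquad \text{since } \phi'(u) = -u\,\phi(u), \\
\beta^{(k+1)} &= \arg\min_{\beta} \sum_{i=1}^{n} \omega_i^{(k)}\,(y_i - x_i^\top\beta)^2,
  \qquad \omega_i^{(k)} = w_i\,\phi\!\left(\frac{y_i - x_i^\top\beta^{(k)}}{h}\right).
\end{align*}
```

The second line is the first-order condition of the kernel objective. It has the form of the normal equations of a weighted least squares problem whose weights depend on the current residuals, which motivates the fixed-point iteration in the third line.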

The event time has to be supplied using the Surv function. Positive event times with multiplicative relationships should be log-transformed beforehand. Nonlinear trends can be estimated with P-splines, indicated by using s(covariate, bs = "ps"). This term is passed down to gam, which is why the same notation is used. Other smooth terms have not been tested yet. The whole gam object is returned, but its standard errors and other inferential information are not valid. boot.modreg can be used to calculate standard errors and confidence intervals.
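The iteratively weighted least squares idea from the Details can be illustrated with a minimal base R sketch. It is not the package's implementation: it assumes uncensored data, a purely linear predictor, a bandwidth h fixed by hand, and lm.wfit in place of gam. All object names (X, h, prior_w, beta and so on) are hypothetical.

```r
set.seed(1)

# Simulated uncensored data (hypothetical). The error term is right-skewed with
# mode 0 and mean 0.5, so the conditional mode of y is 1 + 2 * x while the
# conditional mean is 1.5 + 2 * x.
n <- 500
x <- runif(n)
y <- 1 + 2 * x + rgamma(n, shape = 2, rate = 2) - 0.5

X <- cbind(1, x)       # design matrix with intercept
h <- 0.3               # fixed bandwidth, chosen by hand for this sketch
prior_w <- rep(1, n)   # slot for censoring weights; all 1 for uncensored data

beta_ols <- lm.fit(X, y)$coefficients  # ordinary least squares as starting value
beta <- beta_ols
converged <- FALSE

for (iter in 1:1000) {
  res <- as.vector(y - X %*% beta)
  w <- prior_w * dnorm(res / h)              # normal-kernel weights
  beta_new <- lm.wfit(X, y, w)$coefficients  # weighted least squares step
  delta_beta <- max(abs(beta_new - beta))
  beta <- beta_new
  if (delta_beta < 1e-6) {                   # stop once coefficients stabilize
    converged <- TRUE
    break
  }
}
```

In this simulated setting the intercept in beta should end up below the ordinary least squares intercept in beta_ols, because the mode of a right-skewed error distribution lies below its mean.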

Value

This function returns a list with the following properties:

reg

object of class gam. Should be interpreted with care.

bw

The used bandwidth.

converged

logical. Whether or not the iteratively weighted least squares algorithm converged.

iterations

the number of iterations of the final weighted least squares fit

cova

Covariance matrix. Only supplied in case of linear terms and plug-in bandwidth.

KMweights

double vector. Weights used.

called

list. The arguments that were provided.

aic

Pseudo AIC.

pseudologlik

Pseudo log-likelihood.

edf

Effective degrees of freedom

delta

vector indicating whether an event has occurred (1) or not (0) in the input data.

response

vector with response values

hp_opt

Summary of hyperparameter estimation.

References

Seipp, A., Uslar, V., Weyhe, D., Timmer, A., & Otto-Sobotka, F. (2022). Flexible Semiparametric Mode Regression for Time-to-Event Data. Manuscript submitted for publication.
Yao, W., & Li, L. (2014). A new regression model: modal linear regression. Scandinavian Journal of Statistics, 41(3), 656-671.

Examples



data(colcancer)
colcancer80 <- colcancer[1:80, ]

# linear trend
regL <- modreg(Surv(logfollowup, death) ~ sex + age, data = colcancer80)
summary(regL)


# mode regression with P-splines. Convergence criteria are changed to speed up the function
reg <- modreg(Surv(logfollowup, death) ~ sex + s(age, bs = "ps"), data = colcancer80, 
control = modreg.control(tol_opt = 10^-2, tol_opt2 = 10^-2, tol = 10^-3))
summary(reg)
plot(reg)

# with a fixed penalty
reg2 <- modreg(Surv(logfollowup, death) ~ sex + s(age, bs = "ps"), data = colcancer80, lambda = 0.1)

# for linear effects and uncensored data, we can use the plug-in bandwidth
regP <- modreg(age ~ sex, data = colcancer, bw = "Plugin")




[Package dirttee version 1.0.1 Index]