R: Train a Multi-Task Logistic Regression (MTLR) Model

mtlr {MTLR}

R Documentation

Train a Multi-Task Logistic Regression (MTLR) Model

Description

Trains an MTLR model for survival prediction. Right, left, and interval censored data are all supported.

Usage

mtlr(formula, data, time_points = NULL, nintervals = NULL,
  normalize = T, C1 = 1, train_biases = T, train_uncensored = T,
  seed_weights = NULL, threshold = 1e-05, maxit = 5000,
  lower = -15, upper = 15)

Arguments

`formula`	a formula object with the response to the left of the "~" operator. The response must be a survival object returned by the `Surv` function.
`data`	a data.frame containing the features for survival prediction. These must be variables corresponding to the formula object.
`time_points`	the time points for MTLR to create weights. If left as NULL, the time_points chosen will be based on equally spaced quantiles of the survival times. In the case of interval censored data note that only the start time is considered and not the end time for selecting time points. It is strongly recommended to specify time points if your data is heavily interval censored. If time_points is not NULL then nintervals is ignored.
`nintervals`	Number of time intervals to use for MTLR. Note the number of time points will be nintervals + 1. If left as NULL a default of sqrt(N) is used where N is the number of observations in the supplied dataset. This parameter is ignored if time_points is specified.
`normalize`	if TRUE, variables will be normalized (mean 0, standard deviation of 1). This is STRONGLY suggested. If normalization does not occur it is much more likely that MTLR will fail to converge. Additionally, if FALSE consider adjusting "lower" and "upper" used for L-BFGS-B optimization.
`C1`	The L2 regularization parameter for MTLR. C1 can also be selected via `mtlr_cv`. See "Learning Patient-Specific Cancer Survival Distributions as a Sequence of Dependent Regressors" by Yu et al. (2011) for details.
`train_biases`	if TRUE, biases will be trained before feature weights (and trained again while training feature weights). This has been shown to speed up total training time.
`train_uncensored`	if TRUE, one round of training will occur assuming all event times are uncensored. This is done because of the non-convexity issue that arises in the presence of censored data. However, if ALL data is censored we recommend setting this option to FALSE, as it has been shown to give poor results in this case.
`seed_weights`	the initialization weights for the biases and the features. If left as NULL all weights are initialized to zero. If seed_weights are specified then either nintervals or time_points must also be specified. The length of seed_weights should equal (number of features + 1) × (length of time_points) = (number of features + 1) × (nintervals + 1).
`threshold`	The threshold for the convergence tolerance (in the objective function) when training the feature weights. This threshold will be passed to optim.
`maxit`	The maximum iterations to run for MTLR. This parameter will be passed to optim.
`lower`	The lower bound for L-BFGS-B optimization. This parameter will be passed to optim.
`upper`	The upper bound for L-BFGS-B optimization. This parameter will be passed to optim.
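
To illustrate how the defaults for time_points and nintervals interact, the following base R sketch uses sqrt(N) intervals and places nintervals + 1 time points at equally spaced quantiles of the observed times. It is an illustration only: the helper `sketch_time_points`, the vector `obs_times`, the rounding of sqrt(N), and the exact quantile levels are assumptions and may differ from what mtlr does internally.

```r
# Hypothetical helper: not part of the MTLR package.
sketch_time_points <- function(obs_times, nintervals = NULL) {
  obs_times <- obs_times[!is.na(obs_times)]
  if (is.null(nintervals)) {
    # Assumption: sqrt(N) is rounded down to a whole number of intervals.
    nintervals <- floor(sqrt(length(obs_times)))
  }
  # nintervals + 1 equally spaced quantile levels between 0 and 1.
  probs <- seq(0, 1, length.out = nintervals + 1)
  # Drop duplicates so tied survival times do not yield repeated time points.
  unique(unname(quantile(obs_times, probs)))
}

# Made-up survival times, N = 16, so sqrt(N) = 4 intervals.
obs_times <- c(5, 8, 12, 13, 18, 23, 27, 30, 33, 43, 45, 48, 59, 62, 70, 90)
tp <- sketch_time_points(obs_times)
tp
length(tp)  # 5 time points for 4 intervals
```

Passing an explicit time_points vector to mtlr bypasses any such default, which is why specifying it is recommended for heavily interval censored data.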

Details

This function trains an MTLR model on a dataset containing survival data. mtlr uses the limited-memory Broyden–Fletcher–Goldfarb–Shanno method with box constraints (L-BFGS-B) to train feature weights. The optimization is delegated to R's optim function. Currently only a few settings of optim are exposed (namely threshold, maxit, lower, and upper); more will likely become available in the future.
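
As a rough illustration of how such a call to optim looks, the sketch below fits a small L2-regularized logistic regression with method = "L-BFGS-B", using the same default bounds and iteration limit as mtlr. The toy data and the `objective` function are stand-ins, not the MTLR objective, and translating `threshold` into optim's `factr` control is an assumption made here for illustration.

```r
# Toy data: one feature, binary outcome (hypothetical, for illustration only).
set.seed(1)
x <- rnorm(100)
y <- rbinom(100, 1, plogis(0.5 + 1.5 * x))

# Stand-in objective: L2 penalty on the feature weight (scaled by C1)
# plus the negative log-likelihood of ordinary logistic regression.
objective <- function(w, C1 = 1) {
  eta <- w[1] + w[2] * x
  0.5 * C1 * sum(w[-1]^2) + sum(log1p(exp(eta)) - y * eta)
}

threshold <- 1e-05
fit <- optim(
  par = c(0, 0),              # weights initialized to zero
  fn = objective,
  method = "L-BFGS-B",
  lower = -15, upper = 15,    # box constraints applied to every weight
  control = list(
    maxit = 5000,
    # Assumption: factr is chosen so the objective tolerance equals threshold.
    factr = threshold / .Machine$double.eps
  )
)
fit$par
fit$convergence  # 0 means optim reports successful convergence
```

If the optimum sits on one of the bounds, widening lower and upper (or normalizing the features) is the usual first thing to try.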

Weights are initialized to 0 prior to training. Under default settings, the bias weights will be trained before considering feature weights. As Yu et al. (2011) specified, the introduction of censored observations creates a non-convex loss function. To address this, weights are first trained assuming all event times are uncensored. Once these starting weights have been trained, another round of training is performed using the true values of the event indicator (censored/uncensored). However, when all data is censored this has been shown to negatively affect the results. If all data is censored (either left, right, or interval2) we suggest setting train_uncensored = FALSE.

Yu et al. (2011) suggested two regularization parameters: C1 to control the size of the feature weights and C2 to control the smoothness. In his master's thesis (Using Survival Prediction Techniques to Learn Consumer-Specific Reservation Price Distributions), Ping Jin showed that C2 is not required for smoothness and C1 will suffice (Appendix A.2), so we do not support the C2 parameter in this implementation.

If an error occurs from optim it is likely that the weights are getting too large. Including fewer time points (or specifying better time points) in addition to changing the lower/upper bounds of L-BFGS-B may resolve these issues. The most common failure has been that the objective function evaluates to an infinite value due to extremely large feature weights.

Censored data: Right, left, and interval censored data are all supported both separately and mixed. The convention to input these types of data follows the Surv object format. Per the Surv documentation, "The [interval2] approach is to think of each observation as a time interval with (-infinity, t) for left censored, (t, infinity) for right censored, (t,t) for exact and (t1, t2) for an interval. This is the approach used for type = interval2. Infinite values can be represented either by actual infinity (Inf) or NA." See the examples below for an example of inputting this type of data.
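
The interval2 convention can be checked directly with the survival package. The short sketch below builds one observation of each kind and inspects the status codes that Surv stores (0 = right censored, 1 = exact event, 2 = left censored, 3 = interval censored, as described in the Surv documentation). The vectors are made up for illustration.

```r
library(survival)

# One observation of each kind, using NA for the open end of an interval.
time1 <- c(NA, 4, 7, 10)   # NA in time1: left censored
time2 <- c(14, 4, 10, NA)  # NA in time2: right censored
s <- Surv(time1, time2, type = "interval2")
s

# Status codes stored by Surv: 0 = right censored, 1 = event (exact),
# 2 = left censored, 3 = interval censored.
status <- unclass(s)[, "status"]
status
```

Here the four observations are, in order, left censored, exact, interval censored, and right censored.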

Value

An mtlr object containing the following components:

weight_matrix: The matrix of feature weights determined by MTLR.
x: The dataframe of features (response removed). Note observations with missing values will have been removed (this is the dataset on which MTLR was trained).
y: The matrix of response values MTLR uses for training. Each column corresponds to an observation and each row to a time point. A value of 1 indicates an observation was either censored or had its event occur by that time.
response: The response as a Surv object (specified by formula).
time_points: The time points selected and used to train MTLR.
C1: The regularization parameter used.
Call: The original call to mtlr.
Terms: The x-value terms used in mtlr. These are later used in predict.mtlr.
scale: The means and standard deviations of features when normalize = TRUE. These are used in predict.mtlr. Will be NULL if normalize = FALSE.
xlevels: The levels of the features used. This is used again by predict.mtlr.
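
The layout of the y component can be pictured with a small base R sketch: rows are time points, columns are observations, and an entry is 1 once the time point has reached the observation's recorded time. The vectors below are made up, and the exact comparison used (here `>=`) is an assumption rather than the package's documented encoding.

```r
# Hypothetical time points and observed (event or censoring) times.
time_points <- c(2, 5, 8)
obs_times <- c(1, 6, 9)

# y[i, j] is 1 if time point i is at or past the recorded time of observation j.
y <- outer(time_points, obs_times, FUN = function(tp, t) as.numeric(tp >= t))
dimnames(y) <- list(
  paste0("t=", time_points),
  paste0("obs", seq_along(obs_times))
)
y
```

In this toy case the first observation is marked 1 at every time point, the second only at the last time point, and the third at none of them.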

Examples

#Access the Surv function and the leukemia/lung datasets.
library(survival)
simple_mod <- mtlr(Surv(time,status)~., data = leukemia)
simple_mod

bigger_mod <- mtlr(Surv(time,status)~., data = lung)
bigger_mod

#Note that observations with missing data were removed:
nrow(lung)
nrow(bigger_mod$x)


# Mixed censoring types
time1 <- c(NA, 4, 7, 12, 10, 6, NA, 3) #NA for left censored (no lower bound)
time2 <- c(14, 4, 10, 12, NA, 9, 5, NA) #NA for right censored (no upper bound)
#time1 == time2 indicates an exact death time. time2 > time1 indicates interval censored.
set.seed(42)
dat <- cbind.data.frame(time1, time2, importantfeature = rnorm(8))
formula <- Surv(time1,time2,type = "interval2")~.
mixedmod <- mtlr(formula, dat)

[Package MTLR version 0.2.1 Index]