print.discSurvPredErrDisc {discSurv}R Documentation

Prediction Error Curves

Description

Estimates prediction error curves of arbitrary discrete survival prediction models. In prediction error curves the estimated and observed survival functions are compared adjusted by weights at given timepoints.

Usage

## S3 method for class 'discSurvPredErrDisc'
print(x, ...)

## S3 method for class 'discSurvPredErrDisc'
plot(x, ...)

predErrCurve(
  timepoints,
  estSurvList,
  testTime,
  testEvent,
  trainTime,
  trainEvent
)

Arguments

x

Object of class "discSurvPredErrDisc"("class discSurvPredErrDisc")

...

Additional arguments to S3 methods.

timepoints

Vector of the number of discrete time intervals ("integer vector").

estSurvList

List of persons in the test data ("class list"). Each element contains a estimated survival functions of all given time points ("numeric vector").

testTime

Discrete survival times in the test data ("numeric vector").

testEvent

Univariate event indicator in the test data ("binary vector").

trainTime

Numeric vector of discrete survival times in the training data ("numeric vector").

trainEvent

Integer vector of univariate event indicator in the training data("integer vector").

Details

The prediction error curves should be smaller than 0.25 for all time points, because this is equivalent to a random assignment error.

Value

Author(s)

Thomas Welchowski welchow@imbie.meb.uni-bonn.de

References

Van der Laan MJ, Robins JM (2003). Unified Methods for Censored Longitudinal Data and Causality. Springer Series in Statistics.

Gerds TA, Schumacher M (2006). “Consistent estimation of the expected Brier Score in general survival models with right-censored event times.” Biometrical Journal, 48, 1029-1040.

See Also

gam

Examples


# Example with cross validation and unemployment data 
library(Ecdat)
library(mgcv)
data(UnempDur)
summary(UnempDur$spell)

# Extract subset of data
set.seed(635)
IDsample <- sample(1:dim(UnempDur)[1], 100)
UnempDurSubset <- UnempDur [IDsample, ]
head(UnempDurSubset)
range(UnempDurSubset$spell)

# Generate training and test data
set.seed(7550)
TrainIndices <- sample (x = 1:dim(UnempDurSubset) [1], size = 75)
TrainUnempDur <- UnempDurSubset [TrainIndices, ]
TestUnempDur <- UnempDurSubset [-TrainIndices, ]

# Convert to long format
LongTrain <- dataLong(dataShort = TrainUnempDur, timeColumn = "spell", eventColumn = "censor1")
LongTest <- dataLong(dataShort = TestUnempDur, timeColumn = "spell", eventColumn = "censor1")
# Convert factor to numeric for smoothing
LongTrain$timeInt <- as.numeric(as.character(LongTrain$timeInt))
LongTest$timeInt <- as.numeric(as.character(LongTest$timeInt))

######################################################################
# Estimate a generalized, additive model in discrete survival analysis

gamFit <- gam (formula = y ~ s(timeInt) + age + logwage, data = LongTrain, family = binomial())

# Estimate survival function of each person in the test data
oneMinusPredHaz <- 1 - predict(gamFit, newdata = LongTest, type = "response")
predSurv <- aggregate(oneMinusPredHaz ~ obj, data = LongTest, FUN = cumprod)

# Prediction error in first interval
tryPredErrDisc1 <- predErrCurve (timepoints = 1, 
estSurvList = predSurv [[2]], testTime = TestUnempDur$spell,
testEvent=TestUnempDur$censor1, trainTime = TrainUnempDur$spell,
 trainEvent=TrainUnempDur$censor1)
tryPredErrDisc1

# Prediction error of the 2. to 10. interval
tryPredErrDisc2 <- predErrCurve (timepoints = 2:10,
estSurvList = predSurv [[2]], testTime = TestUnempDur$spell,
testEvent = TestUnempDur$censor1, trainTime = TrainUnempDur$spell,
trainEvent = TrainUnempDur$censor1)
tryPredErrDisc2
plot(tryPredErrDisc2)

########################################
# Fit a random discrete survival forest

library(ranger)
LongTrainRF <- LongTrain
LongTrainRF$y <- factor(LongTrainRF$y)
rfFit <- ranger(formula = y ~ timeInt + age + logwage, data = LongTrainRF,
probability = TRUE)

# Estimate survival function of each person in the test data
oneMinusPredHaz <- 1 - predict(rfFit, data = LongTest)$predictions[, 2]
predSurv <- aggregate(oneMinusPredHaz ~ obj, data = LongTest, FUN = cumprod)

# Prediction error in first interval
tryPredErrDisc1 <- predErrCurve (timepoints = 1, 
estSurvList = predSurv [[2]], testTime = TestUnempDur$spell,
testEvent = TestUnempDur$censor1, trainTime = TrainUnempDur$spell,
 trainEvent = TrainUnempDur$censor1)
tryPredErrDisc1

# Prediction error of the 2. to 10. interval
tryPredErrDisc2 <- predErrCurve (timepoints = 2:10,
estSurvList = predSurv [[2]], testTime = TestUnempDur$spell,
testEvent = TestUnempDur$censor1, trainTime = TrainUnempDur$spell,
trainEvent = TrainUnempDur$censor1)
tryPredErrDisc2
plot(tryPredErrDisc2)


[Package discSurv version 2.0.0 Index]