R: Prediction from Fitted Piecewise Constant Hazard Models

predict.pch {pch}

R Documentation

Prediction from Fitted Piecewise Constant Hazard Models

Description

This function returns predictions for an object of class “pch”, usually the result of a call to pchreg.

Usage

## S3 method for class 'pch'
predict(object, type = c("distr", "quantile", "sim"), 
   newdata, p, sim.method = c("quantile", "sample"), ...)

Arguments

`object`	a “`pch`” object.
`type`	a character string (just the first letter can be used) indicating the type of prediction. See ‘Details’.
`newdata`	optional data frame in which to look for variables with which to predict. It must include all the covariates that enter the model and, if `type = 'distr'`, also the time variable (see ‘Details’ for additional information on interval-censored data). If `newdata` is omitted, the original data will be used.
`p`	vector of probabilities at which the quantiles are computed, to be specified if `type = "quantile"`.
`sim.method`	a character string (just the first letter can be used) indicating the simulation method if `type = "sim"`. Only `sim.method = 'quantile'` is valid with interval-censored data.
`...`	for future methods.

Details

If type = "distr" (the default), this function returns a data frame with columns (haz, Haz, Surv, f) containing the fitted values of the hazard function, the cumulative hazard, the survival function, and the probability density function, respectively.
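The four columns are linked by the usual survival identities: Surv = exp(-Haz) and f = haz * Surv. The following base R sketch illustrates this for a hypothetical piecewise constant hazard. The names breaks, lambda and pch_distr are illustrative assumptions and are not part of the pch package, and the code is not the package's internal implementation.

```r
# Hypothetical piecewise constant hazard (not taken from a fitted "pch" object)
breaks <- c(0, 1, 2, 4)      # interval boundaries
lambda <- c(0.5, 1.0, 0.25)  # hazard on [0,1), [1,2), [2,4]

pch_distr <- function(t, breaks, lambda) {
  # interval containing each t; all.inside = TRUE clamps to 1..length(lambda)
  j <- findInterval(t, breaks, all.inside = TRUE)
  # cumulative hazard accumulated at the left end of each interval
  H_left <- c(0, cumsum(lambda * diff(breaks)))
  haz  <- lambda[j]
  Haz  <- H_left[j] + haz * (t - breaks[j])
  Surv <- exp(-Haz)
  data.frame(haz = haz, Haz = Haz, Surv = Surv, f = haz * Surv)
}

out <- pch_distr(c(0.5, 1.5, 3), breaks, lambda)
out
```

Because indices are clamped to the last interval, values of t beyond the last break are evaluated with the last hazard held constant.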

If type = "quantile", a data frame with the fitted quantiles (corresponding to the supplied values of p) is returned.

If type = "sim", new data are simulated from the fitted model. Two methods are available. With sim.method = "quantile", data are simulated by applying the estimated quantile function to a vector of random uniform numbers. With sim.method = "sample", the quantile function is only used to identify the time interval, and the data are then resampled from the observed values in that interval. The second method only works properly if the number of breaks is large. However, it is less sensitive to model misspecification and facilitates sampling from distributions with a probability mass or non-compact support. It is not applicable to interval-censored data.

Predictions are computed at newdata, if supplied. Note that newdata must include all the variables that are needed for the prediction, and that if type = "distr", new values of the response variable are also required. If the data are interval-censored between time1 and time2, these variables will not be used as event times, and newdata must include a variable 'time' at which to compute predictions.
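The two simulation strategies can be sketched in base R for a hypothetical piecewise constant hazard without covariates. Everything below (breaks, lambda, pch_quantile, and the stand-in observed sample) is an assumption made for illustration. It mimics the idea described above and is not the code used by predict.pch.

```r
set.seed(1)
breaks <- c(0, 1, 2, 4)                        # hypothetical interval boundaries
lambda <- c(0.5, 1.0, 0.25)                    # hypothetical hazard per interval
H_left <- c(0, cumsum(lambda * diff(breaks)))  # cumulative hazard at each break

# Quantile function: invert Haz(t) = -log(1 - p) within the right interval
pch_quantile <- function(p, breaks, lambda, H_left) {
  h <- -log(1 - p)                                 # target cumulative hazard
  j <- findInterval(h, H_left, all.inside = TRUE)  # interval that contains it
  breaks[j] + (h - H_left[j]) / lambda[j]
}

u <- runif(5000)

# "quantile"-style simulation: quantile function applied to uniform draws
t_quantile <- pch_quantile(u, breaks, lambda, H_left)

# "sample"-style simulation: keep only the interval index, then resample
# observed times that fall in that interval (observed is a stand-in sample)
observed <- pch_quantile(runif(5000), breaks, lambda, H_left)
obs_int  <- findInterval(observed, breaks, all.inside = TRUE)
j <- findInterval(-log(1 - u), H_left, all.inside = TRUE)
t_sample <- vapply(j, function(k) {
  pool <- observed[obs_int == k]
  pool[sample.int(length(pool), 1)]
}, numeric(1))
```

With only three intervals the resampling step is coarse, which is consistent with the remark that the "sample" method needs a large number of breaks.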

Value

If type = "distr", a four-column data frame with columns (haz, Haz, Surv, f). If type = "quantile", a data frame with one named column for each value of p (for example, p0.5). If type = "sim", a vector of simulated data.

The presence of missing values in the response or the covariates will always cause the prediction to be NA.

Note

If the data are right-censored, some high quantiles may not be estimated: beyond the last observable quantile, all types of predictions (including type = "sim" with sim.method = "sample") are computed assuming that the hazard remains constant after the last interval.
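As a rough illustration of this extrapolation, the sketch below computes a quantile that lies beyond the last break by holding the last hazard constant. The values of breaks and lambda are hypothetical, and the calculation is a simplified stand-in for what the package does internally.

```r
breaks <- c(0, 1, 2, 4)               # hypothetical interval boundaries
lambda <- c(0.5, 1.0, 0.25)           # hypothetical hazard per interval

H_end <- sum(lambda * diff(breaks))   # cumulative hazard at the last break
p_max <- 1 - exp(-H_end)              # largest probability reached within the breaks

p <- 0.95                             # exceeds p_max, so extrapolation is needed
# Solve H_end + lambda_last * (q - last break) = -log(1 - p) for q
q <- max(breaks) + (-log(1 - p) - H_end) / lambda[length(lambda)]
```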

Author(s)

Paolo Frumento <paolo.frumento@unipi.it>

Examples


  # using simulated data
  
  ##### EXAMPLE 1 - Continuous distribution ############################
  
  n <- 1000
  x <- runif(n)
  time <- rnorm(n, 1 + x, 1 + x) # time-to-event
  cens <- rnorm(n,2,2) # censoring variable
  y <- pmin(time,cens) # observed variable
  d <- (time <= cens) # indicator of the event
  model <- pchreg(Surv(y,d) ~ x, breaks = 20)

  # predicting hazard, cumulative hazard, survival, density

  pred <- predict(model, type = "distr")
  plot(pred$Surv, 1 - pnorm(y, 1 + x, 1 + x)); abline(0,1) 
  # true vs fitted survival
  
  
  # predicting quartiles

  predQ <- predict(model, type = "quantile", p = c(0.25,0.5,0.75))
  plot(x,time)
  points(x, qnorm(0.5, 1 + x, 1 + x), col = "red") # true median
  points(x, predQ$p0.5, col = "green")             # fitted median
  
  
  # simulating new data
  
  tsim1 <- predict(model, type = "sim", sim.method = "quantile")
  tsim2 <- predict(model, type = "sim", sim.method = "sample")

  qt <- quantile(time, (1:9)/10)  # deciles of time
  q1 <- quantile(tsim1, (1:9)/10) # deciles of tsim1
  q2 <- quantile(tsim2, (1:9)/10) # deciles of tsim2

  par(mfrow = c(1,2))
  plot(qt,q1, main = "sim.method = 'quantile'"); abline(0,1)
  plot(qt,q2, main = "sim.method = 'sample'"); abline(0,1)

  # prediction with newdata
  
  predict(model, type = "distr", newdata = data.frame(y = 0, x = 0.5)) # need y!
  predict(model, type = "quantile", p = 0.5, newdata = data.frame(x = 0.5))
  predict(model, type = "sim", sim.method = "sample", newdata = data.frame(x = c(0,1)))

  ##### EXAMPLE 2 - non-compact support ############################
  # to simulate, sim.method = "sample" is recommended ##############
  
  n <- 1000
  t <- c(rnorm(n,-5), rnorm(n,5)) 
  model <- pchreg(Surv(t) ~ 1, breaks = 30)
  
  tsim1 <- predict(model, type = "sim", sim.method = "quantile")
  tsim2 <- predict(model, type = "sim", sim.method = "sample")
  
  par(mfrow = c(1,3))
  hist(t, main = "true distribution")
  hist(tsim1, main = "sim.method = 'quantile'") # the empty spaces are 'filled'
  hist(tsim2, main = "sim.method = 'sample'")   # perfect!

[Package pch version 2.1 Index]