predict.pch {pch} | R Documentation |
Prediction from Fitted Piecewise Constant Hazard Models
Description
This function returns predictions for an object of class “pch”, usually the result of a call to pchreg.
Usage
## S3 method for class 'pch'
predict(object, type = c("distr", "quantile", "sim"),
newdata, p, sim.method = c("quantile", "sample"), ...)
Arguments
object |
a “ |
type |
a character string (just the first letter can be used) indicating the type of prediction. See ‘Details’. |
newdata |
optional data frame in which to look for variables with which to predict. It must include all the covariates that enter the model and, if |
p |
vector of quantiles, to be specified if |
sim.method |
a character string (just the first letter can be used) indicating the simulation method if |
... |
for future methods. |
Details
If type = "distr" (the default), this function returns a data frame with columns (haz, Haz, Surv, f) containing the fitted values of the hazard function, the cumulative hazard, the survival function, and the probability density function, respectively.
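To make the relationship between the four columns concrete, here is a minimal base-R sketch of a piecewise constant hazard. It is an illustration only, not the package's internal code: the helper pch_distr and the values of breaks and lambda are hypothetical, and times are assumed to be no smaller than the first break.

```r
# Hypothetical breaks and hazard rates (not taken from any fitted model)
breaks <- c(0, 1, 2, 4)        # interval endpoints
lambda <- c(0.5, 1.0, 0.25)    # constant hazard within each interval

pch_distr <- function(t, breaks, lambda) {
  # Interval containing each t; times beyond the last break are assigned
  # to the last interval, i.e. the hazard is held constant there
  j <- findInterval(t, breaks, rightmost.closed = TRUE)
  j <- pmin(pmax(j, 1), length(lambda))
  # Cumulative hazard accumulated up to the left end of interval j
  H_left <- c(0, cumsum(lambda * diff(breaks)))[j]
  haz  <- lambda[j]
  Haz  <- H_left + haz * (t - breaks[j])
  Surv <- exp(-Haz)
  data.frame(haz = haz, Haz = Haz, Surv = Surv, f = haz * Surv)
}

out <- pch_distr(c(0.5, 1.5, 3), breaks, lambda)
print(out)
```

In this sketch the cumulative hazard is the integral of the step function, the survival is exp(-Haz), and the density is the product haz * Surv.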
If type = "quantile", a data frame with the fitted quantiles (corresponding to the supplied values of p) is returned.
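As a rough illustration of how a quantile can be obtained under a piecewise constant hazard, the sketch below inverts the cumulative hazard in base R. The helper pch_quantile and the values of breaks and lambda are hypothetical, all hazards are assumed to be strictly positive, and the package may compute quantiles differently.

```r
# Hypothetical breaks and hazard rates (not taken from any fitted model)
breaks <- c(0, 1, 2, 4)
lambda <- c(0.5, 1.0, 0.25)

pch_quantile <- function(p, breaks, lambda) {
  # Solve Surv(t) = 1 - p, i.e. Haz(t) = -log(1 - p)
  H_target <- -log(1 - p)
  # Cumulative hazard at each break
  H_knots <- c(0, cumsum(lambda * diff(breaks)))
  # Interval in which the target is reached; beyond the last break the
  # last interval is reused, i.e. the hazard is held constant
  j <- pmin(findInterval(H_target, H_knots), length(lambda))
  breaks[j] + (H_target - H_knots[j]) / lambda[j]
}

q <- pch_quantile(c(0.25, 0.5, 0.75), breaks, lambda)
print(q)
```

Within each interval the cumulative hazard is linear, so the inversion reduces to locating the interval and solving a linear equation.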
If type = "sim", new data are simulated from the fitted model. Two methods are available: with sim.method = "quantile", data are simulated by applying the estimated quantile function to a vector of random uniform numbers; if sim.method = "sample", the quantile function is only used to identify the time interval, and the data are resampled from the observed values in the interval.
The second method only works properly if there is a large number of breaks. However, it is less sensitive to model misspecification and facilitates sampling from distributions with a probability mass or non-compact support. This method is not applicable to interval-censored data.
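The two simulation methods can be sketched in base R as follows. This is an illustration under simplifying assumptions (uncensored data, no covariates, hazards estimated as events divided by exposure in each interval), not the package's implementation; every object name below is hypothetical.

```r
set.seed(1)
y <- rexp(1000, rate = 0.5)            # hypothetical uncensored event times

# 20 intervals with breaks at empirical quantiles, so none is empty
breaks <- c(0, quantile(y, (1:20) / 20, names = FALSE))
k <- length(breaks) - 1
int <- pmin(findInterval(y, breaks, rightmost.closed = TRUE), k)

# Occurrence/exposure estimate of the hazard in each interval
events   <- tabulate(int, nbins = k)
exposure <- sapply(seq_len(k), function(j)
  sum(pmin(pmax(y - breaks[j], 0), breaks[j + 1] - breaks[j])))
lambda  <- events / exposure
H_knots <- c(0, cumsum(lambda * diff(breaks)))

# Uniform draws mapped to a target cumulative hazard and to an interval
u <- runif(500)
H_target <- -log(1 - u)
j <- pmin(findInterval(H_target, H_knots), k)

# "quantile"-style simulation: apply the quantile function to u
sim_quantile <- breaks[j] + (H_target - H_knots[j]) / lambda[j]

# "sample"-style simulation: keep only the interval, then resample
# one observed value from that interval
obs_by_int <- split(seq_along(y), factor(int, levels = seq_len(k)))
sim_sample <- vapply(j, function(i) {
  ix <- obs_by_int[[i]]
  y[ix[sample.int(length(ix), 1)]]
}, numeric(1))
```

In this sketch the resampled values are always observed values of y, whereas the quantile-based values vary continuously within each interval.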
Predictions are computed at newdata, if supplied. Note that newdata must include all the variables that are needed for the prediction, and that if type = "distr", new values of the response variable are also required. If the data are interval-censored between time1 and time2, these will not be used as event times, and newdata must include a variable 'time' at which to compute predictions.
Value
If type = "distr", a four-column data frame with columns (haz, Haz, Surv, f).
If type = "quantile", a named data frame with a column for each value of p.
If type = "sim", a vector of simulated data.
Missing values in the response or the covariates always cause the prediction to be NA.
Note
If the data are right-censored, some high quantiles may not be estimated: beyond the last observable quantile, all types of predictions (including type = "sim" with sim.method = "sample") are computed assuming that the hazard remains constant after the last interval.
Author(s)
Paolo Frumento <paolo.frumento@unipi.it>
See Also
Examples
# using simulated data
library(survival) # provides Surv()
library(pch) # provides pchreg() and this predict method
##### EXAMPLE 1 - Continuous distribution ############################
n <- 1000
x <- runif(n)
time <- rnorm(n, 1 + x, 1 + x) # time-to-event
cens <- rnorm(n,2,2) # censoring variable
y <- pmin(time,cens) # observed variable
d <- (time <= cens) # indicator of the event
model <- pchreg(Surv(y,d) ~ x, breaks = 20)
# predicting hazard, cumulative hazard, survival, density
pred <- predict(model, type = "distr")
plot(pred$Surv, 1 - pnorm(y, 1 + x, 1 + x)); abline(0,1)
# true vs fitted survival
# predicting quartiles
predQ <- predict(model, type = "quantile", p = c(0.25,0.5,0.75))
plot(x,time)
points(x, qnorm(0.5, 1 + x, 1 + x), col = "red") # true median
points(x, predQ$p0.5, col = "green") # fitted median
# simulating new data
tsim1 <- predict(model, type = "sim", sim.method = "quantile")
tsim2 <- predict(model, type = "sim", sim.method = "sample")
qt <- quantile(time, (1:9)/10) # deciles of t
q1 <- quantile(tsim1, (1:9)/10) # deciles of tsim1
q2 <- quantile(tsim2, (1:9)/10) # deciles of tsim2
par(mfrow = c(1,2))
plot(qt,q1, main = "sim.method = 'quantile'"); abline(0,1)
plot(qt,q2, main = "sim.method = 'sample'"); abline(0,1)
# prediction with newdata
predict(model, type = "distr", newdata = data.frame(y = 0, x = 0.5)) # need y!
predict(model, type = "quantile", p = 0.5, newdata = data.frame(x = 0.5))
predict(model, type = "sim", sim.method = "sample", newdata = data.frame(x = c(0,1)))
##### EXAMPLE 2 - non-compact support ############################
# to simulate, sim.method = "sample" is recommended ##############
n <- 1000
t <- c(rnorm(n,-5), rnorm(n,5))
model <- pchreg(Surv(t) ~ 1, breaks = 30)
tsim1 <- predict(model, type = "sim", sim.method = "quantile")
tsim2 <- predict(model, type = "sim", sim.method = "sample")
par(mfrow = c(1,3))
hist(t, main = "true distribution")
hist(tsim1, main = "sim.method = 'quantile'") # the empty spaces are 'filled'
hist(tsim2, main = "sim.method = 'sample'") # perfect!