make_newdata {pammtools}    R Documentation
Construct a data frame suitable for prediction
Description
This function provides a flexible interface to create a data set that
can be plugged in as the newdata argument to a suitable predict function
(or similar).
The function is particularly useful in combination with one of the
add_* functions, e.g., add_term, add_hazard, etc.
Usage
make_newdata(x, ...)
## Default S3 method:
make_newdata(x, ...)
## S3 method for class 'ped'
make_newdata(x, ...)
## S3 method for class 'fped'
make_newdata(x, ...)
Arguments
x |
A data frame (or object that inherits from |
... |
Covariate specifications (expressions) that will be evaluated
by looking for variables in |
Details
Depending on the type of variables in x, mean or mode values will be
used for variables not specified in the ellipsis (see also sample_info).
If x is an object that inherits from class ped, make_newdata attempts
to complete the data set sensibly, depending on the variables specified
in the ellipsis. This is especially useful when creating a data set with
different time points, e.g., to calculate survival probabilities over
time (add_surv_prob) or to calculate time-varying covariate effects
(add_term). To do so, the time variable has to be specified in ...,
e.g., tend = seq_range(tend, 20). The problem with this specification is
that not all values produced by seq_range(tend, 20) will be actual
values of tend used at the stage of estimation (and in general, it will
often be tedious to specify exact tend values). make_newdata therefore
finds the correct interval and sets tend to the respective interval
endpoint. For example, if the intervals of the PED object are
(0,1], (1,2], then tend = 1.5 will be set to 2 and the remaining
time-varying information (e.g., offset) completed accordingly.
See examples below.
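The interval lookup described above can be sketched in base R. This is
only an illustration of the idea, not the code pammtools uses
internally: snap_to_tend and its arguments are hypothetical names, and
times outside the observed intervals are assumed to map to NA.

```r
# Hypothetical helper, not part of pammtools: map arbitrary time points to
# the right endpoint of the left-open interval (a, b] that contains them.
snap_to_tend <- function(times, cuts) {
  # cuts: sorted break points, e.g. c(0, 1, 2) for intervals (0,1], (1,2]
  idx <- findInterval(times, cuts, left.open = TRUE)
  # idx == 0: time is not above the first cut
  # idx == length(cuts): time is above the last cut
  idx[idx < 1 | idx >= length(cuts)] <- NA
  # the right endpoint of interval number idx is the next break point
  cuts[idx + 1]
}

snap_to_tend(c(0.5, 1, 1.5, 2), cuts = c(0, 1, 2))
# returns 1 1 2 2
```

With intervals (0,1], (1,2], the value 1.5 falls into the second
interval and is therefore moved to its endpoint 2, matching the example
in the text.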
Examples
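# General functionality
library(dplyr)      # provides %>%, group_by() and slice()
library(pammtools)  # provides make_newdata(), seq_range(), as_ped(), tumor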
tumor %>% make_newdata()
tumor %>% make_newdata(age=c(50))
tumor %>% make_newdata(days=seq_range(days, 3), age=c(50, 55))
tumor %>% make_newdata(days=seq_range(days, 3), status=unique(status), age=c(50, 55))
# mean/mode values of unspecified variables are calculated over the whole data set
tumor %>% make_newdata(sex=unique(sex))
tumor %>% group_by(sex) %>% make_newdata()
# Examples for PED data
ped <- tumor %>% slice(1:3) %>% as_ped(Surv(days, status)~., cut = c(0, 500, 1000))
ped %>% make_newdata(age=c(50, 55))
# if time information is specified, other time variables will be specified
# accordingly and offset calculated correctly
ped %>% make_newdata(tend = c(1000), age = c(50, 55))
ped %>% make_newdata(tend = unique(tend))
ped %>% group_by(sex) %>% make_newdata(tend = unique(tend))
# tend is set to the end point of respective interval:
ped <- tumor %>% as_ped(Surv(days, status)~.)
seq_range(ped$tend, 3)
make_newdata(ped, tend = seq_range(tend, 3))
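The default values for unspecified variables (mean for numeric
variables, mode otherwise, as described in Details) can be illustrated
with a small base R sketch. It is not the pammtools implementation:
summarise_defaults and the toy data are hypothetical, and ties for the
most frequent value are assumed to be resolved by taking the first one.

```r
# Hypothetical helper, not the pammtools implementation: one-row summary
# with the mean of numeric columns and the most frequent value otherwise.
summarise_defaults <- function(df) {
  most_frequent <- function(v) {
    counts <- table(v)
    value <- names(counts)[which.max(counts)]  # first value wins on ties
    # keep factor levels; other non-numeric types are returned as character
    if (is.factor(v)) factor(value, levels = levels(v)) else value
  }
  out <- lapply(df, function(v) {
    if (is.numeric(v)) mean(v, na.rm = TRUE) else most_frequent(v)
  })
  as.data.frame(out, stringsAsFactors = FALSE)
}

# Toy data, invented for illustration only
toy <- data.frame(
  age = c(40, 50, 60),
  sex = factor(c("female", "male", "female"))
)
summarise_defaults(toy)
```

For the toy data this yields a single row with age 50 and sex "female",
which is the kind of row that values passed via ... would then override
or expand.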