tram-methods {tram}R Documentation

Methods for Stratified Linear Transformation Models

Description

Methods for objects inheriting from class tram

Usage

## S3 method for class 'tram'
as.mlt(object)
## S3 method for class 'tram'
model.frame(formula, ...)
## S3 method for class 'tram'
model.matrix(object, data = object$data, with_baseline = FALSE, ...) 
## S3 method for class 'stram'
model.matrix(object, data = object$data, with_baseline = FALSE, 
       what = c("shifting", "scaling"), ...) 
## S3 method for class 'tram'
coef(object, with_baseline = FALSE, ...) 
## S3 method for class 'Lm'
coef(object, as.lm = FALSE, ...)
## S3 method for class 'Survreg'
coef(object, as.survreg = FALSE, ...)
## S3 method for class 'tram'
vcov(object, with_baseline = FALSE, complete = FALSE, ...) 
## S3 method for class 'tram'
logLik(object, parm = coef(as.mlt(object), fixed = FALSE), ...)
## S3 method for class 'tram'
estfun(x, parm = coef(as.mlt(x), fixed = FALSE), ...)
## S3 method for class 'tram'
predict(object, newdata = model.frame(object), 
        type = c("lp", "trafo", "distribution", "logdistribution", 
             "survivor", "logsurvivor", "density", "logdensity", 
             "hazard", "loghazard", "cumhazard", "logcumhazard", 
             "odds", "logodds", "quantile"), ...) 
## S3 method for class 'stram'
predict(object, newdata = model.frame(object), 
        type = c("lp", "trafo", "distribution", "logdistribution", 
             "survivor", "logsurvivor", "density", "logdensity", 
             "hazard", "loghazard", "cumhazard", "logcumhazard", 
             "odds", "logodds", "quantile"), 
        what = c("shifting", "scaling"), ...)
## S3 method for class 'tram'
plot(x, newdata = model.frame(x), 
     which = c("QQ-PIT", "baseline only", "distribution"), 
     confidence = c("none", "interval", "band"), level = 0.95, 
     K = 50, cheat = K, col = "black", fill = "lightgrey", lwd = 1, ...)
## S3 method for class 'tram'
residuals(object, ...)
## S3 method for class 'tram'
PI(object, newdata = model.frame(object), reference = 0,
                  one2one = FALSE, ...)
## Default S3 method:
PI(object, prob, link = "logistic", ...)
## S3 method for class 'tram'
OVL(object, newdata = model.frame(object), reference = 0,
                  one2one = FALSE, ...)
## Default S3 method:
OVL(object, link = "logistic", ...)
## S3 method for class 'tram'
TV(object, newdata = model.frame(object), reference = 0,
                  one2one = FALSE, ...)
## Default S3 method:
TV(object, link = "logistic", ...)
## S3 method for class 'tram'
L1(object, newdata = model.frame(object), reference = 0,
                  one2one = FALSE, ...)
## Default S3 method:
L1(object, link = "logistic", ...)
## S3 method for class 'tram'
ROC(object, newdata = model.frame(object), reference = 0,
                   prob = 1:99 / 100, one2one = FALSE, ...)
## Default S3 method:
ROC(object, prob = 1:99 / 100, link = "logistic", ...)
## S3 method for class 'ROCtram'
plot(x, lty = 1:ncol(x), col = "black", 
     fill = "lightgrey", lwd = 1, ...) 

Arguments

object, formula, x

a fitted stratified linear transformation model inheriting from class tram. PI also takes a numeric vector in the default method.

data

an optional data frame.

with_baseline

logical, if TRUE all model parameters are returned, otherwise parameters describing the baseline transformation are ignored.

as.lm

logical, return parameters in the lm parameterisation if TRUE.

as.survreg

logical, return parameters in the survreg parameterisation in TRUE.

parm

model parameters, including baseline parameters.

complete

currently ignored

newdata

an optional data frame of new observations.

reference

an optional data frame of reference observations, or a numeric vector of reference values.

type

type of prediction, current options include linear predictors ("lp", of x variables in the formula y | s ~ x), transformation functions ("trafo") or distribution functions on the scale of the cdf ("distribution"), survivor function, density function, log-density function, hazard function, log-hazard function, cumulative hazard function or quantile function.

which

type of plot, either a QQ plot of the probability-integral transformed observations ("QQ-PIT"), of the baseline transformation of the whole distribution.

what

type of model matrix / linear predictor: shifting returns model model matrix / linear predictor for shift term, scaling for the scale term.

confidence

type of uncertainty assessment.

level

confidence level.

K

number of grid points in the response, see plot.ctm.

cheat

reduced number of grid points for the computation of confidence bands, see confband.

col

line color.

fill

fill color.

lwd

line width.

lty

line type.

prob

a numeric vector of probabilities..

link

a character identifying a link function.

one2one

logical, compute the ROC curve (and derived measures) comparing each row in newdata with each row in reference (FALSE, the default), or compare observations rowwise (TRUE).

...

additional arguments to the underlying methods for class mlt, see mlt-methods.

Details

coef can be used to get (and set) model parameters, logLik evaluates the log-likelihood (also for parameters other than the maximum likelihood estimate); vcov returns the estimated variance-covariance matrix (possibly taking cluster into account) and and estfun gives the score contribution by each observation. predict and plot can be used to inspect the model on different scales.

PI computes the probabilistic index (or concordance probability or AUC) for all observations in newdata, relative to reference, ie the probability

P(Y_1 \le Y_0 \mid x_0, x_1)

of observing a smaller value of a randomly sampled observation conditional on x_1 compared to a randomly sampled reference observation, which is conditional on x_0. This is equivalent to the area under the receiver operating curve (ROC). The probability only applies within strata, response-varying coefficients are not allowed.

Under the same setup, OVL gives the overlap coefficient, which is one minus the total variation and one minus half the L_1 distance between the two conditional densities. The overlap coefficient is identical to the Youden index and the Smirnov statistic.

PI and friends also accept an argument conf.level which triggers computation of simultaneous Wald confidence intervals for these measures. Arguments in ... are forwarded to glht.

References

Torsten Hothorn, Lisa Moest, Peter Buehlmann (2018), Most Likely Transformations, Scandinavian Journal of Statistics, 45(1), 110–134, doi:10.1111/sjos.12291.

See Also

mlt-methods, plot.ctm

Examples


    data("BostonHousing2", package = "mlbench")

    ### fit non-normal Box-Cox type linear model with two
    ### baseline functions (for houses near and off Charles River)
    BC_BH_2 <- BoxCox(cmedv | 0 + chas ~ crim + zn + indus + nox + 
                      rm + age + dis + rad + tax + ptratio + b + lstat,
                      data = BostonHousing2)
    logLik(BC_BH_2)

    ### classical likelihood inference
    summary(BC_BH_2)

    ### coefficients of the linear predictor
    coef(BC_BH_2)

    ### plot linear predictor (mean of _transformed_ response) 
    ### vs. observed values
    plot(predict(BC_BH_2, type = "lp"), BostonHousing2$cmedv)

    ### all coefficients
    coef(BC_BH_2, with_baseline = TRUE)

    ### compute predicted median along with 10% and 90% quantile for the first
    ### observations
    predict(BC_BH_2, newdata = BostonHousing2[1:3,], type = "quantile",
            prob = c(.1, .5, .9))

    ### plot the predicted density for these observations
    plot(BC_BH_2, newdata = BostonHousing2[1:3, -1],
         which = "distribution", type = "density", K = 1000)

    ### evaluate the two baseline transformations, with confidence intervals
    nd <- model.frame(BC_BH_2)[1:2, -1]
    nd$chas <- factor(c("0", "1"))
    library("colorspace")
    col <- diverge_hcl(2, h = c(246, 40), c = 96, l = c(65, 90))
    fill <- diverge_hcl(2, h = c(246, 40), c = 96, l = c(65, 90), alpha = .3)
    plot(BC_BH_2, which = "baseline only", newdata = nd, col = col,
         confidence = "interval", fill = fill, lwd = 2,
         xlab = "Median Value", ylab = expression(h[Y]))
    legend("bottomright", lty = 1, col = col, 
            title = "Near Charles River", legend = c("no", "yes"), bty = "n")

    ### cars data; with quantile functions
    plot(dist ~ speed, data = cars)
    m <- Colr(dist ~ speed, data = cars)
    q <- predict(as.mlt(m), newdata = data.frame(speed = s <- 6:25),
                 type = "quantile", prob = c(1, 5, 9) / 10)
    lines(s, q[1,])
    lines(s, q[2,])
    lines(s, q[3,])

    nd <- data.frame(speed = s <- as.double(1:5 * 5))
    
    # Prob(dist at speed s > dist at speed 0)
    # speed 0 is reference, not a good choice here
    PI(m, newdata = nd)

    # Prob(dist at speed s > dist at speed 15)
    lp15 <- c(predict(m, newdata = data.frame(speed = 15)))
    PI(m, newdata = nd, reference = lp15)
    PI(m, newdata = nd, reference = nd[3,,drop = FALSE])

    # Prob(dist at speed s' > dist at speed s)
    PI(m, newdata = nd, reference = nd)
    # essentially:
    lp <- predict(m, newdata = nd)
    PI(object = dist(lp))
    # same, with simultaneous confidence intervals
    PI(m, newdata = nd, reference = nd, conf.level = .95)

    # plot ROC curves + confidence bands
    # compare speed 20 and 25 to speed 15
    plot(ROC(m, newdata = nd[4:5,,drop = FALSE],
             reference = nd[3,,drop = FALSE],
             conf.level = 0.95))

    # Overlap of conditional densities at speed s' and s
    OVL(m, newdata = nd, reference = nd)

    ### ROC analysis (takes too long for CRAN Windows)
    if (require("mlbench") && .Platform$OS.type != "windows") {

        layout(matrix(1:4, nrow = 2))
        data("PimaIndiansDiabetes2", package = "mlbench")
        dia <- sort(unique(PimaIndiansDiabetes2$diabetes))
        nd <- data.frame(diabetes = dia, 
                         age = 29, mass = 32) ### median values

        ### unconditional ROC analysis: glucose tolerance test
        m0 <- Colr(glucose ~ diabetes, data = PimaIndiansDiabetes2)
        # ROC curve + confidence band
        plot(ROC(m0, newdata = nd[2,,drop = FALSE], conf.level = .95)) 
        # Wald interval for AUC
        PI(m0, newdata = nd[2,,drop = FALSE], conf.level = .95)
        # score interval for AUC
        PI(-c(coef(m0), score_test(m0)$conf.int[2:1]))

        ### adjusted ROC analysis for age and mass
        m1 <- Colr(glucose ~ diabetes + age + mass, data = PimaIndiansDiabetes2)
        # ROC curve + confidence band (this is the same for all ages /
        # masses)
        plot(ROC(m1, newdata = nd[2,,drop = FALSE], 
                     reference = nd[1,,drop = FALSE], 
                 conf.level = .95))
        # Wald interval for adjusted AUC
        PI(m1, newdata = nd[2,,drop = FALSE], reference = nd[1,,drop = FALSE], 
           conf.level = .95)
        # Score interval for adjusted AUC
        PI(-c(coef(m1)[1], score_test(m1, names(coef(m1))[1])$conf.int[2:1]))

        ### conditional ROC analysis: AUC regression ~ age + mass
        m2 <- Colr(glucose ~ diabetes * (age + mass), data = PimaIndiansDiabetes2)
        # ROC curve for a person with age = 29 and mass = 32
        plot(ROC(m2, newdata = nd[2,,drop = FALSE], 
                     reference = nd[1,,drop = FALSE], 
                 conf.level = .95))
        # AUC for persons ages 21:81, all with mass = 32
        nd1 <- data.frame(diabetes = nd[1,"diabetes"], age = 21:81, mass = 32)
        nd2 <- data.frame(diabetes = nd[2,"diabetes"], age = 21:81, mass = 32)
        auc <- PI(m2, newdata = nd2, reference = nd1, one2one = TRUE,
                  conf.level = 0.95)
        plot(nd1$age, auc[, "Estimate"], xlab = "Age (in years)", ylab =
             "AUC", ylim = c(0, 1), type = "l")
        lines(nd1$age, auc[, "lwr"], lty = 3)
        lines(nd1$age, auc[, "upr"], lty = 3)
    }

[Package tram version 1.0-4 Index]