R: Natural splines - (cubic splines linear beyond outermost...

Ns {Epi}

R Documentation

Natural splines - (cubic splines linear beyond outermost knots) with convenient specification of knots and possibility of centering, detrending and clamping.

Description

This function is partly for convenient specification of natural splines in practical modeling. The convention used is to take the smallest and the largest of the supplied knots as boundary knots. It also has the option of centering the effects provided at a chosen reference point as well as projecting the columns on the orthogonal space to that spanned by the intercept and the linear effect of the variable, and finally fixing slopes beyond boundary knots (clamping).

Usage

Ns( x, ref = NULL, df = NULL,
                knots = NULL,
            intercept = FALSE,
       Boundary.knots = NULL,
                fixsl = c(FALSE,FALSE),
              detrend = FALSE )

Arguments

`x`	A variable.
`ref`	Scalar. Reference point on the `x`-scale, where the resulting effect will be 0.
`df`	degrees of freedom.
`knots`	knots to be used both as boundary and internal knots. If `Boundary.knots` are given, this will be taken as the set of internal knots.
`intercept`	Should the intercept be included in the resulting basis? Ignored if any of `ref` or `detrend` is given.
`Boundary.knots`	The boundary knots beyond which the spline is linear. Defaults to the minimum and maximum of `knots`.
`fixsl`	Specification of whether slopes beyond outer knots should be fixed to 0. `FALSE` correponds to no restriction; a curve with 0 slope beyond the upper knot is obtained using `c(FALSE,TRUE)`. Ignored if `!(detrend==FALSE)`.
`detrend`	If `TRUE`, the columns of the spline basis will be projected to the orthogonal of `cbind(1,x)`. Optionally `detrend` can be given as a vector of non-negative numbers og length `length(x)`, used to define an inner product as `diag(detrend)` for projection on the orthogonal to `cbind(1,x)`. The default is projection w.r.t. the inner product defined by the identity matrix.

Value

A matrix of dimension c(length(x),df) where either df was supplied or if knots were supplied, df = length(knots) - 1 + intercept. Ns returns a spline basis which is centered at ref. Ns with the argument detrend=TRUE returns a spline basis which is orthogonal to cbind(1,x) with respect to the inner product defined by the positive definite matrix diag(detrend) (an assumption which is checked). Note the latter is data dependent and therefore making predictions with a newdata argument will be senseless.

Note

The need for this function is primarily from analysis of rates in epidemiology and demography, where the dataset are time-split records of follow-up, and the range of data therefore rarely is of any interest (let alone meaningful).

In Poisson modeling of rates based on time-split records one should aim at having the same number of events between knots, rather than the same number of observations.

Author(s)

Bendix Carstensen b@bxc.dk, Lars Jorge D\'iaz, Steno Diabetes Center Copenhagen.

Examples

require(splines)
require(stats)
require(graphics)

ns( women$height, df = 3)
Ns( women$height, knots=c(63,59,71,67) )

# Gives the same results as ns:
summary( lm(weight ~ ns(height, df = 3), data = women) )
summary( lm(weight ~ Ns(height, df = 3), data = women) )

# Get the diabetes data and set up as Lexis object
data(DMlate)
DMlate <- DMlate[sample(1:nrow(DMlate),500),]
dml <- Lexis( entry = list(Per=dodm, Age=dodm-dobth, DMdur=0 ),
               exit = list(Per=dox),
        exit.status = factor(!is.na(dodth),labels=c("DM","Dead")),
               data = DMlate )

# Split follow-up in 1-year age intervals
dms <- splitLexis( dml, time.scale="Age", breaks=0:100 )
summary( dms )

# Model  age-specific rates using Ns with 6 knots
# and period-specific RRs around 2000 with 4 knots
# with the same number of deaths between each pair of knots
n.kn <- 6
( a.kn <- with( subset(dms,lex.Xst=="Dead"),
                quantile( Age+lex.dur, probs=(1:n.kn-0.5)/n.kn ) ) )
n.kn <- 4
( p.kn <- with( subset( dms, lex.Xst=="Dead" ),
                quantile( Per+lex.dur, probs=(1:n.kn-0.5)/n.kn ) ) )
m1 <- glm( lex.Xst=="Dead" ~ Ns( Age, kn=a.kn ) +
                             Ns( Per, kn=p.kn, ref=2000 ),
           offset = log( lex.dur ),
           family = poisson,
             data = dms )

# Plot estimated age-mortality curve for the year 2005 and knots chosen:
nd <- data.frame( Age=seq(40,100,0.1), Per=2005, lex.dur=1000 )
par( mfrow=c(1,2) )
matplot( nd$Age, ci.pred( m1, newdata=nd ),
         type="l", lwd=c(3,1,1), lty=1, col="black", log="y",
         ylab="Mortality rates per 1000 PY", xlab="Age (years)", las=1, ylim=c(1,1000) )
rug( a.kn, lwd=2 )

# Clamped Age effect to the right of rightmost knot.
m1.c <- glm( lex.Xst=="Dead" ~ Ns( Age, kn=a.kn, fixsl=c(FALSE,TRUE) ) +
                               Ns( Per, kn=p.kn, ref=2000 ),
             offset = log( lex.dur ),
             family = poisson,
               data = dms )

# Plot estimated age-mortality curve for the year 2005 and knots chosen.
matplot( nd$Age, ci.pred( m1.c, newdata=nd ),
         type="l", lwd=c(3,1,1), lty=1, col="black", log="y",
         ylab="Mortality rates per 1000 PY", xlab="Age (years)", las=1, ylim=c(1,1000) )
rug( a.kn, lwd=2 )

par( mfrow=c(1,1) )

# Including a linear Age effect of 0.05 to the right of rightmost knot.
m1.l <- glm( lex.Xst=="Dead" ~ Ns( Age, kn=a.kn, fixsl=c(FALSE,TRUE) ) +
                               Ns( Per, kn=p.kn, ref=2000 ),
             offset = log( lex.dur ) + pmax( Age, max( a.kn ) ) * 0.05,
             family = poisson,
               data = dms )

# Plot estimated age-mortality curve for the year 2005 and knots chosen.
nd <- data.frame(Age=40:100,Per=2005,lex.dur=1000)
matplot( nd$Age, ci.pred( m1.l, newdata=nd ),
         type="l", lwd=c(3,1,1), lty=1, col="black", log="y",
         ylab="Mortality rates per 1000 PY", xlab="Age (years)", las=1, ylim=c(1,1000) )
rug( a.kn, lwd=2 )

[Package Epi version 2.53 Index]