R: Number-at-risk in raw and weighted data

get.numAtRisk {causalCmprsk}

R Documentation

Number-at-risk in raw and weighted data

Description

Obtains the time-varying number-at-risk statistic in both raw and weighted data.

Usage

get.numAtRisk(df, X, E, A, C = NULL, wtype = "unadj", cens = 0)

Arguments

`df`	a data frame that includes time-to-event `X`, type of event `E`, a treatment indicator `A` and covariates `C`.
`X`	a character string specifying the name of the time-to-event variable in `df`.
`E`	a character string specifying the name of the "event type" variable in `df`.
`A`	a character string specifying the name of the treatment/exposure variable in `df`. `A` is assumed to be a numeric binary indicator with 0/1 values, where `A`=1 denotes the treatment group and `A`=0 the control group.
`C`	a vector of character strings with variable names (potential confounders) in the logistic regression model for Propensity Scores, i.e. P(A=1|C=c). The default value of `C` is NULL, corresponding to `wtype`="unadj", which will estimate treatment effects in the raw (observed) data.
`wtype`	a character string indicating the type of weights that define the target population for which the ATE will be estimated. The default is "unadj", which does not adjust for possible treatment selection bias and does not use propensity score weighting. It can be used, for example, with data from a randomized controlled trial (RCT), where there is no need to emulate baseline randomization. Other possible values are "stab.ATE", "ATE", "ATT", "ATC" and "overlap". See Table 1 from Li, Morgan, and Zaslavsky (2018). "stab.ATE" is defined as P(A=a)/P(A=a|C=c) - see Hernán et al. (2000).
`cens`	an integer value in `E` that corresponds to censoring times recorded in `X`. By default `get.numAtRisk` assumes `cens`=0.
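
As an illustration of what the `wtype` values above mean, the following base R sketch computes the corresponding weights from an estimated propensity score e(c) = P(A=1|C=c), using the standard definitions from Table 1 of Li, Morgan, and Zaslavsky (2018) and the "stab.ATE" definition quoted above. The helper `ps_weights` and the simulated variables are hypothetical and are not part of causalCmprsk; the package's internal computation (for example, any normalization of weights) may differ.

```r
# Hypothetical helper (not a causalCmprsk function): weights for a binary
# treatment `a` (0/1) given estimated propensity scores `e` = P(A=1|C=c).
ps_weights <- function(a, e,
                       wtype = c("stab.ATE", "ATE", "ATT", "ATC", "overlap")) {
  wtype <- match.arg(wtype)
  p1 <- mean(a)  # marginal P(A=1), used only by the stabilized ATE weights
  switch(wtype,
    ATE      = ifelse(a == 1, 1 / e, 1 / (1 - e)),
    stab.ATE = ifelse(a == 1, p1 / e, (1 - p1) / (1 - e)),
    ATT      = ifelse(a == 1, 1, e / (1 - e)),
    ATC      = ifelse(a == 1, (1 - e) / e, 1),
    overlap  = ifelse(a == 1, 1 - e, e))
}

# Simulated data (assumed purely for illustration)
set.seed(1)
n  <- 500
c1 <- runif(n)
c2 <- as.numeric(runif(n) < 0.2)
a  <- rbinom(n, 1, plogis(0.5 + c1 - c2))

# Logistic regression model for the propensity score
ps.fit <- glm(a ~ c1 + c2, family = binomial())
e <- fitted(ps.fit)

w.overlap <- ps_weights(a, e, "overlap")
w.stab    <- ps_weights(a, e, "stab.ATE")
summary(w.overlap)
summary(w.stab)
```

With `wtype`="unadj" no such weights are involved, which is equivalent to giving every subject a weight of 1.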

Value

A list with two fields:

trt.0 a matrix with three columns, time, num and sample, corresponding to the treatment arm with A=0. The results for both weighted and unadjusted number-at-risk are returned in a long-format matrix. The column time is a vector of time points at which we calculate the number-at-risk. The column num is the number-at-risk. The column sample is a factor variable that takes one of two values, "Weighted" or "Unadjusted". The estimated number-at-risk in the weighted sample corresponds to the rows with sample="Weighted".
trt.1 a matrix with three columns, time, num and sample, corresponding to the treatment arm with A=1. The results for both weighted and unadjusted number-at-risk are returned in a long-format matrix. The column time is a vector of time points at which we calculate the number-at-risk. The column num is the number-at-risk. The column sample is a factor variable that takes one of two values, "Weighted" or "Unadjusted". The estimated number-at-risk in the weighted sample corresponds to the rows with sample="Weighted".
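
To make the long format described above concrete, here is a minimal base R sketch for a single treatment arm. It assumes that the number-at-risk at time t is the sum of the weights of subjects with X >= t (all weights equal to 1 in the unadjusted sample) and that it is evaluated at the distinct observed times; both are assumptions for illustration. The helper `num_at_risk` and the toy data are hypothetical, and the actual time grid and computation in `get.numAtRisk` may differ.

```r
# Hypothetical helper (not a causalCmprsk function): number-at-risk at each
# time in `times`, taken here as the sum of weights of subjects with x >= t.
num_at_risk <- function(x, w = rep(1, length(x)), times = sort(unique(x))) {
  data.frame(time = times,
             num  = vapply(times, function(t) sum(w[x >= t]), numeric(1)))
}

# Toy data for one arm: observed times and illustrative weights
x <- c(2, 5, 5, 8)
w <- c(0.5, 1.5, 1.0, 2.0)

raw <- num_at_risk(x)      # unadjusted: every subject counts as 1
wtd <- num_at_risk(x, w)   # weighted: subjects count with their weights

# Stack both results in long format with a `sample` indicator
long <- rbind(cbind(raw, sample = "Unadjusted"),
              cbind(wtd, sample = "Weighted"))
long$sample <- factor(long$sample)
long
```

Rows with sample="Weighted" then hold the weighted counts, and rows with sample="Unadjusted" hold the raw counts, at the same time points.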

Examples

# create a data set
n <- 1000
set.seed(7)
c1 <- runif(n)
c2 <- as.numeric(runif(n)< 0.2)
set.seed(77)
cf.m.T1 <- rweibull(n, shape=1, scale=exp(-(-1 + 2*c1)))
cf.m.T2 <-  rweibull(n, shape=1, scale=exp(-(1 + 1*c2)))
cf.m.T <- pmin( cf.m.T1, cf.m.T2)
cf.m.E <- rep(0, n)
cf.m.E[cf.m.T1<=cf.m.T2] <- 1
cf.m.E[cf.m.T2<cf.m.T1] <- 2
set.seed(77)
cf.s.T1 <- rweibull(n, shape=1, scale=exp(-1*c1 ))
cf.s.T2 <-  rweibull(n, shape=1, scale=exp(-2*c2))
cf.s.T <- pmin( cf.s.T1, cf.s.T2)
cf.s.E <- rep(0, n)
cf.s.E[cf.s.T1<=cf.s.T2] <- 1
cf.s.E[cf.s.T2<cf.s.T1] <- 2
exp.z <- exp(0.5 + 1*c1 - 1*c2)
pr <- exp.z/(1+exp.z)
TRT <- ifelse(runif(n)< pr, 1, 0)
X <- ifelse(TRT==1, cf.m.T, cf.s.T)
E <- ifelse(TRT==1, cf.m.E, cf.s.E)
covs.names <- c("c1", "c2")
data <- data.frame(X=X, E=E, TRT=TRT, c1=c1, c2=c2)

num.atrisk <- get.numAtRisk(data, "X", "E", "TRT", C=covs.names, wtype="overlap", cens=0)
plot(num.atrisk$trt.1$time[num.atrisk$trt.1$sample=="Weighted"],
     num.atrisk$trt.1$num[num.atrisk$trt.1$sample=="Weighted"], col="red", type="s",
     xlab="time", ylab="number at risk",
     main="Number at risk in TRT=1", ylim=c(0, max(num.atrisk$trt.1$num)))
lines(num.atrisk$trt.1$time[num.atrisk$trt.1$sample=="Unadjusted"],
      num.atrisk$trt.1$num[num.atrisk$trt.1$sample=="Unadjusted"], col="blue", type="s")
legend("topright", legend=c("Weighted", "Unadjusted"), lty=1, col=c("red", "blue"))
plot(num.atrisk$trt.0$time[num.atrisk$trt.0$sample=="Weighted"],
     num.atrisk$trt.0$num[num.atrisk$trt.0$sample=="Weighted"], col="red", type="s",
     xlab="time", ylab="number at risk",
     main="Number at risk in TRT=0", ylim=c(0, max(num.atrisk$trt.0$num)))
lines(num.atrisk$trt.0$time[num.atrisk$trt.0$sample=="Unadjusted"],
      num.atrisk$trt.0$num[num.atrisk$trt.0$sample=="Unadjusted"], col="blue", type="s")
legend("topright", legend=c("Weighted", "Unadjusted"), lty=1, col=c("red", "blue"))

[Package causalCmprsk version 2.0.0 Index]
