ComputeConfIntervals {curstatCI}

R Documentation

Pointwise Confidence Intervals under Current Status data

Description

The function ComputeConfIntervals computes pointwise confidence intervals for the distribution function under current status data. The confidence intervals are based on the Smoothed Maximum Likelihood Estimator (SMLE) and are constructed using the nonparametric bootstrap.

Usage

ComputeConfIntervals(data, x, alpha, bw)

Arguments

`data`	Dataframe with three variables:
t: Observation points sorted in ascending order. All observations need to be positive. The number of unique observation points equals `length(t)`.
freq1: Number of observations at point t satisfying `x \le t`, i.e. with censoring indicator `\delta = 1`. The total number of such observations equals `sum(freq1)`.
freq2: Total number of observations at point t. The sample size equals `sum(freq2)`. If no tied observations are present in the data, `length(t)` equals `sum(freq2)`.
`x`	numeric vector containing the points where the confidence intervals are computed. All points need to lie strictly inside the observation interval: `t[1] < min(x) \le max(x) < t[length(t)]`.
`alpha`	numeric value between 0 and 1. The pointwise confidence intervals have confidence level `1 - alpha`.
`bw`	numeric vector of size `length(x)`. This vector contains the pointwise bandwidth values for each point in the vector x.
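
As an illustration of the layout described for `data`, the following base R sketch aggregates raw observations with ties into the three variables t, freq1 and freq2. The vectors `t_raw`, `delta_raw` and `x` are made-up values used only for this illustration, and the final check simply restates the constraint on `x` given above.

```r
# Hypothetical raw current status observations with tied observation times
t_raw     <- c(0.3, 0.3, 0.7, 1.2, 1.2, 1.2)
delta_raw <- c(0,   1,   0,   1,   1,   0)

# Unique observation points in ascending order
t_unique <- sort(unique(t_raw))
grp      <- match(t_raw, t_unique)

# freq1: number of observations with delta = 1 at each unique point
# freq2: total number of observations at each unique point
freq1 <- tabulate(grp[delta_raw == 1], nbins = length(t_unique))
freq2 <- tabulate(grp, nbins = length(t_unique))

dat <- data.frame(t = t_unique, freq1 = freq1, freq2 = freq2)

# Points of interest must lie strictly inside the observation interval
x <- c(0.5, 1.0)
stopifnot(dat$t[1] < min(x), max(x) < dat$t[nrow(dat)])
```

Here `sum(dat$freq2)` recovers the sample size of 6, while `length(dat$t)` is 3 because of the ties.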

Details

In the current status model, the variable of interest X with distribution function F is not observed directly. Instead, a censoring variable T is observed together with the indicator \Delta = 1{X \le T}, which records whether X \le T. ComputeConfIntervals computes the pointwise 1-alpha bootstrap confidence intervals around the SMLE of F based on a sample of size n = sum(data$freq2).

The bandwidth parameter vector that minimizes the pointwise Mean Squared Error using the subsampling principle in combination with undersmoothing is returned by the function ComputeBW.

The default method for constructing the confidence intervals in [Groeneboom & Hendrickx (2017)] is based on estimating the asymptotic variance of the SMLE. When the bandwidth is small for some point in x, the variance estimate of the SMLE at this point might not exist. If this happens, the Non-Studentized confidence interval is returned for that point in x.
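
To make the idea of a non-Studentized bootstrap interval concrete, here is a minimal base R sketch. It is not the curstatCI implementation. It assumes a triweight kernel, a single fixed bandwidth `h`, no boundary correction, and the textbook "basic" bootstrap interval; the package's exact construction may differ. The helper names `IK` and `smle` are hypothetical.

```r
# Sketch only: basic (non-Studentized) bootstrap intervals for a
# kernel-smoothed MLE under current status data, in base R.
set.seed(1)
n      <- 200
x_true <- runif(n, 0, 2)              # unobserved variable of interest
t_obs  <- runif(n, 0, 2)              # observation times
delta  <- as.numeric(x_true <= t_obs)

# Integrated triweight kernel: 0 for u <= -1, 1 for u >= 1
IK <- function(u) {
  u <- pmin(pmax(u, -1), 1)
  0.5 + (35 * u - 35 * u^3 + 21 * u^5 - 5 * u^7) / 32
}

# MLE via isotonic regression of delta on ordered t, then smoothed.
# Within tied times, delta is sorted decreasingly so that ties are pooled.
smle <- function(t, delta, x, h) {
  ord  <- order(t, -delta)
  fit  <- isoreg(delta[ord])$yf       # MLE at the ordered observation times
  mass <- diff(c(0, fit))             # jump sizes of the MLE
  sapply(x, function(x0) sum(IK((x0 - t[ord]) / h) * mass))
}

x     <- c(0.5, 1.0, 1.5)             # interior points, so no boundary issues
h     <- 0.5                          # assumed fixed bandwidth
alpha <- 0.05
B     <- 200                          # number of bootstrap samples

est <- smle(t_obs, delta, x, h)

# Resample (t, delta) pairs with replacement; store centred estimates
boot <- replicate(B, {
  idx <- sample.int(n, n, replace = TRUE)
  smle(t_obs[idx], delta[idx], x, h) - est
})

# Basic bootstrap interval: estimate minus upper/lower bootstrap quantiles
q  <- apply(boot, 1, quantile, probs = c(alpha / 2, 1 - alpha / 2))
ci <- cbind(lower = est - q[2, ], upper = est - q[1, ])
```

A Studentized variant would additionally divide each centred bootstrap estimate by an estimate of its standard deviation, which is the step that can fail when the bandwidth is small.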

Value

List with 5 variables:

MLE: Maximum Likelihood Estimator. This is a matrix of dimension (m+1)x2 where m is the number of jump points of the MLE. The first column consists of the point zero and the jump locations of the MLE. The second column contains the value zero and the values of the MLE at the jump points.
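
The two-column layout of MLE can be turned into a step function with `stepfun` from base R. The sketch below uses a made-up matrix `mle` in the documented layout (first row zero, then jump locations and values) and assumes the MLE is right-continuous, taking its new value at each jump point.

```r
# Hypothetical MLE matrix: column 1 = 0 and jump locations,
# column 2 = 0 and the MLE values at those jump locations
mle <- cbind(c(0, 0.4, 1.1, 1.7),
             c(0, 0.2, 0.55, 0.9))

# Right-continuous step function: value mle[i, 2] on [mle[i, 1], mle[i + 1, 1])
F_mle <- stepfun(mle[-1, 1], mle[, 2])

vals <- F_mle(c(0.1, 0.4, 1.0, 1.8))
```

With a real result, `mle` would be replaced by the MLE element of the returned list.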
SMLE: Smoothed Maximum Likelihood Estimator. This is a vector of size length(x) containing the values of the SMLE for each point in the vector x.
CI: pointwise confidence intervals. This is a matrix of dimension length(x) x 2. The first column contains the lower bounds and the second column the upper bounds of the confidence intervals for each point in x.
Studentized: points in x for which Studentized nonparametric bootstrap confidence intervals are computed.
NonStudentized: points in x for which classical nonparametric bootstrap confidence intervals are computed.

References

Groeneboom, P. and Hendrickx, K. (2017). The nonparametric bootstrap for the current status model. Electronic Journal of Statistics 11(2):3446-3484.

Examples

library(Rcpp)
library(curstatCI)

# sample size
n <- 1000

# Uniform data  U(0,2)
set.seed(2)
y <- runif(n,0,2)
t <- runif(n,0,2)
delta <- as.numeric(y <= t)

# columns: t (sorted), freq1 (= delta, since there are no ties), freq2 (= 1)
A <- cbind(t[order(t)], delta[order(t)], rep(1, n))

# x vector
grid <- seq(0.1, 1.9, by = 0.1)

# data-driven bandwidth vector
bw <- ComputeBW(data = A, x = grid)

# pointwise confidence intervals at grid points:
out <- ComputeConfIntervals(data = A, x = grid, alpha = 0.05, bw = bw)

# lower and upper bounds of the intervals
left <- out$CI[, 1]
right <- out$CI[, 2]

# SMLE with the pointwise confidence intervals
plot(grid, out$SMLE, type = "l", ylim = c(0, 1), main = "", ylab = "", xlab = "", las = 1)
points(grid, left, col = 4)
points(grid, right, col = 4)
segments(grid, left, grid, right)

[Package curstatCI version 0.1.1 Index]