emplikH.disc {emplik} | R Documentation |
Empirical likelihood ratio for discrete hazard with right censored, left truncated data
Description
Use the empirical likelihood ratio and Wilks' theorem to test the null hypothesis that
\sum_i[f(x_i, \theta) \log(1- dH(x_i))] = K
where H(t) is the (unknown) discrete cumulative hazard function; f(t,\theta) can be any predictable function of t; \theta is the parameter of the function and K is a given constant.
The data can be right censored and left truncated.
When the given constants \theta and/or K are too far away from the NPMLE, no hazard function satisfies this constraint and minus two times the log empirical likelihood ratio (-2LLR) is infinite. In this case the computation will stop.
Usage
emplikH.disc(x, d, y= -Inf, K, fun, tola=.Machine$double.eps^.25, theta)
Arguments
x |
a vector, the observed survival times. |
d |
a vector, the censoring indicators: 1 = uncensored, 0 = censored. |
y |
optional vector, the left truncation times. |
K |
a real number used in the constraint; the weighted sum must equal this value. |
fun |
a left continuous (weight) function used to calculate
the weighted discrete hazard in |
tola |
an optional positive real number specifying the tolerance of the iteration error when solving the nonlinear equation needed in the constrained maximization. |
theta |
a given real number used as the parameter of the
function |
Details
The log likelihood being maximized is the ‘binomial’ empirical likelihood:
\sum_i [D_i \log w_i + (R_i-D_i) \log (1-w_i)]
where w_i = \Delta H(t_i) is the jump of the cumulative hazard function at t_i, D_i is the number of failures observed at t_i, and R_i is the number of subjects at risk at time t_i.
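As a rough illustration of the quantities in this likelihood, the sketch below computes t_i, D_i and R_i in base R for the data used in the Examples section and evaluates the binomial log likelihood. It assumes no left truncation, and binom_loglik is a hypothetical helper written for this illustration, not a function of the emplik package.

```r
# Data taken from the Examples section of this page.
x <- c(1, 2, 3, 4, 5, 6, 5, 4, 3, 4, 1, 2.4, 4.5)
d <- c(1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1)

# Distinct uncensored times t_i: the only places where the hazard jumps.
t_i <- sort(unique(x[d == 1]))

# D_i: failures observed at t_i.
# R_i: subjects at risk at t_i (assumption: no left truncation, so x >= t_i).
D_i <- sapply(t_i, function(t) sum(x == t & d == 1))
R_i <- sapply(t_i, function(t) sum(x >= t))

# Hypothetical helper: binomial empirical log likelihood for jumps w.
# It requires 0 < w < 1, which holds here because no jump equals 1.
binom_loglik <- function(w, D, R) {
  sum(D * log(w) + (R - D) * log(1 - w))
}

# Each term is maximized at w_i = D_i / R_i when no constraint is imposed.
w_hat <- D_i / R_i
binom_loglik(w_hat, D_i, R_i)
```

Any other vector of jumps, for example 0.9 * w_hat, gives a smaller value of binom_loglik.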
For discrete distributions, the jump size of the cumulative hazard at
the last jump is always 1. We have to exclude this jump from the
summation since \log( 1- dH(\cdot)) is not defined there (it would be \log 0).
The constants theta and K must be inside the so-called feasible region for the computation to continue. This is similar to the requirement that in testing the value of the mean, the value must be inside the convex hull of the observations.
The NPMLE values are always feasible, so when the computation stops, try moving theta and K closer to the NPMLE. When the computation stops, the -2LLR should be taken to be infinite.
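One way to find a reasonable starting value for K is to evaluate the constraint at the NPMLE. The sketch below does this in base R for the data and weight function of the Examples section. It assumes no left truncation and uses the jumps D_i / R_i as the NPMLE of the hazard; npmle_K is a hypothetical helper, not part of emplik.

```r
# Data and weight function taken from the Examples section of this page.
x <- c(1, 2, 3, 4, 5, 6, 5, 4, 3, 4, 1, 2.4, 4.5)
d <- c(1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1)
fun4 <- function(x, theta) { as.numeric(x <= theta) }

# Jump locations and NPMLE jump sizes (assumption: no left truncation).
t_i <- sort(unique(x[d == 1]))
D_i <- sapply(t_i, function(t) sum(x == t & d == 1))
R_i <- sapply(t_i, function(t) sum(x >= t))
w_hat <- D_i / R_i

# A jump of size 1 (possible only at the last jump) is excluded,
# because log(1 - 1) is not defined.
keep <- w_hat < 1

# Hypothetical helper: value of sum_i f(t_i, theta) * log(1 - w_i) at the NPMLE.
npmle_K <- function(theta) {
  sum(fun4(t_i[keep], theta) * log(1 - w_hat[keep]))
}

npmle_K(4)   # equals log(72/130), roughly -0.59
```

With theta = 4, the value K = -0.7 used in the Examples section is close to this NPMLE value, which is consistent with the small -2LLR reported there.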
Even if you do not need theta in the definition of the function f, you still need to formally define your fun function with a theta input, just to match the arguments.
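A minimal sketch of such a function is given below. The name fun_fixed and the cutoff 4 are illustrative choices, not anything defined by emplik; the only point is that theta appears in the argument list although the body ignores it, and that the function accepts a vector x.

```r
# Hypothetical weight function with a fixed cutoff: theta is accepted
# only to match the expected fun(x, theta) signature and is never used.
fun_fixed <- function(x, theta) {
  as.numeric(x <= 4)
}

# Returns one weight per element of x, whatever theta is.
fun_fixed(c(1, 4, 6), theta = 0)
fun_fixed(c(1, 4, 6), theta = 100)
```

When calling emplikH.disc with a function like this, a value for theta still has to be supplied, since theta has no default in the Usage shown above.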
Value
A list with the following components:
times |
the location of the hazard jumps. |
wts |
the jump size of hazard function at those locations. |
lambda |
the final value of the Lagrange multiplier. |
"-2LLR" |
The discrete -2Log Likelihood ratio. |
Pval |
P-value |
niters |
number of iterations used |
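The component name "-2LLR" is not a syntactic R name, so it has to be extracted with [[ ]] or backticks. The sketch below uses a stand-in list rather than a real emplikH.disc result (the number is the one quoted in the Examples section) and shows how a P-value follows from Wilks' theorem, assuming the single constraint yields a chi-square reference distribution with 1 degree of freedom.

```r
# Stand-in for the list returned by emplikH.disc(); only the component
# needed here is included, with the value quoted in the Examples section.
res <- list("-2LLR" = 0.1446316)

# Non-syntactic name: use [[ ]] (res$`-2LLR` works as well).
stat <- res[["-2LLR"]]

# Assumption: one constraint, so the reference distribution is chi-square(1).
p_val <- 1 - pchisq(stat, df = 1)
p_val
```

A large P-value such as this one means the hypothesized K is not rejected at the usual levels.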
Author(s)
Mai Zhou
References
Fang, H. (2000). Binomial Empirical Likelihood Ratio Method in Survival Analysis. Ph.D. Thesis, Univ. of Kentucky, Dept of Statistics.
Zhou, M. and Fang, H. (2001). Empirical likelihood ratio for 2 sample problem for censored data. Tech Report, Univ. of Kentucky, Dept of Statistics.
Zhou, M. and Fang, H. (2006). A comparison of Poisson and binomial empirical likelihood. Tech Report, Univ. of Kentucky, Dept of Statistics
Examples
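library(emplik)
# fun4 is the indicator of x <= theta; it accepts a vector x
fun4 <- function(x, theta) { as.numeric(x <= theta) }
x <- c(1, 2, 3, 4, 5, 6, 5, 4, 3, 4, 1, 2.4, 4.5)
d <- c(1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1)
# test if -H(4) = -0.7, i.e. the sum of log(1 - dH(x_i)) over x_i <= 4 is -0.7
emplikH.disc(x = x, d = d, K = -0.7, fun = fun4, theta = 4)
# we should get "-2LLR" 0.1446316 etc....
# the same test, now with left truncation times y
y <- c(-2, -2, -2, 1.5, -1)
emplikH.disc(x = x, d = d, y = y, K = -0.7, fun = fun4, theta = 4)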