DiscretizeData {NPHazardRate} | R Documentation
Discretize the available data set
Description
Defines equispaced, disjoint intervals based on the range of the sample and calculates an empirical hazard rate estimate at the center of each interval.
Usage
DiscretizeData(xin, xout)
Arguments
xin | A vector of input values
xout | Grid points where the function will be evaluated
Details
The function defines the subinterval length \Delta = (0.8\max(X_i) - \min(X_i))/N, where N is the sample size. Then, at each bin (subinterval) center, the empirical hazard rate estimate is calculated by
c_i = \frac{f_i}{\Delta(N - F_i + 1)}
where f_i is the frequency of observations in the ith bin and F_i = \sum_{j\leq i} f_j is the cumulative frequency up to and including the ith bin (the unnormalized empirical cumulative distribution estimate).
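The formulas above can be sketched in a few lines of base R. This is a minimal illustration, not the package's implementation: the helper name `sketch_discretize`, the choice to start the bins at \min(X_i), and the choice to keep adding bins of width \Delta until \max(X_i) is covered are all assumptions that this page does not state.

```r
# Hypothetical helper illustrating the Details formulas; NOT the package code.
sketch_discretize <- function(xin) {
  N <- length(xin)                              # sample size
  Delta <- (0.8 * max(xin) - min(xin)) / N      # subinterval length
  stopifnot(Delta > 0)                          # needs 0.8 * max(xin) > min(xin)

  # Assumption: bin i covers [min + (i - 1) * Delta, min + i * Delta)
  idx <- floor((xin - min(xin)) / Delta) + 1    # bin index of each observation
  nbins <- max(idx)                             # enough bins to reach max(xin)

  f <- tabulate(idx, nbins = nbins)             # f_i: frequency in bin i
  Fi <- cumsum(f)                               # F_i: cumulative frequency
  ci <- f / (Delta * (N - Fi + 1))              # empirical hazard estimate c_i
  BinCenters <- min(xin) + (seq_len(nbins) - 0.5) * Delta

  list(BinCenters = BinCenters, ci = ci, Delta = Delta, f = f)
}

# Small worked sample: N = 4, so Delta = (0.8 * 10 - 1) / 4
res <- sketch_discretize(c(1, 2, 3, 10))
res$Delta
res$ci
```

The returned list reuses the component names that the Examples read from the real function (BinCenters, ci, Delta) so the two can be compared side by side; the extra component f is only there to make the bin counts visible.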
Value
As used in the Examples, the returned object has the components BinCenters (the bin centers), ci (the empirical hazard rate estimates at the bin centers) and Delta (the subinterval length).
Examples
x<-seq(0, 5,length=100) #design points where the estimate will be calculated
SampleSize<-100 #amount of data to be generated
ti<- rweibull(SampleSize, .6, 1) # draw a random sample
ui<-rexp(SampleSize, .2) # censoring sample
cat("\n AMOUNT OF CENSORING: ", length(which(ti>ui))/length(ti)*100, "\n")
x1<-pmin(ti,ui) # observed data
cen<-rep.int(1, SampleSize) # initialize censoring indicators (1 = uncensored)
cen[which(ti>ui)]<-0 # 0 marks a censored observation
a.use<-DiscretizeData(ti, x) # discretize the data
BinCenters<-a.use$BinCenters # get the bin centers
ci<-a.use$ci # get empirical hazard rate estimates
Delta<-a.use$Delta # subinterval (bin) length
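
One way to sanity-check the estimates from the example is to compare them with the true hazard rate of the simulated lifetimes. For a Weibull distribution with shape k and scale \lambda the hazard is h(t) = (k/\lambda)(t/\lambda)^{k-1}. The sketch below is an addition to this page, not part of the package: `weibull_hazard` is a hypothetical helper, and it uses only base R so that it runs without NPHazardRate installed.

```r
# Hypothetical helper (not part of NPHazardRate): true Weibull hazard rate
weibull_hazard <- function(t, shape, scale = 1) {
  (shape / scale) * (t / scale)^(shape - 1)
}

# Same parameters as rweibull(SampleSize, .6, 1) in the example.
# The grid starts above 0 because the hazard is unbounded at 0 when shape < 1.
tgrid <- seq(0.1, 5, length.out = 50)
h_true <- weibull_hazard(tgrid, shape = 0.6, scale = 1)

# Cross-check against the definition h(t) = f(t) / (1 - F(t))
h_check <- dweibull(tgrid, 0.6, 1) / pweibull(tgrid, 0.6, 1, lower.tail = FALSE)
all.equal(h_true, h_check)
```

With the objects from the example in the workspace, the comparison could be drawn with plot(BinCenters, ci) followed by lines(tgrid, h_true).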
[Package NPHazardRate version 0.1 Index]