R: Calculates point of degeneration j0 into noise of the Idata,...

compute.stream {TopKLists}

R Documentation

Calculates point of degeneration j0 into noise of the Idata, applying moderate deviation-based inference

Description

The estimate \hat{j}_0 is obtained via a moderate deviation-based approach. Consider the probability that an estimator, computed from a pilot sample of size \nu, exceeds a value z. The deviation above z is said to be a moderate deviation if this probability is polynomially small as a function of \nu, and a large deviation if it is exponentially small in \nu. The values z=z_\nu associated with moderate deviations are z_\nu\equiv\bigl(C\,\nu^{-1}\,\log\nu\bigr)^{1/2}, where C>\frac{1}{4}. The null hypothesis that p_k=\frac{1}{2} for \nu consecutive values of k, versus the alternative hypothesis that p_k>\frac{1}{2} for at least one of these values of k, is rejected when \hat{p}_j^\pm-\frac{1}{2}>z_\nu. The probabilities \hat{p}_j^+ and \hat{p}_j^- are estimates of p_j computed from the \nu data pairs I_\ell for which \ell lies immediately to the right of j, or immediately to the left of j, respectively.
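
To make the rejection rule concrete, the following base R sketch evaluates the threshold z_\nu and applies it to a single window of 0/1 values. The helper names `md_threshold` and `rejects_null` are hypothetical and are not part of TopKLists; the sketch only illustrates the formula above.

```r
# Moderate-deviation threshold z_nu = sqrt(C * log(nu) / nu), with C > 1/4.
# (Hypothetical helper, not a TopKLists function.)
md_threshold <- function(nu, C = 0.251) {
  stopifnot(C > 0.25, nu > 1)
  sqrt(C * log(nu) / nu)
}

# Hypothetical helper: reject H0 (p = 1/2) when the proportion of 1s in the
# window exceeds 1/2 by more than z_nu, where nu is the window length.
rejects_null <- function(window, C = 0.251) {
  nu <- length(window)
  (mean(window) - 0.5) > md_threshold(nu, C)
}

z10 <- md_threshold(10)                               # threshold for nu = 10
print(z10)
print(rejects_null(c(1, 1, 1, 0, 1, 1, 1, 1, 0, 1)))  # window with 8 ones out of 10
```

Because z_\nu shrinks as \nu grows, a larger pilot sample lets a smaller excess over 1/2 count as evidence against pure noise.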

The iterative algorithm consists of an ordered sequence of "test stages" s_1, s_2, \ldots. In stage s_k an integer J_{s_k} is estimated, which is a potential lower bound for j_0 (when k is odd) or a potential upper bound for j_0 (when k is even).
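
A single stage can be pictured as a scan over windows of size \nu. The sketch below is a simplified illustration of one forward scan only, not the TopKLists implementation: the function name `forward_stage`, the choice to start the window at j itself, and the stopping rule (return the first index whose window no longer rejects p = 1/2) are all assumptions made for the example.

```r
# Hypothetical sketch of one forward stage; NOT the TopKLists source code.
forward_stage <- function(Idata, v, start = 1, const = 0.251) {
  z <- sqrt(const * log(v) / v)           # moderate-deviation threshold z_nu
  last <- length(Idata) - v + 1           # last index with a full window of size v
  if (start > last) return(NA_integer_)   # not enough data for a single window
  for (j in start:last) {
    p_plus <- mean(Idata[j:(j + v - 1)])  # proportion of 1s in the v values from j on
    if (p_plus - 0.5 <= z) return(j)      # null hypothesis p = 1/2 no longer rejected
  }
  NA_integer_                             # every window rejected the null
}

# Deterministic toy input: 15 ones (signal), then an alternating 0/1 tail.
toy <- c(rep(1, 15), rep(c(0, 1), 10))
print(forward_stage(toy, v = 10))
```

A backward scan for an upper bound would mirror this with windows to the left of j. The full procedure alternates such stages until the estimates stop changing or a break condition is met.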

Usage

compute.stream(Idata, const=0.251, v, r=1.2)

Arguments

`Idata`	Input data: a vector of 0s and 1s (see `prepare.idata`)
`const`	The constant C of the moderate deviation bound; must be larger than 0.25 (default is 0.251)
`v`	The pilot sample size `\nu`, related to the degree of randomness in the assignments. In each step the noise is estimated from the Idata as the probability of 1 within an interval of size `\nu`, moving from `J_{s_{k-1}} - r\nu` if `k` is odd or `J_{s_{k-1}} + r\nu` if `k` is even, until convergence or break (see `r`)
`r`	A technical constant determining the starting point from which the probability of `I=1` is estimated in a window of size `v` (see `v`; default is 1.2)
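
As a small worked example of how `v` and `r` interact, the following lines compute the two possible starting points for a hypothetical previous estimate. The value `J_prev`, the data length `n`, and the rounding and clamping to the valid index range are assumptions for illustration; the documentation does not state how the package handles these details.

```r
v <- 10        # pilot sample size nu
r <- 1.2       # default value of the technical constant
J_prev <- 20   # hypothetical previous estimate J_{s_{k-1}}
n <- 40        # hypothetical length of Idata

# Starting point for an odd stage: J_{s_{k-1}} - r * nu, kept inside 1..n.
start_odd <- max(1, round(J_prev - r * v))
# Starting point for an even stage: J_{s_{k-1}} + r * nu, kept inside 1..n.
start_even <- min(n, round(J_prev + r * v))

print(c(odd = start_odd, even = start_even))
```

With `r` greater than 1, each stage starts more than one window length away from the previous estimate.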

Value

A named list containing:

`j0_est`	The estimated index at which the `Idata` degenerate into noise
`k`	`k=j0_est-1`
`reason.break`	The reason why the computation ended: convergence or a break condition
`js`	The sequence of estimates of `j_0` over the iterations, which also shows the convergence behaviour
`v`	The preselected value of the parameter `\nu`

Author(s)

Eva Budinska <budinska@iba.muni.cz>, Michael G. Schimek <michael.schimek@medunigraz.at>

Examples

set.seed(465)
myhead <- rbinom(20, 1, 0.8)
mytail <- rbinom(20, 1, 0.5)
mydata <- c(myhead, mytail)
compute.stream(mydata, v=10)
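
The returned list can be inspected by name. The following sketch repeats the example above and prints the components listed under Value; it assumes the TopKLists package is installed and is skipped otherwise. No output is shown here because it depends on the installed package version.

```r
res <- NULL
# Run only if TopKLists is available; the guard is part of this sketch.
if (requireNamespace("TopKLists", quietly = TRUE)) {
  set.seed(465)
  myhead <- rbinom(20, 1, 0.8)   # informative part: 1 with probability 0.8
  mytail <- rbinom(20, 1, 0.5)   # noise part: 1 with probability 0.5
  mydata <- c(myhead, mytail)

  res <- TopKLists::compute.stream(mydata, v = 10)
  print(res$j0_est)        # estimated point of degeneration into noise
  print(res$k)             # documented as j0_est - 1
  print(res$reason.break)  # why the iteration stopped
  print(res$js)            # estimates across iterations
}
```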

[Package TopKLists version 1.0.8 Index]