compute.stream {TopKLists}    R Documentation

Calculates the point j0 at which the Idata degenerate into noise, applying moderate deviation-based inference

Description

The estimation of $\hat{j}_0$ is achieved via a moderate deviation-based approach. When an estimator, computed from a pilot sample of size $\nu$, exceeds a value $z$, the deviation above $z$ is said to be a moderate deviation if its associated probability is polynomially small as a function of $\nu$, and a large deviation if the probability is exponentially small in $\nu$. The values of $z = z_\nu$ that are associated with moderate deviations are $z_\nu \equiv \bigl(C\,\nu^{-1}\,\log\nu\bigr)^{1/2}$, where $C > \frac{1}{4}$. The null hypothesis that $p_k = \frac{1}{2}$ for $\nu$ consecutive values of $k$, versus the alternative hypothesis that $p_k > \frac{1}{2}$ for at least one of the values of $k$, is rejected when $\hat{p}_j^\pm - \frac{1}{2} > z_\nu$. The probabilities $\hat{p}_j^+$ and $\hat{p}_j^-$ are estimates of $p_j$ computed from the $\nu$ data pairs $I_\ell$ for which $\ell$ lies immediately to the right of $j$, or immediately to the left of $j$, respectively.
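The threshold and rejection rule described above can be sketched in a few lines of R. This is only an illustration: the helper names z_nu and rejects_null are hypothetical and are not part of TopKLists, and the window is taken to be any 0/1 vector of length $\nu$.

```r
# Moderate deviation threshold z_nu = sqrt(C * log(nu) / nu), with C > 1/4.
# (Hypothetical helper, not exported by TopKLists.)
z_nu <- function(nu, C = 0.251) {
  stopifnot(C > 0.25, nu > 1)
  sqrt(C * log(nu) / nu)
}

# Reject H0 (p = 1/2 within the window) when the observed share of 1s
# exceeds 1/2 by more than z_nu. (Hypothetical helper.)
rejects_null <- function(window, C = 0.251) {
  nu <- length(window)
  mean(window) - 0.5 > z_nu(nu, C)
}

z_nu(10)                           # threshold for a pilot sample of size 10
rejects_null(rep(1, 10))           # all ones: far above 1/2
rejects_null(rep(c(0, 1), 5))      # half ones: indistinguishable from noise
```

With C fixed, the threshold shrinks as $\nu$ grows, so larger pilot samples detect smaller departures from $\frac{1}{2}$.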

The iterative algorithm consists of an ordered sequence of "test stages" $s_1, s_2, \ldots$. In stage $s_k$ an integer $J_{s_k}$ is estimated, which is a potential lower bound to $j_0$ (when $k$ is odd) or a potential upper bound to $j_0$ (when $k$ is even).

Usage

compute.stream(Idata, const=0.251, v, r=1.2)

Arguments

Idata

Input data is a vector of 0s and 1s (see prepare.idata)

const

Denotes the constant C of the moderate deviation bound; it must be larger than 0.25 (default is 0.251)

v

Denotes the pilot sample size $\nu$, related to the degree of randomness in the assignments. In each step the noise is estimated from the Idata as the probability of 1 within an interval of size $\nu$, moving from $J_{s_{k-1}} - r\nu$ if $k$ is odd or $J_{s_{k-1}} + r\nu$ if $k$ is even, until convergence or break (see r)

r

Denotes a technical constant determining the starting point from which the probability of $I=1$ is estimated in a window of size v (see v; default is 1.2)
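To make the roles of v, r and const more concrete, here is a minimal sketch of a single rightward scan. It is not the TopKLists implementation: the function forward_scan is hypothetical, the window is assumed to start at j itself, and the scan is assumed to stop at the first j whose window no longer rejects the null hypothesis. The alternation between stages and the convergence checks are deliberately left out.

```r
# Illustrative only: one rightward scan, not the code used by compute.stream.
forward_scan <- function(Idata, start, v, const = 0.251) {
  z <- sqrt(const * log(v) / v)            # moderate deviation threshold
  last <- length(Idata) - v + 1            # last j that still has a full window
  start <- max(1, start)
  if (start > last) return(NA_integer_)
  for (j in seq(start, last)) {
    p_plus <- mean(Idata[j:(j + v - 1)])   # share of 1s in the window at j
    if (p_plus - 0.5 <= z) return(j)       # window looks like noise: stop here
  }
  NA_integer_                              # no noise-like window found
}

# Deterministic toy input: 20 ones followed by an alternating 1/0 tail.
toy <- c(rep(1, 20), rep(c(1, 0), 10))

first_pass <- forward_scan(toy, start = 1, v = 10)

# A later rightward scan would restart r * v positions to the left of a
# previous estimate (here an arbitrary previous value of 25, with r = 1.2).
second_pass <- forward_scan(toy, start = 25 - 1.2 * 10, v = 10)
```

In this toy input both scans stop at the same index, because the restart point still lies inside the run of ones.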

Value

A named list containing:

j0_est

Is the estimated index at which the Idata degenerate into noise

k

k = j0_est - 1

reason.break

The reason why the computation has ended - convergence or break condition

js

Is the sequence of estimated $j_0$ in each iteration run, also showing the convergence behaviour

v

Is the preselected value of the parameter $\nu$

Author(s)

Eva Budinska <budinska@iba.muni.cz>, Michael G. Schimek <michael.schimek@medunigraz.at>

See Also

prepare.idata

Examples

set.seed(465)
myhead <- rbinom(20, 1, 0.8)
mytail <- rbinom(20, 1, 0.5)
mydata <- c(myhead, mytail)
compute.stream(mydata, v=10)

[Package TopKLists version 1.0.8 Index]