
estinterval {intRvals}

R Documentation

Estimate interval model accounting for missed arrival observations

Description

Estimate interval mean and variance accounting for missed arrival observations, by fitting the probability density function intervalpdf to the interval data.

Usage

estinterval(
  data,
  mu = median(data),
  sigma = sd(data)/2,
  p = 0.2,
  N = 5L,
  fun = "gamma",
  trunc = c(0, Inf),
  fpp = (if (fpp.method == "fixed") 0 else 0.1),
  fpp.method = "auto",
  p.method = "auto",
  conf.level = 0.9,
  group = NA,
  sigma.within = NA,
  iter = 10,
  tol = 0.001,
  silent = F,
  ...
)

Arguments

`data`	A numeric vector of intervals.
`mu`	Start value for the numeric optimization for the mean arrival interval.
`sigma`	Start value for the numeric optimization for the standard deviation of the arrival interval.
`p`	Start value for the numeric optimization for the probability to not observe an arrival.
`N`	Maximum number of missed observations to be taken into account (default N=5).
`fun`	Assumed distribution for the intervals, one of "`normal`" or "`gamma`", corresponding to the Normal and GammaDist distributions.
`trunc`	Use a truncated probability density function with range `trunc`.
`fpp`	Baseline proportion of intervals distributed as a random Poisson process with mean arrival interval `mu`.
`fpp.method`	A string equal to 'fixed' or 'auto'. When 'auto', `fpp` is optimized as a free model parameter, and the supplied `fpp` is used as the start value in the optimization.
`p.method`	A string equal to 'fixed' or 'auto'. When 'auto', `p` is optimized as a free model parameter, and the supplied `p` is used as the start value in the optimization.
`conf.level`	Confidence level for the deviance test that checks whether a model with a nonzero missed event probability `p` significantly outperforms a model without a missed event probability (`p=0`).
`group`	Optional vector of the same length as `data`, indicating the group or subject in which each interval was observed.
`sigma.within`	Optional within-subject standard deviation. When equal to the default 'NA', no additional between-subject effect is assumed, and `sigma.within` equals `sigma`. When equal to 'auto', an estimate is obtained by iteratively calling partition.
`iter`	Maximum number of iterations in the numerical iteration for `sigma.within`.
`tol`	Tolerance in the iteration. When `sigma.within` changes by less than this value in one iteration step, the optimization is considered converged.
`silent`	Logical. When `TRUE`, print no information to the console.
`...`	Additional arguments to be passed to optim.

Details

The probability density function for observed intervals intervalpdf is fit to data by maximization of the associated log-likelihood using optim.
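As a rough illustration of this kind of fit, the sketch below maximizes a log-likelihood with optim on simulated data. It does not call intervalpdf. Instead it assumes a simplified normal mixture in which an observed interval spanning i fundamental intervals has mean i*mu and standard deviation sqrt(i)*sigma, with weight (1-p)*p^(i-1). The helper negloglik, the simulated data, and the parscale settings are assumptions made for this example only.

```r
# Simulate observed intervals: each arrival is missed with probability p_true,
# so an observed interval spans k = (number of misses + 1) fundamental intervals.
set.seed(1)
mu_true <- 100; sigma_true <- 15; p_true <- 0.3
n <- 500
k <- rgeom(n, prob = 1 - p_true) + 1
x <- rnorm(n, mean = k * mu_true, sd = sqrt(k) * sigma_true)

# Negative log-likelihood of the assumed mixture, truncated at N components
# (hypothetical helper, not part of intRvals).
negloglik <- function(par, x, N = 5L) {
  mu <- par[1]; sigma <- par[2]; p <- par[3]
  # Penalize parameter values outside the valid range.
  if (mu <= 0 || sigma <= 0 || p < 0 || p >= 1) return(1e10)
  i <- seq_len(N)
  w <- (1 - p) * p^(i - 1)
  # Rows: observations; columns: number of fundamental intervals spanned.
  comp <- outer(x, i, function(xx, ii) dnorm(xx, mean = ii * mu, sd = sqrt(ii) * sigma))
  dens <- as.vector(comp %*% w)
  # Guard against log(0) when a density underflows.
  -sum(log(pmax(dens, .Machine$double.xmin)))
}

# Start values mirror the defaults listed under Usage.
start <- c(mu = median(x), sigma = sd(x) / 2, p = 0.2)
fit <- optim(start, negloglik, x = x,
             control = list(parscale = c(100, 10, 0.1)))
round(fit$par, 2)
```

The parscale control only rescales the parameters so that the default Nelder-Mead steps are of comparable size for mu, sigma and p; it is a convenience for this sketch and can be dropped or tuned.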

Within-group variation sigma.within may be separated from the total variation sigma in an iterative fit of intervalpdf to the interval data. In each iteration, partition is used to (1) determine which intervals are fundamental intervals according to the fit, at confidence level conf.level, and (2) partition the within-group variation from the total variation in interval length.

Within- and between-group variation is estimated only on the subset of fundamental intervals with repeated measures. Because the set of fundamental intervals depends on the precise value of sigma.within, the fit of intervalpdf and the subsequent estimation of sigma.within using partition are iterated until both converge to a stable solution. Parameters tol and iter set the threshold for convergence and the maximum number of iterations.

We note that an exponential interval model can be fitted by setting fpp=1 and fpp.method="fixed".
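To illustrate what an exponential interval model means in isolation, the sketch below fits the mean of an exponential distribution to simulated intervals by maximum likelihood, using base R only. It is not the package's implementation; the simulated data and the helper negloglik_exp are assumptions for this example. For an exponential model the maximum-likelihood estimate of the mean interval coincides with the sample mean, which the last line lets you compare.

```r
# Simulated intervals from a random Poisson process with mean interval 100.
set.seed(42)
x <- rexp(1000, rate = 1 / 100)

# Negative log-likelihood of an exponential model parameterized by its mean
# (hypothetical helper, not part of intRvals).
negloglik_exp <- function(mu, x) -sum(dexp(x, rate = 1 / mu, log = TRUE))

# One-dimensional minimization over the mean interval.
fit <- optimize(negloglik_exp, interval = c(1, 1000), x = x)
c(mle = fit$minimum, sample_mean = mean(x))
```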

Value

This function returns an object of class intRvals, which is a list containing the following:

data: the interval data
mu: the modelled mean interval
mu.se: the modelled mean interval standard error
sigma: the modelled interval standard deviation
p: the modelled probability to not observe an arrival
fpp: the modelled fraction of arrivals following a random Poisson process, see intervalpdf
N: the highest number of consecutive missed arrivals taken into account, see intervalpdf
convergence: convergence field of optim
counts: counts field of optim
loglik: vector of length 2, with first element the log-likelihood of the fitted model, and second element the log-likelihood of the model without a missed event probability (i.e. p=0)
df.residual: degrees of freedom, a 2-vector (1, number of intervals - n.param)
n.param: number of optimized model parameters
p.chisq: p value for a likelihood-ratio test of a model including a miss probability relative to a model without a miss probability
distribution: assumed interval distribution, one of 'gamma' or 'normal'
trunc: interval range over which the interval pdf was truncated and normalized
fpp.method: A string equal to 'fixed' or 'auto'. When 'auto' fpp has been optimized as a free model parameter
p.method: A string equal to 'fixed' or 'auto'. When 'auto' p has been optimized as a free model parameter
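
The p.chisq element can be related to the two entries of loglik through a standard likelihood-ratio test. The sketch below shows that generic calculation in base R, assuming one extra parameter (p) in the full model, consistent with the first element of df.residual. The log-likelihood values are made-up placeholders, not results from any dataset, and whether estinterval computes p.chisq in exactly this way is an assumption.

```r
# Hypothetical log-likelihoods: fitted model, and model with p = 0.
loglik <- c(-1200, -1210)

# Likelihood-ratio statistic: twice the gain in log-likelihood.
lr_stat <- 2 * (loglik[1] - loglik[2])

# Upper-tail chi-squared probability with 1 degree of freedom
# (one additional free parameter, the miss probability p).
p_chisq <- pchisq(lr_stat, df = 1, lower.tail = FALSE)
p_chisq
```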

Examples

data(goosedrop)
# calculate mean and standard deviation of arrival intervals, accounting for missed observations:
dr=estinterval(goosedrop$interval)
# print some summary information
summary(dr)
# plot a histogram of the intervals and fit:
plot(dr)
# test whether the mean arrival interval is greater than 200 seconds:
ttest(dr,mu=200,alternative="greater")

# let's estimate mean and variance of dropping intervals by site
# (schiermonnikoog vs terschelling) for time period 5.
# first prepare the two datasets:
set1=goosedrop[goosedrop$site=="schiermonnikoog" & goosedrop$period == 5,]
set2=goosedrop[goosedrop$site=="terschelling"  & goosedrop$period == 5,]
# allowing a fraction of intervals to be distributed randomly (fpp.method='auto')
dr1=estinterval(set1$interval,fpp.method='auto')
dr2=estinterval(set2$interval,fpp.method='auto')
# plot the fits:
plot(dr1,xlim=c(0,1000))
plot(dr2,xlim=c(0,1000))
# mean dropping intervals are not significantly different
# at the two sites (at a 0.95 confidence level):
ttest(dr1,dr2)
# now compare this test with a t-test not accounting for unobserved intervals:
t.test(set1$interval,set2$interval)
# not accounting for missed observations leads to a spuriously
# larger difference in means, which also increases
# the apparent statistical significance of the difference between means

[Package intRvals version 1.0.1 Index]