infiniteSum_batches {sumR}R Documentation

Approximates the sum of a positive discrete infinite series with a single maximum using the batches algorithm

Description

A simple method to perform the summation. It adds the values in batches and stops when the accumulated batch is smaller than the desired threshold. There is an implementation purely in R and one in C. The one in R is usually slightly faster due to vectorized computing.

Usage

infiniteSum_batches(
  logFunction,
  parameters = numeric(),
  batch_size = 40,
  epsilon = 1e-15,
  maxIter = 1e+05,
  n0 = 0
)

infiniteSum_batches_C(
  logFunction,
  parameters = numeric(),
  batch_size = 40,
  epsilon = 1e-15,
  maxIter = 1e+05,
  n0 = 0
)

Arguments

logFunction

The function that returns the series value an in the log scale. Can either be an R function or a string indicating one of the pre-coded functions. See precompiled() for a list of available functions. If defined in R, the function's definition must have two arguments. The first argument must be the integer argument equivalent to n in an and the second must be a vector of numeric parameters.

parameters

A numeric vector with parameters used in logFunction. Vectorized summation over various parameter values sets is not implemented. Use apply() or their variants to achieve this.

batch_size

The batch size at which point convergence checking is performed. The algorithm perform at least twice this number of function evaluations. See 'details'.

epsilon

The desired error margin for the approximation. See 'details'.

maxIter

The maximum number of iterations for the approximation. In most cases, this number will not be reached unless it is very small.

n0

The sum will be approximated for the series starting at this value.

Details

The series an must pass the ratio convergence test, meaning that the ratio an+1/an must converge to a number L < 1 when n goes to infinity.

The batches algorithm consists of evaluating the function a fixed number of times for two checkpoints. If the difference between the sum at these checkpoints is smaller than epsilon, the code stops and the later checkpoint sum is returned. Else, continue summing until the next checkpoint. All checkpoints are batch_size long.

This function's efficiency is reliant on the choice of batch_size. If it is set too large, the algorithm overshoots the necessary number of function evaluations too much. If it is set too small, the algorithm will need to process too many partial summations which slows it down. However, if they are well calibrated for the series, they can potentially be very efficient.

Since the batch sizes are known before the calculations are made, function evaluations can be vectorized. This is why there are two functions available. infiniteSum_batches does the calculations at the R level, while infiniteSum_batches_C interfaces the low level C code. However, the C code does not use vectorization since it isn't available on long double precision type, and therefore the R level function should be faster in most cases.

Another difference is that the low level code uses double precision for the calculations. This means that it is less prone to rounding errors. But this also means that the two functions can sometimes require a different number of iterations and function evaluations to reach the stop criteria. This is shown in the examples.

Another requirement in the current installment of this function is that the series must have only a single maximum. This is the case for most discrete probability distributions and marginalization problems. This limitation will be addressed in the future.

Value

A summed-objects() object.

See Also

precompiled() provides a list with precompiled functions that can be used for the summation. infiniteSum() is a more efficient algorithm.

Examples

## Define some function that is known to pass the ratio test.
param = 0.1
funfun <- function(k, p) return(k * log1p(-p[1]))
result <- infiniteSum_batches(funfun, parameters = param)

## This series is easy to verify analytically
TrueSum = -log(param)
TrueSum - result$sum
# Notice that it required 400 function evaluations for the approximation.
result$n

# If we use the C function, it reaches a lower error, but requires more
# iterations
result_C <- infiniteSum_batches_C(funfun, parameters = param)
TrueSum - result_C$sum
result_C$n

## A common problem is finding the normalizing constant for the
## Conway-Maxwell-Poisson distribution. It has already been included
## in the precompiled list of functions.
comp_params = c(lambda = 5, nu = 3)
result <- infiniteSum_batches("COMP", comp_params)
# With a specifically chosen argument value, the summation can be done with
# fewer iterations. But it is usually hard to know the ideal choice for
# applications beforehand
result$n
infiniteSum_batches("COMP", comp_params, batch_size = 11)$n
# A small batch_size ensures a small number of iterations, but slows the
# method due to multiple checking.
infiniteSum_batches("COMP", comp_params, batch_size = 2)$n

[Package sumR version 0.4.15 Index]