R: Approximates the sum of a positive discrete infinite series...

infiniteSum {sumR}

R Documentation

Approximates the sum of a positive discrete infinite series with a single maximum

Description

For series that pass the ratio test, the approximation is analytically guaranteed to have an error that is smaller than epsilon. This can occasionally not happen due to floating point arithmetic.

Usage

infiniteSum(
  logFunction,
  parameters = numeric(),
  logL = NULL,
  alternate = FALSE,
  epsilon = 1e-15,
  maxIter = 1e+05,
  n0 = 0,
  forceAlgorithm = 0
)

Arguments

`logFunction`	The function that returns the series absolute value \|a_n\| in the log scale. If it is an alternating series, this is defined in argument `alternate`. Can either be an `R` function or a string indicating one of the precompiled functions. See `precompiled()` for a list of available functions. If defined in `R`, the function's definition must have two arguments. The first argument must be the integer argument equivalent to `n` in a_n and the second must be a vector of numeric parameters.
`parameters`	A numeric vector with parameters used in `logFunction`. Vectorized summation over various parameter values sets is not implemented. Use `apply()` or their variants to achieve this.
`logL`	The log of the limit value of a_n+1/a_n which must be smaller than 1, or smaller than 0 in the log scale. Ignored if the series is alternating, defined with argument `alternate`. If left as `NULL` and `logFunction` is defined in `R`, the `batches` algorithm with default settings is used. See 'details'.
`alternate`	Either -1, 0 or 1. If 0 (or `FALSE`), the series is not alternating and positive. Otherwise, the series is alternating where the first element's sign is either 1 or -1, as entered in this parameter. If not 0, arguments `logL` and `forceAlgorithm` are ignored.
`epsilon`	The desired error margin for the approximation. See 'details'.
`maxIter`	The maximum number of iterations for the approximation. In most cases, this number will not be reached unless it is very small. A value too high is not recommended as an array of this size is reserved in memory during the algorithm.
`n0`	The sum will be approximated for the series starting at this value for the first parameter of `logFunction`.
`forceAlgorithm`	A value to control which summation algorithm to use. Ignored if the series is alternating, defined with argument `alternate`. See 'details'.

Details

The approximated sum is based on some theoretical results which, analytically, guarantee that the approximation will be within epsilon distance to the true value. It is possible that the numerical result fails to fall in this distance due to floating point arithmetic. The C code under the hood is being continuously reviewed to minimize this problem. They seem to occur more often when the series decays very fast to zero or when the total is a large number.

For these theoretical results to work, the series must pass the ratio test, which means that the ratio a_n+1/a_n must converge to a number L < 1 when n goes to infinity. The log of L should be provided to the function for a better approximation. This is not necessary in case a precompiled function is used. In this case the value of L is coded into the package.

Another requirement in the current installment of this function is that the series must have only a single maximum. This is the case for most discrete probability distributions and marginalization problems. This limitation will be addressed in the future.

There are currently two implemented algorithms that perform the calculations. The first, called Sum-To-Threshold, sums the series values until the series values are smaller than epsilon. This is the fastest algorithm, but it is only guaranteed to provide an approximation within the desired error margin when L < 0.5.

The second algorithm, called Error bounding pairs is based on a more general result which works for any 0 \le L < 1. This algorithm sums the series until

|a_n+1/(1-L) - a_n+1 a_n/(a_n - a_n+1)| < 2 ε.

Then the approximation is the added values of the sum plus

0.5 (a_n+1/(1-L) + a_n+1 a_n/(a_n - a_n+1))

The Error bounding pairs method usually requires less function evaluations than the Sum-To-Threshold one, however the convergence checking is more demanding, which means that it is typically slower, albeit slightly. If L = 0, the convergence checking can be reduced and the Error bounding pairs becomes almost as fast as the Sum-To-Threshold method.

The third algorithm is called batches method and is used when L is left at NULL. Its use requires some fine tuning, so there is a standalone function for it called infiniteSum_batches(). Its use and functionality can be seen in its own documentation. When called as a result of this function, default settings are used.

The forceAlgorithm parameter can be used to control which algorithm to use. When it is 0, the program automatically selects the Sum-to-threshold when L < 0.5 and the Error bounding pairs when 0.5 \le L < 1. Method batches is selected when L is left NULL. If forceAlgorithm is set to 1, the Sum-To-Threshold algorithm is forced. If it is 2, then the Error bounding pairs is forced. A small note, the Error bounding pairs algorithm can go up to maxIter + 1 function evaluations. This is due to its convergence checking dependence on a_n+1. Finally, if the parameter is set as 3, the batches algorithm is used with default settings.

If the series is alternating, the Sum-To-Threshold convergence condition on the series absolute value guarantees the result, regardless of the ratio limit.

The function to be summed can be an R function or a string naming the precompiled function in the package. The list of precompiled functions can be found in precompiled(), and more functions will be added in time. As is intuitive, using a precompiled function is much faster than using an R function. In fact, it has been observed to be dozens times faster.

The advanced user can program their own precompiled functions and use the package's summation algorithms by linking the appropriate header file. See the GitHub readme for the a quick tutorial.

Value

A summed-objects() object.

Examples

## Define some function that is known to pass the ratio test.
param <- 0.1
funfun <- function(k, p) return(k * log1p(-p[1]))
result <- infiniteSum(funfun, parameters = param, logL = log1p(-param))

## This series is easy to verify analytically
TrueSum <- -log(param)
TrueSum - result$sum
# Since exp(logL) = 0.9, the Error bounding pairs
# algorithm is used. Notice that it only required
# 2 function evaluations for the approximation, that is
result$n

## A common problem is finding the normalizing constant for the
## Conway-Maxwell-Poisson distribution. It has already been included
## in the precompiled list of functions.
comp_params <- c(lambda = 5, nu = 3)
result <- infiniteSum("COMP", comp_params)
result

[Package sumR version 0.4.15 Index]