bce1 {BCE}R Documentation

Bayesian Composition Estimator


This function estimates taxonomic compositions of algal communities based on biomarker field data. More precisely, it estimates the probability distributions of a sample composition based on an input ratio matrix, A that contains prior estimates of biomarker ratios in different taxa, and an input data matrix, B, containing biomarker ratios measured in field samples.

Probability distributions are estimated based on an adaptive metropolis MCMC method, function modMCMC from package FME.


bce1(A, B, Wa=NULL, Wb=NULL,
       jmpType="default", jmpA=.1,jmpX=.1, jmpCovar=NULL,
       initX=NULL, initA=NULL, priorA="normal", minA=NULL, maxA=NULL, 
       var0=NULL, wvar0=1e-6, Xratios=TRUE, verbose=TRUE, ...)



input (group) ratio matrix; can be a matrix or a dataframe


input (field) data matrix; can be a matrix or a dataframe


elementwise weight matrix for A, with the same dimensions as A.


is minimized


elementwise weight matrix for B, with the same dimensions as B.


is minimized


one of "default", "estimate" or "covar"; if default, jmpA and jmpX are the jump lengths. if jmpA or jmpX is a number, then this is the jump length for all elements of A resp. X. If "estimate", the initial jump length is proportional to an estimated covariance matrix for the tlsce fit for A and the lsei fit of X (or Q if Xratios). jmpA and jmpX are then used as rescaling factors for the jump covariance matrix. If "covar", a jump covariance matrix with the correct dimensions, obtained from a previous run, is given as parameter jmpCovar. Covariances can be calculated from the result.


jump length of A: a number or a matrix with dim(A); see details jmpType


jump lenth of X: a number or a matrix with dim(X); see details jmpType


only if jmpType="covar", the covariance matrix to initiate the jumps - see details jmpType


composition matrix used to start the markov chain: default the tlsce solution of Ax=B


ratio matrix used to start the markov chain: default the input ratio matrix A


"normal" (gaussian - default) or "uniform".


minimum values for A


maximum values for A


initial model variance; if 'NULL', then the model variance of tlsce(A,B,...) is used


relative weight of the initial model variance (see modMCMC). Ideally this would be 0 (initial model variance is not taken into account); because wvar0=0 is a special case in modMCMC() (fixed model variance), the default value is set to a small number (wvar0=1e-6)


does the composition matrix contain ratios (TRUE) or estimated biomass concentrations (TRUE) per sample? In the latter case, B must contain the pigment concentrations as measured in the samples (not rescaled)


when TRUE will give more verbose output


arguments to pass on to modMCMC()


The function bce1 searches probability distributions for all elements of a taxonomical composition matrix X and a ratio matrix A for which:

A%*%X ~= B

It does this by returning niter samples for A and X, organized in three-dimensional arrays. The input data matrix B and ratio matrix A should be in the following formats, with the relative concentrations per biomarker organized in columns:

data matrix B:

sample1 sample2 sample3 sample4
marker1 0.14 0.005 0.35 0.033
marker2 0.15 0.004 0.36 0.034
marker3 0.13 0.004 0.31 0.030
marker4 0.13 0.005 0.33 0.031
marker5 0.14 0.008 0.33 0.036
marker6 0.11 0.082 0.34 0.044

and ratio matrix A:

species1 species2 species3 species4
marker1 0.27 0.13 0.35 0.076
marker2 0.084 0 0.5 0.24
marker3 0.195 0.3 0 0.1
marker4 0.06 0 0 0
marker5 0 0 0 0
marker6 0 0 0 0


An object of class bce and _modMCMC_ (returned by the function modMCMC). This object has methods for the generic functions 'summary', 'plot', 'pairs'- see ?modMCMC. It is distinguished from other modMCMC objects by 3 extra attributes that allow to extract matrices A and X from the mcmc result: "dim_A" (dimensions of A), "A_not_null" (which elements of A are not zero and thus included in the mcmc) and Xratios (whether X was rescaled, yes or no).


Producing sensible output:

Markov Chain Monte Carlo simulations are not as straightforward as one might wish; several preliminary runs might be necessary to determine the desired number of iterations, burn-in length and jump length. For all estimated values of Rat and X, their trace (evolution of the values over all iterations) has to display random behaviour; no obvious trends should appear. A few parameters can be tuned to obtain such behaviour:


Karel Van den Meersche <k.vdmeersche@nioo.knaw.nl>, Karline Soetaert <k.soetaert@nioo.knaw.nl>.


Van den Meersche, K., K. Soetaert and J.J. Middelburg (2008) A Bayesian compositional estimator for microbial taxonomy based on biomarkers, Limnology and Oceanography Methods 6, 190-199

See Also

summary.bce, plot.bce, export.bce, pairs.bce



# example using bceInput data
# !!! should be weighted to correspond better to example of BCE!!!
A <- t(bceInput$Rat)
B <- t(bceInput$Dat)

result <- bce1(A,B,niter=1000)

## the number of accepted runs is zero;
## try different starting values

result <- bce1(A,B,niter=1000,initX=matrix(1/ncol(A),ncol(A),ncol(B)))

## number of accepted runs is still low;
## smaller jumps

result <- bce1(A,B,niter=1000,initX=matrix(1/ncol(A),ncol(A),ncol(B)),jmpA=.01,jmpX=.01)
Sum <-summary(result)

## did the algorithm converge?
## no

## more runs, using the output of previous run as input.
result <- bce1(A,B,niter=1e4,jmpA=.01,jmpX=.01,updatecov=1e3,
Sum <-summary(result)

## we inspect the output:
## looks already pretty good; to get a better result, repeat one more
## time with a  longer run. Uncomment the following paragraph and run.
## go get some coffee, this might take a while (~30s).

## result <- bce1(A,B,niter=1e5,jmpA=.01,jmpX=.01,updatecov=1e3,
##                outputlength=1e3,burninlength=.35e5,
##                initX=Sum$lastX,initA=Sum$lastA,
##                jmpCovar=Sum$covar*2.4^2/ncol(result$pars),
##                )
## Sum <-summary(result)
## plot(result$SS,type="l")
## plot(result,ask=TRUE)

# show results as mean with ranges

# plot estimated means and ranges (lbX=lower, ubX=upper bound)
xlim <- range(c(Sum$lbX,Sum$ubX))

# first the mean
         main="Taxonomic composition",
         sub="using bce",pch=16)

# then ranges
nr <- nrow(Sum$meanX)
nc <- ncol(Sum$meanX)

for (i in 1:nr) 
  {ip <-(nr-i)*(nc+2)+1
   cc <- ip : (ip+nc-1)

[Package BCE version 2.1 Index]