FitSingleMod {DivE}    R Documentation

FitSingleMod

Description

Function to fit a model to the diversity values of subsamples of a given sample and its nested samples.

Usage

FitSingleMod(model.list, init.param, param.range,
             main.samp, tot.pop=(100*(DivSampleNum(main.samp,2)[1])),
             numit=10^5, varleft=1e-8, data.default=TRUE,
             subsizes = 6, dssamps = list(), nrf = 1,
             minrarefac=1, NResamples=1000, minplaus=10,
             fitloops=2)

Arguments

model.list

model; written as a function: function(x, params) with(as.list(params), FunctionOfParams). Examples are given in the ModelSet data file as part of the DivE package. Used in the modFit function.

init.param

matrix of initial seed model parameters. Each row represents a given parameter set; each column represents a parameter value. Column names must match the parameter names (params) of the model in model.list. Examples are given in the ParamSeeds data file as part of the DivE package.

param.range

matrix of lower and upper model parameter bounds. Used for the modFit function. The first and second rows correspond to the lower and upper bounds respectively; each column represents a parameter value. Column names must match the parameter names (params) of the model in model.list. Examples are given in the ParamRanges data file as part of the DivE package.

main.samp

the main sample, either as a 2-column data.frame (species ID, count of species), or a vector of species IDs.

tot.pop

total population (integer); default set to 100x the main.samp size.

numit

control argument passed to optimisation routine; the maximum number of iterations that modFit will perform. See modFit for details.

varleft

control argument passed to optimisation routine; see modFit for details.

data.default

if TRUE, the function uses the list of vectors of nested rarefaction data (divsubsample objects) generated by the DivSampleNum and divsubsample functions; if FALSE, the function uses the user-specified list of nested rarefaction data, dssamps.

subsizes

either number of subsamples of main.samp (integer), or a vector of subsample lengths. If the former, then the vector of sample lengths will be created using the DivSampleNum function.

dssamps

list of user-specified rarefaction data (DivSubsamples objects). The length of each component vector of each object in the list must correspond to the vector of subsample lengths (as defined by the user in subsizes).

nrf

difference between lengths of successive rarefaction datapoints.

minrarefac

minimum rarefaction x-axis value. This argument is not used if a list of DivSubsamples objects is specified in dssamps.

NResamples

number of resamples used to calculate the rarefaction data. This argument is not used if a list of DivSubsamples objects is specified in dssamps. NB: different from the numit argument, which is specific to the fitting process.

minplaus

lower x-axis bound for plausibility check.

fitloops

number of fitting rounds performed for each model. In each round of fitting, the initial seed parameter values for each model will be the fitted parameters of the previous fitting run. This parameter has a significant impact on the computational time. The ‘sweet spot’ is 2.
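
The argument formats described above can be illustrated with a minimal base R sketch. The saturating model and the parameter names a and b are hypothetical choices made for illustration; they are not taken from ModelSet, ParamSeeds or ParamRanges, and naming the element of model.list is likewise an assumption.

```r
# Hypothetical two-parameter saturating model; 'a' and 'b' are illustrative
# names, not taken from ModelSet.
sat_model <- function(x, params) with(as.list(params), a * x / (b + x))
model.list <- list(sat_model = sat_model)

# Two seed parameter sets (one per row); column names match the model's
# parameter names.
init.param <- matrix(c(100, 50,
                       200, 80),
                     nrow = 2, byrow = TRUE,
                     dimnames = list(NULL, c("a", "b")))

# Row 1 holds the lower bounds, row 2 the upper bounds.
param.range <- matrix(c(0,   0,
                        1e4, 1e4),
                      nrow = 2, byrow = TRUE,
                      dimnames = list(NULL, c("a", "b")))

# main.samp given as a vector of species IDs, and the equivalent
# 2-column data.frame (species ID, count of species).
ids <- c("sp1", "sp1", "sp2", "sp3", "sp3", "sp3")
counts <- table(ids)
main.samp <- data.frame(species = names(counts),
                        count = as.integer(counts),
                        stringsAsFactors = FALSE)

# Evaluate the model at the first seed parameter set.
seed_curve <- sat_model(c(10, 100), init.param[1, ])
print(seed_curve)
```

Because the model reads its parameters by name through as.list(params), a mismatch between the matrix column names and the names used inside the model body would surface as an "object not found" error when the model is evaluated.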

Details

This function fits a single specified model to the diversity values of the subsamples of a set of nested samples. The output is a list of raw fitting results (pre-scoring). Use this function to fit a specific parametric rarefaction curve to a sample (rather than to select the most appropriate model) and to examine its performance.
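
As a rough illustration of the idea, the sketch below builds rarefaction data for a simulated sample and fits a parametric curve to it by least squares. It is not the package's implementation: it uses base R optim in place of modFit, and the simulated sample, the rarefy helper and the saturating model are all hypothetical.

```r
set.seed(1)

# Hypothetical sample: 500 individuals from 60 species with uneven abundances.
samp <- sample(paste0("sp", 1:60), 500, replace = TRUE, prob = (60:1)^2)

# Mean number of distinct species in subsamples of size n, drawn without
# replacement (a simple stand-in for the package's rarefaction step).
rarefy <- function(ids, n, n_resamples = 50) {
  mean(replicate(n_resamples, length(unique(sample(ids, n)))))
}

x <- seq(10, 500, by = 10)                      # subsample sizes
y <- vapply(x, function(n) rarefy(samp, n), numeric(1))

# Hypothetical saturating model in the function(x, params) form.
model <- function(x, params) with(as.list(params), a * x / (b + x))

# Sum of squared residuals between rarefaction data and model curve.
ssr <- function(p) {
  names(p) <- c("a", "b")
  sum((y - model(x, p))^2)
}

start <- c(a = 50, b = 100)
fit <- optim(start, ssr, method = "L-BFGS-B",
             lower = c(0, 0), upper = c(1e4, 1e4))

print(fit$par)    # fitted parameters
print(fit$value)  # sum of squared residuals at the optimum
```

Feeding fit$par back in as the starting values of a second optim call mirrors what the fitloops argument describes: each round is seeded with the fitted parameters of the previous one.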

Value

A list of class FitSingleMod containing the results of the fit of the model to the diversity samples. This includes the following:

param

matrix of fitted parameters for each nested sample

ssr

sum-of-squared residuals for the fits for each nested sample

ms

mean sum-of-squared residuals for the fits for each nested sample

discrep

goodness-of-fit values for the fits for each nested sample; this is expressed as the average, across the subsamples in each nested sample, of all the percentage residuals

local

prediction of main sample sizes according to fitted curves for each of the nested samples

global

prediction of population diversity at popsize according to fitted curves for each of the nested subsamples

AccuracyToObserved

vector of percentage errors between the observed diversity of full sample data and the estimated diversity of full sample data from subsamples

subsamplesizes

vector of nested subsample sizes

datapoints

the list of divsubsample objects used in the fitting. The length of the list is equal to the number of nested samples

modelname

name of the model used

numparam

number of parameters in the model

sampvar

the mean squared distances between subsample curves, local and global

mono.local

matrix of logical values: is the curve monotonically increasing, up to the main sample size?

mono.global

matrix of logical values: is the curve monotonically increasing, up to the population size?

slowing.local

matrix of logical values: is the rate of increase in the curve slowing (decreasing second derivative), up to the main sample size?

slowing.global

matrix of logical values: is the rate of increase in the curve slowing (decreasing second derivative), from minplaus to the population size (popsize)?

plausibility

matrix of logical values: is the curve plausible (i.e. monotonically increasing and with decreasing second derivative)?

dist.local

matrix of distances between curves fitted to the nested samples. Distances are calculated as areas between curves bounded by 0 and the main sample size

dist.global

similar to dist.local, but with curve upper bound the population size

local.ref.dist

distances of nested curves to the curve fitted to the whole sample, with the curves bounded by 0 and the main sample size

global.ref.dist

similar to local.ref.dist but with curve upper bound the population size

popsize

user defined population size

the model

the function corresponding to the user-selected modelname
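
Several of the returned quantities can be illustrated with a short base R sketch. This is one possible reading of the definitions above, not the package's code: the fitted parameter values are hypothetical, "slowing" is taken to mean that successive increments of the curve shrink, the percentage residual is taken as the absolute residual relative to the observed value, and the distance is taken as the absolute area between two curves.

```r
# Hypothetical saturating model and fitted parameters for two nested samples.
model <- function(x, params) with(as.list(params), a * x / (b + x))
fit_1 <- c(a = 60, b = 40)
fit_2 <- c(a = 65, b = 55)

main_size <- 500     # main sample size (upper bound for "local" checks)
popsize   <- 50000   # population size (upper bound for "global" checks)
minplaus  <- 10      # lower x-axis bound for the plausibility check

# Plausibility on a grid: increasing curve whose increments keep shrinking.
is_plausible <- function(params, from, to, n = 1000) {
  y  <- model(seq(from, to, length.out = n), params)
  d1 <- diff(y)      # finite-difference slope
  d2 <- diff(d1)     # finite-difference change in slope
  all(d1 > 0) && all(d2 < 0)
}

# Mean percentage residual between observed and fitted diversity values.
mean_pct_resid <- function(obs, fitted) mean(abs(obs - fitted) / obs) * 100

# Area between two fitted curves from 0 to an upper bound.
curve_dist <- function(p1, p2, upper) {
  integrate(function(x) abs(model(x, p1) - model(x, p2)), 0, upper)$value
}

plaus_global <- is_plausible(fit_1, minplaus, popsize)
d_local      <- curve_dist(fit_1, fit_2, main_size)
d_global     <- curve_dist(fit_1, fit_2, popsize)
print(c(plausible = plaus_global, dist.local = d_local, dist.global = d_global))
```

Under these assumptions the global distance can never be smaller than the local one, because it integrates the same non-negative quantity over a longer interval.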

Author(s)

Daniel J. Laydon, Aaron Sim, Charles R.M. Bangham, Becca Asquith

References

Laydon, D. J., Melamed, A., Sim, A., Gillet, N. A., Sim, K., Darko, S., Kroll, S., Douek, D. C., Price, D., Bangham, C. R. M., Asquith, B., Quantification of HTLV-1 clonality and TCR diversity, PLOS Comput. Biol. 2014

See Also

ScoreSingleMod

Examples

# See documentation of ScoreSingleMod for examples

[Package DivE version 1.3 Index]