FitSingleMod {DivE}

R Documentation

FitSingleMod

Description

Function to fit a model to the diversity values of subsamples of a given sample and its nested samples.

Usage

FitSingleMod(model.list, init.param, param.range,
             main.samp, tot.pop=(100*(DivSampleNum(main.samp,2)[1])),
             numit=10^5, varleft=1e-8, data.default=TRUE,
             subsizes = 6, dssamps = list(), nrf = 1,
             minrarefac=1, NResamples=1000, minplaus=10,
             fitloops=2)

Arguments

`model.list`	the model to fit, written as a function of the form function(x, params) with(as.list(params), FunctionOfParams). Examples are given in the ModelSet data file as part of the DivE package. Passed to the modFit function.
`init.param`	matrix of initial seed model parameters. Each row represents a given parameter set; each column represents a parameter value. Column names must match the parameter names (params) of the model in `model.list`. Examples are given in the ParamSeeds data file as part of the DivE package.
`param.range`	matrix of lower and upper model parameter bounds, passed to the modFit function. The first and second rows correspond to the lower and upper bounds respectively; each column represents a parameter value. Column names must match the parameter names (params) of the model in `model.list`. Examples are given in the ParamRanges data file as part of the DivE package.
`main.samp`	the main sample, either as a 2-column data.frame (species ID, count of species), or a vector of species IDs.
`tot.pop`	total population (integer); default set to 100x the `main.samp` size.
`numit`	control argument passed to optimisation routine; the maximum number of iterations that modFit will perform. See `modFit` for details.
`varleft`	control argument passed to optimisation routine; see `modFit` for details.
`data.default`	if `TRUE`, the function uses the list of vectors of nested rarefaction data (divsubsample objects) generated by the DivSampleNum and divsubsample functions; if `FALSE`, the function uses the user-specified list of nested rarefaction data, `dssamps`.
`subsizes`	either number of subsamples of main.samp (integer), or a vector of subsample lengths. If the former, then the vector of sample lengths will be created using the DivSampleNum function.
`dssamps`	list of user-specified rarefaction data (DivSubsamples objects). The length of each component vector of each object in the list must correspond to the vector of subsample lengths (as defined by the user in `subsizes`).
`nrf`	difference between lengths of successive rarefaction datapoints.
`minrarefac`	minimum rarefaction x-axis value. This argument is not used if a list of DivSubsamples objects is specified in `dssamps`.
`NResamples`	number of resamples used to calculate the rarefaction data. This argument is not used if a list of DivSubsamples objects is specified in `dssamps`. NB: different from the `numit` argument, which is specific to the fitting process.
`minplaus`	lower x-axis bound for plausibility check.
`fitloops`	number of fitting rounds performed for each model. In each round of fitting, the initial seed parameter values for each model will be the fitted parameters of the previous fitting run. This parameter has a significant impact on the computational time. The ‘sweet spot’ is 2.
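
The shapes expected for the model function, `init.param` and `param.range` can be sketched in base R as follows. This is a minimal illustration only: the saturating model `sat_model`, its parameter names `a` and `b`, and all numeric values are hypothetical and are not taken from ModelSet, ParamSeeds or ParamRanges. How the function is wrapped into `model.list` is not shown here.

```r
# Hypothetical two-parameter saturating model (NOT from ModelSet):
# diversity rises with sample size x and levels off towards a.
sat_model <- function(x, params) with(as.list(params), a * x / (b + x))

# Seed parameters: one row per candidate parameter set, one column per
# parameter. Column names must match the names used inside the model.
init_seeds <- matrix(c( 100,  50,
                        500, 200,
                       1000, 800),
                     ncol = 2, byrow = TRUE,
                     dimnames = list(NULL, c("a", "b")))

# Bounds: row 1 holds the lower bounds, row 2 the upper bounds.
bounds <- matrix(c(0,   0,
                   1e6, 1e6),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("lower", "upper"), c("a", "b")))

# Evaluate the model at a few sample sizes using the first seed row.
sat_model(c(10, 100, 1000), init_seeds[1, ])
```

Keeping the column names of both matrices identical to the parameter names in the model is what allows `with(as.list(params), ...)` to resolve `a` and `b`.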

Details

This function fits a single specified model to the diversity values of the subsamples of a set of nested samples. The output is a list of raw fitting results (pre-scoring). Use this function to fit a specific parametric rarefaction curve to a sample and examine its performance, rather than to select the most appropriate model.
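
A call might look like the sketch below. It assumes that the DivE package is installed, that it ships an example sample named `Bact1`, and that the first entries of ModelSet, ParamSeeds and ParamRanges belong together; none of these assumptions is confirmed on this page. The values of `subsizes`, `NResamples`, `numit` and `fitloops` are set far below their defaults purely to keep the run short, so the fit itself will be rough.

```r
# Sketch only: the body runs only when the DivE package is available.
if (requireNamespace("DivE", quietly = TRUE)) {
  library(DivE)
  data(Bact1)        # assumed example sample shipped with DivE
  data(ModelSet)
  data(ParamSeeds)
  data(ParamRanges)

  # Fit the first model with its (assumed) matching seeds and bounds,
  # letting the function generate the nested rarefaction data itself.
  fsm <- FitSingleMod(model.list  = ModelSet[1],
                      init.param  = ParamSeeds[[1]],
                      param.range = ParamRanges[[1]],
                      main.samp   = Bact1,
                      subsizes = 2, NResamples = 10,
                      numit = 2, fitloops = 1)

  print(fsm$param)    # fitted parameters for each nested sample
  print(fsm$discrep)  # goodness-of-fit for each nested sample
}
```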

Value

A list of class FitSingleMod containing the results of the fit of the model to the diversity samples. This includes the following:

`param`	matrix of fitted parameters for each nested sample
`ssr`	sum-of-squared residuals for the fits for each nested sample
`ms`	mean sum-of-squared residuals for the fits for each nested sample
`discrep`	goodness-of-fit values for the fits for each nested sample; this is expressed as the average across the subsamples in each nested sample of all the percentage residuals
`local`	prediction of main sample sizes according to fitted curves for each of the nested samples
`global`	prediction of population diversity at `popsize` according to fitted curves for each of the nested subsamples
`AccuracyToObserved`	vector of percentage errors between the observed diversity of full sample data and the estimated diversity of full sample data from subsamples
`subsamplesizes`	vector of nested subsample sizes
`datapoints`	the list of divsubsample objects used in the fitting. The length of the list is equal to the number of samples
`modelname`	name of the model used
`numparam`	number of parameters in the model
`sampvar`	the mean squared distances between subsample curves, local and global
`mono.local`	matrix of logical values: is the curve monotonically increasing, up to the main sample size?
`mono.global`	matrix of logical values: is the curve monotonically increasing, up to the population size?
`slowing.local`	matrix of logical values: is the rate of increase in the curve slowing (decreasing second derivative), up to the main sample size?
`slowing.global`	matrix of logical values: is the rate of increase in the curve slowing (decreasing second derivative), from `minplaus` to the population size (`popsize`)?
`plausibility`	matrix of logical values: is the curve plausible (i.e. monotonically increasing and with decreasing second derivative)?
`dist.local`	matrix of distances between curves fitted to the nested samples. Distances are calculated as areas between curves bounded by 0 and the main sample size
`dist.global`	similar to dist.local, but with curve upper bound the population size
`local.ref.dist`	distances of nested curves to the curve fitted to the whole sample, with the curves bounded by 0 and the main sample size
`global.ref.dist`	similar to local.ref.dist but with curve upper bound the population size
`popsize`	user defined population size
`the model`	the function corresponding to the user-selected `modelname`
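
To make the `discrep` and plausibility components more concrete, the base R sketch below computes comparable quantities for a made-up curve. It is an illustration under stated assumptions, not the package's implementation: the curve, the data, the use of absolute percentage residuals, and the reading of "slowing" as successively smaller increments on a unit grid are all assumptions.

```r
# Hypothetical fitted curve and observed rarefaction points (illustration only).
fitted_curve <- function(x) 120 * x / (40 + x)
x_obs <- c(10, 20, 40, 80)
y_obs <- c(25, 41, 58, 82)

# Discrepancy: mean absolute percentage residual (assumed form).
y_fit   <- fitted_curve(x_obs)
discrep <- mean(abs(y_obs - y_fit) / y_obs) * 100

# Plausibility on a grid from a lower bound (cf. minplaus) to an upper bound
# (cf. the main sample size or popsize).
grid   <- seq(10, 1000, by = 1)
slopes <- diff(fitted_curve(grid))

is_mono      <- all(slopes > 0)        # curve keeps increasing
is_slowing   <- all(diff(slopes) < 0)  # each increment smaller than the last
is_plausible <- is_mono && is_slowing

c(discrep = discrep, plausible = is_plausible)
```

A curve that dips or accelerates anywhere on the grid would fail one of the two checks and would therefore be flagged as implausible under this reading.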

Author(s)

Daniel J. Laydon, Aaron Sim, Charles R.M. Bangham, Becca Asquith

References

Laydon, D. J., Melamed, A., Sim, A., Gillet, N. A., Sim, K., Darko, S., Kroll, S., Douek, D. C., Price, D., Bangham, C. R. M., Asquith, B., Quantification of HTLV-1 clonality and TCR diversity, PLOS Comput. Biol. 2014

Examples

# See documentation of ScoreSingleMod for examples

[Package DivE version 1.3 Index]