assess_pb_bias_correction {rubias} | R Documentation |
Test the effects of the parametric bootstrap bias correction on a reference dataset through cross-validation
Description
This is a rewrite of bias_comparison(). Plotting is no longer wrapped up in the function, and a more informative data frame is returned.
Usage
assess_pb_bias_correction(
reference,
gen_start_col,
seed = 5,
nreps = 50,
mixsize = 100,
alle_freq_prior = list(const_scaled = 1)
)
Arguments
reference |
a two-column format genetic dataset, with a "repunit" column specifying each individual's reporting unit of origin, a "collection" column specifying the collection (population or time of sampling), and an "indiv" column providing a unique name for each individual |
gen_start_col |
the first column containing genetic data in the reference dataset |
seed |
the random seed for simulations |
nreps |
The number of reps to do. |
mixsize |
The size of each simulated mixture sample. |
alle_freq_prior |
a one-element named list specifying the prior to be used when
generating Dirichlet parameters for genotype likelihood calculations. Valid methods include
|
Details
Takes a reference two-column genetic dataset, pulls a series of random "mixture" datasets with varying reporting unit proportions from this reference, and compares the results of GSI through standard MCMC vs. parametric-bootstrap MCMC bias correction.
The amount of bias in reporting unit proportion calculations increases with the rate of misassignment between reporting units (decreases with genetic differentiation), and increases as the number of collections within reporting units becomes more uneven.
Output from the standard Bayesian MCMC method demonstrates the level of bias to be expected for the input data set, and parametric bootstrapping is an empirical method for the removal of any existing bias.
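The mixture-drawing step described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the toy `reference` data frame, the flat Dirichlet draw for the true proportions, and sampling without replacement are all hypothetical choices made for the example.

```r
# Hypothetical toy reference (not rubias data): 3 reporting units, 600 individuals
set.seed(5)
reference <- data.frame(
  repunit = rep(c("A", "B", "C"), times = c(300, 200, 100)),
  indiv   = paste0("ind", 1:600),
  stringsAsFactors = FALSE
)

# Assumption: true mixing proportions (rho) come from a flat Dirichlet,
# drawn here as normalized gamma variates
repunits <- unique(reference$repunit)
g <- rgamma(length(repunits), shape = 1)
true_rho <- g / sum(g)

# Split the mixture size among reporting units according to true_rho
mixsize <- 50
n_per_unit <- rmultinom(1, size = mixsize, prob = true_rho)[, 1]

# Pull that many individuals out of each reporting unit, without replacement
picked <- unlist(lapply(seq_along(repunits), function(i) {
  rows <- which(reference$repunit == repunits[i])
  rows[sample.int(length(rows), n_per_unit[i])]
}))

mixture <- reference[picked, ]               # simulated "mixture" sample
remaining_reference <- reference[-picked, ]  # individuals left as the baseline
```

Repeating this draw with a fresh `true_rho` each time gives a series of mixtures whose known proportions can be compared with the estimated ones.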
Value
bias_comparison returns a list; the first element is a list of the relevant rho values generated on each iteration of the random "mixture" creation. This includes the true rho value, the standard result rho_mcmc, and the parametric bootstrapped rho_pb.
The second element is a dataframe listing summary statistics for each reporting unit and estimation method. mse, the mean squared error, summarizes the deviation of the rho estimates from their true value, including both bias and other variance. mean_prop_bias is the average ratio of residual to true value, which gives greater weight to deviations at smaller values. mean_bias is simply the average residual; unlike mse, this demonstrates the direction of the bias.
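As a rough illustration of the three summary statistics, the sketch below computes them for one reporting unit from made-up numbers. The vectors `true_rho` and `rho_est` are hypothetical, and the sign convention (estimate minus true value) is an assumption; the package may define the residual the other way round.

```r
# Hypothetical values for one reporting unit across four simulated mixtures
true_rho <- c(0.10, 0.25, 0.40, 0.50)
rho_est  <- c(0.14, 0.27, 0.41, 0.48)  # stands in for rho_mcmc or rho_pb

# Assumed sign convention: residual = estimate minus true value
residual <- rho_est - true_rho

mse            <- mean(residual^2)         # squared deviations: bias plus variance
mean_prop_bias <- mean(residual / true_rho) # residual relative to the true value
mean_bias      <- mean(residual)            # signed, so it shows direction
```

Because `mean_prop_bias` divides by the true value, the 0.04 overestimate at a true rho of 0.10 contributes far more to it than the 0.02 underestimate at 0.50.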
Examples
## Not run:
## This takes too long to run in R CMD CHECK
ale_bias <- assess_pb_bias_correction(alewife, 17)
## End(Not run)