freqpcr {freqpcr} | R Documentation |
Estimate population allele frequency from the set of Cq measurements.
Description
The function estimates the population allele frequency using the dataset of Cq values measured over length(N)
bulk samples, each of which has a sample size of N[i]
as the number of individuals included. N[i]
can be 1, meaning that every individual is analyzed separately. For the ith sample, the four Cq values were measured as housek0[i]
, target0[i]
, housek1[i]
, and target1[i]
. The function can estimate up to five parameters simultaneously when the Cq sets are available for more than two (bulk) samples: P
, K
, targetScale
, sdMeasure
, and EPCR
.
Since v0.3.2, user can also use an experimental ‘continuous model’ by specifying A
instead of N
. That is, each sample DNA is directly extracted from the environment and the sample allele ratio y
follows y ~ Beta(apk, a(1-p)k)
instead of y ~ Beta(mk, (n-m)k), m ~ Binomial(n, p)
, where p
and k
are the population allele frequency and the gamma shape parameter of the individual DNA yield, respectively. Each element of A
, a
is a scaling factor of relative DNA contents between the samples. The continuous model is likely when each sample directly comes from the environment e.g., water sampling in an eDNA analysis or cell culture in a petri dish.
Since v0.4.0, freqpcr()
also works without specifying housek0
and target0
i.e., it can estimate population allele frequency from \Delta
Cq values instead of \Delta\Delta
Cq. In this setting, the sizes of targetScale
and sdMeasure
should be fixed in addition to EPCR
and zeroAmount
.
Usage
freqpcr(
N,
A,
housek0,
target0,
housek1,
target1,
P = NULL,
K = NULL,
targetScale = NULL,
sdMeasure = NULL,
EPCR = 0.99,
XInit0 = c(P = NULL, K = NULL, targetScale = NULL, sdMeasure = NULL, EPCR = NULL),
zeroAmount = NULL,
beta = TRUE,
diploid = FALSE,
pvalue = 0.05,
gradtol = 1e-04,
steptol = 1e-09,
iterlim = 100,
maxtime = 600,
print.level = 1,
...
)
Arguments
N |
Sample sizes as a numeric vector. |
A |
Use instead of |
housek0 |
A numeric vector. In RED- |
target0 |
In RED- |
housek1 |
The Cq values of the test sample measured on the housekeeping gene after the restriction enzyme digestion (in RED- |
target1 |
For each test sample with unknown allele-ratio, |
P |
Scalar. Population allele frequency from which the test samples are derived. Default is |
K |
Scalar. The gamma shape parameter of the individual DNA yield. Default is |
targetScale |
( |
sdMeasure |
( |
EPCR |
( |
XInit0 |
Optionally the initial value for the parameter optimization can be specified, but it is strongly recommended to keep the argument as is. Unlike |
zeroAmount |
(In RED- |
beta |
Whether to use the beta distribution to approximate the sample allele ratio instead of specifying individual gamma distribution for each of the allelic DNA amounts? Default is |
diploid |
Is the target organism diploidy? Default is |
pvalue |
The two-sided confidence interval is calculated at the last iteration at given significance level. Default is 0.05, which returns the 95% Wald's CI (2.5 to 97.5 percentile) based on the Hessian matrix. |
gradtol , steptol , iterlim |
Control parameters passed to |
maxtime |
A positive scalar to set the maximum calculation time in seconds to abort the optimizer (and return error). The total calculation time largely depends on |
print.level |
|
... |
Additional arguments passed to the function. |
Value
Object of the S4 class CqFreq. The slot report
is a matrix and each row contains the estimated parameter value with 100*(1-pvalue)% confidence interval. The following parameters are returned:
P
, the population allele frequency from which the test samples are derived.K
, the gamma shape parameter of the individual DNA yield.targetScale
(\delta_{T}
), the relative template DNA amount of the target to the houskeeping loci.EPCR
(\eta
), the amplification efficiency per PCR cycle.sdMeasure
or "Cq measurement error" (\sigma_{c}
).
Choise of the parameters to be estimated
Estimation is conducted only for parameters where the values are not specified or specified explicitly as NULL
. If one feeds a value e.g. K=1
or sdMeasure=0.24
, it is then treated as fixed parameter. Notwithstanding, EPCR
is estimated only when EPCR = NULL
is specified explicitly.
You must verify the size of EPCR
and zeroAmount
beforehand because they are not estimable from the experiments with unknown allele ratios. Although targetScale
and sdMeasure
are estimable within freqpcr()
, it is better to feed the known values, especially when you have only a few bulk samples (length(N) <= 3). Fixing targetScale
and sdMeasure
is also strongly recommended when housek0
and target0
are absent (\Delta
Cq method). The functions knownqpcr()
or knownqpcr_unpaired()
, depending on your data format, provide the procedure to estimate the sizes of the experimental parameters using the DNA solutions of known allele mixing ratios.
For the unknown parameters, XInit0
optionally specifies initial values for the optimization using nlm()
though the use of internal default is strongly recommended. The specification as a fixed parameter has higher priority than the specification in XInit0
. Every user-specified parameter values, fixed or unknown, must be given in linear scale (e.g. between 0 and 1 for the allele frequency); they are transformed internally to log or logit.
See Also
Other estimation procedures:
knownqpcr_unpaired()
,
knownqpcr()
,
sim_dummy()
Examples
# Dummy Cq dataset with six bulk samples, each of which comprises of eight haploids.
EPCR <- 0.95; zeroAmount <- 0.0016; # True values for the two must be known.
P <- 0.75
dmy_cq <- make_dummy( rand.seed=1, P=P, K=2, ntrap=6, npertrap=8,
scaleDNA=1e-07, targetScale=1.5, baseChange=0.3,
EPCR=EPCR, zeroAmount=zeroAmount,
sdMeasure=0.3, diploid=FALSE )
print(dmy_cq)
# Estimation with freqpcr, where P, K, targetScale, and baseChange are marked unknown.
result <- freqpcr( N=dmy_cq@N, housek0=dmy_cq@housek0, target0=dmy_cq@target0,
housek1=dmy_cq@housek1, target1=dmy_cq@target1,
EPCR=EPCR, zeroAmount=zeroAmount, beta=TRUE, print.level=0 )
print(result)
# Estimation with freqpcr, assuming diploidy.
result <- freqpcr( N=dmy_cq@N, housek0=dmy_cq@housek0, target0=dmy_cq@target0,
housek1=dmy_cq@housek1, target1=dmy_cq@target1,
EPCR=EPCR, zeroAmount=zeroAmount, beta=TRUE, diploid=TRUE )
# Estimation where you have knowledge on the size of K.
result <- freqpcr( N=dmy_cq@N, housek0=dmy_cq@housek0, target0=dmy_cq@target0,
housek1=dmy_cq@housek1, target1=dmy_cq@target1,
K=2,
EPCR=EPCR, zeroAmount=zeroAmount, beta=TRUE, print.level=2 )
# (>= v0.3.2)
# Provided the model is continuous (specify A instead of N).
result <- freqpcr( A=dmy_cq@N, housek0=dmy_cq@housek0, target0=dmy_cq@target0,
housek1=dmy_cq@housek1, target1=dmy_cq@target1,
K=2, EPCR=EPCR, zeroAmount=zeroAmount, beta=TRUE, print.level=1 )
# (>= v0.4.0)
# If the dataset lacks control samples (housek0 and target0 are absent).
# Fixing the sizes of targetScale and sdMeasure is strongly recommended.
result <- freqpcr( N=dmy_cq@N, housek1=dmy_cq@housek1, target1=dmy_cq@target1,
K=2, EPCR=EPCR, zeroAmount=zeroAmount,
targetScale=1.5, sdMeasure=0.3, beta=TRUE, print.level=1 )