optimFPM {RFPM} | R Documentation |
Optimization of Floating Percentile Model Parameters
Description
Calculate parameter inputs that optimize benchmark performance
Usage
optimFPM(
  data,
  paramList,
  FN_crit = seq(0.1, 0.9, by = 0.05),
  alpha.test = seq(0.05, 0.5, by = 0.05),
  which = c(1, 2, 3, 4),
  simplify = TRUE,
  plot = TRUE,
  colors = heat.colors(10),
  colsteps = 100,
  ...
)
Arguments
data |
data.frame containing, at a minimum, chemical concentrations as columns and a logical Hit column
paramList |
character vector of column names of chemical concentration variables in data
FN_crit |
numeric vector of false negative thresholds over which to optimize (default = seq(0.1, 0.9, by = 0.05))
alpha.test |
numeric vector of type-I error rate values over which to optimize (default = seq(0.05, 0.5, by = 0.05))
which |
numeric or character indicating which type of plot to generate (see Details; default = c(1, 2, 3, 4))
simplify |
logical; whether to generate simplified output (default = TRUE)
plot |
logical; whether to generate a plot to visualize the optimization results (default = TRUE)
colors |
values recognizable as colors, used to generate the color palette for plotting (see Details; default = heat.colors(10))
colsteps |
integer; number of discrete steps used to interpolate the color gradient (see Details; default = 100)
... |
additional argument passed to |
Details
optimFPM was designed to help optimize the predictive capacity of the benchmarks generated by FPM. The default input parameters to FPM (i.e., FN_crit = 0.2 and alpha.test = 0.05) are arbitrary, and optimization can help to objectively establish more accurate benchmarks. Graphical output from optimFPM can also help users to understand the relationship(s) between benchmark accuracy/error, FN_crit, and alpha.test.
Default inputs for FN_crit and alpha.test were selected to represent a reasonable range of values to test. Testing over both ranges results in a two-way optimization, which can be computationally intensive. Alternatively, optimFPM can be run for one parameter at a time by specifying a single value for either FN_crit or alpha.test. Note that inputting single values for both FN_crit and alpha.test will generate unhelpful results, since a single combination leaves nothing to compare.
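The cost of a two-way optimization comes from the number of parameter combinations, which is the product of the two vector lengths (17 x 10 = 170 with the defaults). The sketch below illustrates the idea with a plain grid search in base R. It is not the optimFPM implementation: toyScore is a hypothetical stand-in for whichever metric is being optimized, and in practice each evaluation would involve fitting the model.

```r
# Default search ranges taken from the Usage section
FN_crit <- seq(0.1, 0.9, by = 0.05)
alpha.test <- seq(0.05, 0.5, by = 0.05)

# Every FN_crit/alpha.test combination (two-way optimization)
grid <- expand.grid(FN_crit = FN_crit, alpha.test = alpha.test)
nrow(grid)  # product of the two vector lengths

# Hypothetical score with a known peak at FN_crit = 0.3, alpha.test = 0.1;
# a stand-in only, not a metric computed by RFPM
toyScore <- function(FN_crit, alpha.test) {
  1 - (FN_crit - 0.3)^2 - (alpha.test - 0.1)^2
}

# Evaluate one combination at a time, as a model-based search would
grid$score <- mapply(toyScore, grid$FN_crit, grid$alpha.test)

# The combination with the highest score is the optimum
best <- grid[which.max(grid$score), ]
print(best)

# One-way optimization: fixing alpha.test shrinks the search to one row per FN_crit
oneWay <- expand.grid(FN_crit = FN_crit, alpha.test = 0.05)
nrow(oneWay)
```

Fixing one argument to a single value reduces the search from a grid to a single row or column, which is why the one-parameter runs are much cheaper.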
Several metrics are used for optimization:
Ratio of sensitivity/specificity ("sensSpecRatio"), calculated as the minimum of the two metrics divided by the maximum of the two. This value is therefore always between 0 and 1 and represents the balance between correct Hit==TRUE and Hit==FALSE predictions.
Overall reliability ("OR") (i.e., probability of correctly predicting Hit values)
Fowlkes-Mallows Index ("FM") - the geometric mean of precision and sensitivity, two metrics focused on predicting Hit==TRUE
Matthews Correlation Coefficient ("MCC") - a measure of the correspondence between the data and predictions, analogous to a Pearson's correlation coefficient (but for binary data)
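As a rough illustration of the four metrics, the sketch below computes each from observed and predicted Hit values using the standard textbook definitions. The helper classMetrics and the example vectors are hypothetical and are not part of RFPM; the package's internal calculations are assumed, not confirmed, to follow the same formulas.

```r
# Hypothetical helper (not an RFPM function): textbook classification metrics
classMetrics <- function(hit, pred) {
  stopifnot(is.logical(hit), is.logical(pred), length(hit) == length(pred))

  # Confusion-matrix counts, stored as doubles to avoid integer overflow
  TP <- as.numeric(sum(hit & pred))    # Hit==TRUE predicted TRUE
  TN <- as.numeric(sum(!hit & !pred))  # Hit==FALSE predicted FALSE
  FP <- as.numeric(sum(!hit & pred))   # Hit==FALSE predicted TRUE
  FN <- as.numeric(sum(hit & !pred))   # Hit==TRUE predicted FALSE

  sens <- TP / (TP + FN)  # sensitivity (true positive rate)
  spec <- TN / (TN + FP)  # specificity (true negative rate)
  prec <- TP / (TP + FP)  # precision (positive predictive value)

  c(
    sensSpecRatio = min(sens, spec) / max(sens, spec),
    OR = (TP + TN) / length(hit),
    FM = sqrt(prec * sens),
    MCC = (TP * TN - FP * FN) /
      sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))
  )
}

# Made-up observations and predictions, for illustration only
hit  <- c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE)
pred <- c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE)

m <- classMetrics(hit, pred)
print(round(m, 3))
```

In this toy case there are 2 true positives, 4 true negatives, 1 false positive, and 1 false negative, so OR is 6/8 = 0.75 and sensSpecRatio is (2/3)/(4/5).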
Graphical output will differ depending on whether or not a single value is input for FN_crit or alpha.test. Providing a single value for one of the two arguments will generate a line graph, whereas providing longer vectors (i.e., length > 1) of inputs for both arguments will generate dot matrix plots, using colors to generate a color palette and colsteps to define the granularity of the color gradient within the palette. The colors are plotted in order from more optimal to less optimal; for example, the default of heat.colors(10) will show more optimal values in red and less optimal values in yellow.
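The roles of colors and colsteps can be sketched with base R's colorRampPalette, which is listed under See Also. This is only an approximation of the idea: the metric values and the rescaling used here are illustrative assumptions, not the package's internal code.

```r
# Inputs mirroring the optimFPM defaults
colors <- heat.colors(10)
colsteps <- 100

# Interpolate the supplied colors into colsteps discrete gradient steps
pal <- grDevices::colorRampPalette(colors)(colsteps)

# Made-up metric values, one per parameter combination (illustration only)
metric <- c(0.42, 0.77, 0.63, 0.91)

# Rescale so the highest (most optimal) value maps to the first palette
# color and the lowest to the last; this mapping is an assumption
rng <- range(metric)
idx <- 1 + round((rng[2] - metric) / diff(rng) * (colsteps - 1))
pointCols <- pal[idx]

print(data.frame(metric = metric, step = idx, color = pointCols))
```

A larger colsteps gives a smoother gradient between the supplied colors, while a smaller value groups nearby metric values into the same color.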
By default, multiple plots will be generated; however, the which argument can control which plots are generated. Inputs to which are, by default, c(1, 2, 3, 4) for the metrics noted above, and flexible character inputs can also be used to a degree.
Black squares indicate the optimal argument inputs; these values are also printed to the console and can be assigned to an object.
Value
data.frame of optimized FN_crit and/or alpha.test values
See Also
FPM, colorGradient, colorRampPalette
Examples
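# chemical concentration columns to include
paramList <- c("Cd", "Cu", "Fe", "Mn", "Ni", "Pb", "Zn")
FN_seq <- seq(0.1, 0.3, 0.05)
alpha_seq <- seq(0.05, 0.2, 0.05)
# optimize FN_crit with alpha.test fixed at a single value (line graph)
optimFPM(h.tristate, paramList, FN_seq, 0.05)
# optimize alpha.test with FN_crit fixed at a single value (line graph)
optimFPM(h.tristate, paramList, 0.2, alpha_seq)
# two-way optimization (dot matrix plot), limited to one plot via which
optimFPM(h.tristate, paramList, FN_seq, alpha_seq, which = 2)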