chemSig {RFPM} | R Documentation |
Chemical Variable Selection within the Floating Percentile Model
Description
Determine which chemicals in a dataset have significantly higher concentrations among toxic samples than among non-toxic samples.
Usage
chemSig(
data,
paramList,
testType = NULL,
alpha.var = 0.05,
alpha.norm = 0.05,
alpha.test = 0.05,
alpha = NULL,
alternative = "less",
var.alternative = "two.sided",
var.equal = NULL,
warn = TRUE,
ExcelMode = NULL
)
Arguments
data |
data.frame containing, at a minimum, chemical concentrations as columns and a logical |
paramList |
character vector of column names of chemical concentration variables in |
testType |
character string; whether to run parametric or non-parametric tests (default = NULL) |
alpha.var |
numeric value between 0 and 1; type-I error rate for testing equal variance assumption (default = 0.05) |
alpha.norm |
numeric value between 0 and 1; type-I error rate for testing normality assumption (default = 0.05) |
alpha.test |
numeric value between 0 and 1; type-I error rate for testing differences between Hit and No-hit datasets (default = 0.05) |
alpha |
numeric value between 0 and 1; type-I error rate that, if supplied by the user, will be applied to all tests (default = NULL) |
alternative |
alternative hypothesis type for equality of central tendency (default = "less") |
var.alternative |
alternative hypothesis type for equal variance test (default = "two.sided") |
var.equal |
logical; whether to assume equal variance (default = NULL) |
warn |
logical; whether to generate a warning associated with |
ExcelMode |
logical; whether to force |
Details
chemSig is called within FPM via chemSigSelect, which generates a subset of chemicals to pass into the floating percentile model algorithm. chemSig only returns a logical vector describing which parameters in paramList should be selected for benchmark development based on having significantly higher concentrations when Hit == TRUE than when Hit == FALSE.
The user can set several parameters of the selection algorithm directly, or can let chemSig test the assumptions and choose appropriate hypothesis tests based on the results. By default, chemSig uses shapiro.test to test normality, then tests equal variance with var.test if the data are normal or fligner.test if the data are non-normal. Finally, the function uses t.test if the data are normal (with the Welch method if the variances are unequal), wilcox.test if the data are non-normal with equal variance, or brunner.munzel.test if the data are non-normal with unequal variance.
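The default sequence described above can be sketched in base R for a single chemical. This is only an illustration, not the RFPM source: pick_test is a hypothetical helper, the simulated data are made up, and the sketch assumes that normality is checked separately within the Hit and No-hit groups, which this page does not state.

```r
# Hypothetical helper (not part of RFPM): names the test that the decision
# sequence described above would lead to for one chemical.
# Assumption: normality is assessed separately in each group.
pick_test <- function(conc, hit, alpha.norm = 0.05, alpha.var = 0.05) {
  x <- conc[!hit]  # No-hit (non-toxic) concentrations
  y <- conc[hit]   # Hit (toxic) concentrations

  normal <- shapiro.test(x)$p.value > alpha.norm &&
    shapiro.test(y)$p.value > alpha.norm

  if (normal) {
    # F test of equal variances for normal data
    equal_var <- var.test(x, y, alternative = "two.sided")$p.value > alpha.var
    if (equal_var) "t.test (pooled variance)" else "t.test (Welch)"
  } else {
    # Fligner-Killeen test of equal variances for non-normal data
    equal_var <- fligner.test(list(x, y))$p.value > alpha.var
    if (equal_var) "wilcox.test" else "brunner.munzel.test"
  }
}

# Simulated, made-up data: 30 No-hit and 30 Hit samples
set.seed(1)
hit <- rep(c(FALSE, TRUE), each = 30)
conc <- c(rlnorm(30, meanlog = 1), rlnorm(30, meanlog = 2))
pick_test(conc, hit)
```

The helper only reports which test would be chosen; it does not run the final comparison, and brunner.munzel.test itself is not part of base R.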
The testType argument can be one of p, P, param, Param, parametric, or Parametric for parametric test types, or one of non, Non, np, NP, nonparam, Nonparam, non-param, Non-param, nonparametric, Nonparametric, non-parametric, Non-parametric, or Non-Parametric for non-parametric test types.
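One way to picture how these spellings collapse into two categories is the sketch below. normalize_test_type is a hypothetical helper written for this illustration, not a function exported by RFPM, and treating NULL as "let the assumption tests decide" is an assumption based on the default described above.

```r
# Hypothetical helper (not RFPM code): map the accepted spellings listed above
# to one of two labels.
normalize_test_type <- function(testType) {
  param <- c("p", "P", "param", "Param", "parametric", "Parametric")
  nonparam <- c("non", "Non", "np", "NP", "nonparam", "Nonparam",
                "non-param", "Non-param", "nonparametric", "Nonparametric",
                "non-parametric", "Non-parametric", "Non-Parametric")

  # Assumption: NULL means the choice is left to the assumption tests
  if (is.null(testType)) return(NULL)
  if (testType %in% param) return("parametric")
  if (testType %in% nonparam) return("nonparametric")
  stop("unrecognized testType: ", testType)
}

normalize_test_type("NP")
normalize_test_type("Param")
```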
The user has the option of providing a single alpha level to apply to all tests (e.g., 0.05) or of specifying test-specific alpha levels via alpha.var, alpha.norm, and alpha.test. Note that FPM by default uses alpha = 0.05 for all tests.
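The precedence between alpha and the test-specific levels can be sketched as follows. resolve_alphas is a hypothetical helper for illustration only; it assumes that a non-NULL alpha simply replaces all three test-specific values, as the paragraph above describes.

```r
# Hypothetical helper (not RFPM code): a supplied alpha overrides the three
# test-specific levels; otherwise each keeps its own value.
resolve_alphas <- function(alpha.var = 0.05, alpha.norm = 0.05,
                           alpha.test = 0.05, alpha = NULL) {
  if (!is.null(alpha)) {
    alpha.var <- alpha.norm <- alpha.test <- alpha
  }
  c(alpha.var = alpha.var, alpha.norm = alpha.norm, alpha.test = alpha.test)
}

resolve_alphas(alpha = 0.1)        # one level applied to every test
resolve_alphas(alpha.test = 0.01)  # only the Hit vs. No-hit comparison changes
```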
While alternative and var.alternative can be adjusted, we strongly recommend that they not be changed from their default values. For example, changing alternative from "less" (default) to "two.sided" would allow chemical concentrations to be counted as significant when they are higher in the absence of toxicity than in its presence, which is inappropriate. The "greater" alternative, which is never appropriate, is not an accepted input for alternative. Similarly, the assumption of equal variance corresponds to a "two.sided" test, so changing var.alternative to "less" or "greater" would not be appropriate.
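The effect of the alternative argument can be seen with base R's t.test on simulated data. The values below are made up, and the sketch assumes that the No-hit group is passed first and the Hit group second, so that "less" asks whether No-hit concentrations are lower than Hit concentrations; this page does not spell out the argument order used internally.

```r
# Made-up data: concentrations are higher in the Hit (toxic) group
set.seed(42)
no_hit <- rnorm(25, mean = 10, sd = 2)
hit    <- rnorm(25, mean = 13, sd = 2)

# Assumption: No-hit is the first sample, Hit the second
p_less    <- t.test(no_hit, hit, alternative = "less")$p.value
p_two     <- t.test(no_hit, hit, alternative = "two.sided")$p.value
p_greater <- t.test(no_hit, hit, alternative = "greater")$p.value

# The two one-sided p-values sum to 1, and the two-sided p-value is twice
# the smaller of them, so it cannot tell the two directions apart.
c(less = p_less, two.sided = p_two, greater = p_greater)
```

A small two-sided p-value would therefore also arise if concentrations were lower in toxic samples, which is the case the default "less" alternative excludes.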
ExcelMode assumes testType = "parametric", var.equal = TRUE, alternative = "less", and alpha = 0.1. In practice, the Excel-based tool uses a one-way ANOVA test to compare two levels of Hit, which is equivalent to a t-test so long as alpha is adjusted to 0.1. Thus, testType, alternative, var.alternative, and var.equal are overridden when ExcelMode = TRUE.
This argument was included for those interested in using 'RFPM' as an alternative to the Excel-based calculator tool to obtain identical benchmark results. Note that the Excel FPM tool also includes other features, such as outlier analysis, that may complicate comparability. Outlier analysis is not conducted by RFPM.
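The relationship between a two-group one-way ANOVA and a pooled-variance t-test can be checked numerically in base R. The data below are simulated and the variable names are illustrative; this is not the Excel tool's calculation, only a sketch of the general statistical equivalence the paragraph above relies on.

```r
# Made-up data with a logical Hit column, as chemSig expects
set.seed(7)
dat <- data.frame(
  conc = c(rnorm(20, mean = 5), rnorm(20, mean = 6)),
  Hit  = rep(c(FALSE, TRUE), each = 20)
)

# One-way ANOVA with two groups
p_anova <- summary(aov(conc ~ factor(Hit), data = dat))[[1]][["Pr(>F)"]][1]

# Pooled-variance t-tests; groups are ordered FALSE (No-hit), then TRUE (Hit)
p_two  <- t.test(conc ~ Hit, data = dat, var.equal = TRUE)$p.value
p_less <- t.test(conc ~ Hit, data = dat, var.equal = TRUE,
                 alternative = "less")$p.value

# p_anova matches p_two; a one-sided p-value in the observed direction is
# half the two-sided one.
c(anova = p_anova, two.sided = p_two, less = p_less)
```

Because the one-sided p-value is half the two-sided (ANOVA) p-value when the difference lies in the tested direction, the alpha level has to be adjusted when moving between the two tests.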
Value
named logical vector
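A named logical vector of this kind can be used directly to subset the list of chemicals. The values below are placeholders typed in by hand for illustration, not output from chemSig.

```r
# Placeholder values only: a stand-in for the kind of vector chemSig returns
sig <- c(Cd = TRUE, Cu = TRUE, Fe = FALSE, Mn = FALSE)

# Names of the chemicals flagged TRUE
selected <- names(sig)[sig]
selected
```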
Examples
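# Chemical concentration columns to screen in the h.tristate dataset
paramList <- c("Cd", "Cu", "Fe", "Mn", "Ni", "Pb", "Zn")
# Force non-parametric tests
chemSig(h.tristate, paramList, testType = "nonparametric")
# Force parametric tests
chemSig(h.tristate, paramList, testType = "parametric")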