chemSig {RFPM}                R Documentation

Chemical Variable Selection within the Floating Percentile Model

Description

Determine which chemicals in a dataset have significantly higher concentrations among toxic samples than among non-toxic samples.

Usage

chemSig(
  data,
  paramList,
  testType = NULL,
  alpha.var = 0.05,
  alpha.norm = 0.05,
  alpha.test = 0.05,
  alpha = NULL,
  alternative = "less",
  var.alternative = "two.sided",
  var.equal = NULL,
  warn = TRUE,
  ExcelMode = NULL
)

Arguments

data

data.frame containing, at a minimum, chemical concentrations as columns and a logical Hit column classifying toxicity

paramList

character vector of column names of chemical concentration variables in data

testType

character string; whether to run parametric or non-parametric tests (default = NULL). See Details for more information.

alpha.var

numeric value between 0 and 1; type-I error rate for testing equal variance assumption (default = 0.05)

alpha.norm

numeric value between 0 and 1; type-I error rate for testing normality assumption (default = 0.05)

alpha.test

numeric value between 0 and 1; type-I error rate for testing differences between Hit and No-hit datasets (default = 0.05)

alpha

numeric value between 0 and 1; type-I error rate that, if supplied by the user, will be applied to all tests (default = NULL)

alternative

alternative hypothesis type for equality of central tendency (default = "less")

var.alternative

alternative hypothesis type for equal variance test (default = "two.sided")

var.equal

logical; whether to assume equal variance (default = NULL)

warn

logical; whether to generate a warning associated with ExcelMode (default = TRUE)

ExcelMode

logical; whether to force chemSig to run like the WA Department of Ecology's Excel-based floating percentile model calculator (default = FALSE)

Details

chemSig is called within FPM via chemSigSelect, which generates a subset of chemicals to pass into the floating percentile model algorithm. chemSig itself only returns a logical vector describing which parameters in paramList should be selected for benchmark development, based on having significantly higher concentrations when Hit == TRUE than when Hit == FALSE. The user can set several parameters of the selection algorithm directly, or can allow chemSig to test the assumptions and choose appropriate hypothesis tests based on those results. By default, chemSig uses shapiro.test to test for normality, then tests for equal variance using var.test if the data are normal or fligner.test if the data are non-normal. Finally, the function uses t.test if the data are normal (with the Welch method if variances are unequal), wilcox.test if the data are non-normal with equal variance, or brunner.munzel.test if the data are non-normal with unequal variance.
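The default decision flow described above can be sketched in base R. This is only an illustration, not the RFPM source: pick_test is a hypothetical helper, and the choice to treat the data as normal only when neither group rejects normality is an assumption (chemSig may pool or transform the data differently).

```r
# Hypothetical helper: name the test that the decision flow above would
# select for one chemical. Not part of RFPM.
pick_test <- function(x_nohit, x_hit, alpha.norm = 0.05, alpha.var = 0.05) {
  # Normality (assumption: both groups must pass Shapiro-Wilk)
  normal <- shapiro.test(x_nohit)$p.value > alpha.norm &&
    shapiro.test(x_hit)$p.value > alpha.norm

  # Equal variance: F test for normal data, Fligner-Killeen otherwise
  p_var <- if (normal) {
    var.test(x_nohit, x_hit, alternative = "two.sided")$p.value
  } else {
    fligner.test(list(x_nohit, x_hit))$p.value
  }
  equal_var <- p_var > alpha.var

  # Map the two assumption checks onto the four tests named in Details
  if (normal) {
    if (equal_var) "t.test (pooled variance)" else "t.test (Welch)"
  } else {
    if (equal_var) "wilcox.test" else "brunner.munzel.test"
  }
}

# Simulated concentrations for illustration only
set.seed(1)
nohit <- rnorm(30, mean = 10, sd = 2)
hit   <- rnorm(30, mean = 14, sd = 2)
chosen <- pick_test(nohit, hit)
chosen
```

The helper only reports which test would be chosen; running the selected test and comparing its p-value against alpha.test would be the remaining step.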

The testType argument can be one of p, P, param, Param, parametric, or Parametric for parametric test types, or one of non, Non, np, NP, nonparam, Nonparam, non-param, Non-param, nonparametric, Nonparametric, non-parametric, Non-parametric, or Non-Parametric for non-parametric test types.
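One way to picture how these aliases collapse to two test types is the lookup below. normalize_test_type is a hypothetical helper, not an RFPM function; returning NULL for a NULL input and stopping on an unrecognized string are assumptions about reasonable behavior, not documented behavior of chemSig.

```r
# Accepted spellings, copied from the list above
parametric_aliases <- c("p", "P", "param", "Param", "parametric", "Parametric")
nonparametric_aliases <- c(
  "non", "Non", "np", "NP", "nonparam", "Nonparam", "non-param", "Non-param",
  "nonparametric", "Nonparametric", "non-parametric", "Non-parametric",
  "Non-Parametric"
)

# Hypothetical helper: map any accepted spelling to a canonical label
normalize_test_type <- function(testType = NULL) {
  if (is.null(testType)) return(NULL)  # NULL: let assumption tests decide
  if (testType %in% parametric_aliases) return("parametric")
  if (testType %in% nonparametric_aliases) return("nonparametric")
  stop("unrecognized testType: ", testType)  # assumed error handling
}

normalize_test_type("NP")
normalize_test_type("Param")
```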

The user has the option of providing a single alpha level to apply to all tests (e.g., 0.05) or to specify test-specific alpha levels via alpha.var, alpha.norm, and alpha.test. Note that FPM by default uses alpha = 0.05 for all tests.
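The relationship between alpha and the three test-specific arguments can be sketched as follows. resolve_alphas is a hypothetical helper written for illustration; it assumes that a non-NULL alpha simply replaces all three test-specific values, as the paragraph above describes.

```r
# Hypothetical helper: a user-supplied alpha overrides the test-specific levels
resolve_alphas <- function(alpha.var = 0.05, alpha.norm = 0.05,
                           alpha.test = 0.05, alpha = NULL) {
  if (!is.null(alpha)) {
    # Basic validation: a single number strictly between 0 and 1
    stopifnot(is.numeric(alpha), length(alpha) == 1, alpha > 0, alpha < 1)
    alpha.var <- alpha.norm <- alpha.test <- alpha
  }
  c(var = alpha.var, norm = alpha.norm, test = alpha.test)
}

all_tests  <- resolve_alphas(alpha = 0.1)        # one level for every test
norm_only  <- resolve_alphas(alpha.norm = 0.01)  # stricter normality test only
```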

While alternative and var.alternative can be adjusted, we strongly recommend that they not be changed from the default values. For example, changing alternative from "less" (default) to "two.sided" would result in the assumption that chemical concentrations could be significantly higher (as well as lower) when there is no toxicity than when there is, which is inappropriate. The "greater" alternative, which is never appropriate, is not an accepted input for alternative. Similarly, the assumption of equal variance corresponds to a "two.sided" test, so changing var.alternative to "less" or "greater" would not be appropriate.
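The effect of the alternative argument can be seen with t.test on simulated data. The sketch assumes that the no-hit sample is passed as the first argument, so that "less" tests whether no-hit concentrations are lower than hit concentrations; the argument order used inside chemSig is not stated here, and the data are made up.

```r
# Simulated concentrations for illustration only
set.seed(7)
nohit <- rnorm(25, mean = 5, sd = 1)
hit   <- rnorm(25, mean = 8, sd = 1)

# One-sided test in the direction of interest: mean(nohit) < mean(hit)
p_less <- t.test(nohit, hit, alternative = "less")$p.value

# Two-sided test also counts evidence that no-hit concentrations are higher
p_two <- t.test(nohit, hit, alternative = "two.sided")$p.value

# The opposite one-sided test asks whether no-hit concentrations are higher
p_greater <- t.test(nohit, hit, alternative = "greater")$p.value

c(less = p_less, two.sided = p_two, greater = p_greater)
```

When the observed difference lies in the hypothesized direction, the one-sided p-value is smaller than the two-sided one, and the two one-sided p-values sum to 1.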

ExcelMode assumes testType = "parametric", var.equal = TRUE, alternative = "less", and alpha = 0.1. In actuality, the Excel-based tool uses a one-way ANOVA test to compare two levels of Hit, which is equivalent to a t-test so long as alpha is adjusted to 0.1. Thus, testType, alternative, var.alternative, and var.equal are overridden when ExcelMode = TRUE. This argument was included for those interested in using 'RFPM' as an alternative to the Excel-based calculator tool to obtain identical benchmark results. Note that the Excel FPM tool also includes other features which may complicate comparability such as outlier analysis. Outlier analysis is not conducted by RFPM.
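The link between a one-way ANOVA on two groups and a t-test can be checked numerically: with pooled variance, the ANOVA F statistic equals the square of the t statistic, and the ANOVA p-value equals the two-sided t-test p-value. The sketch below uses simulated data and base R only; it does not reproduce the Excel tool or the alpha adjustment described above.

```r
# Simulated data with a two-level Hit classification (illustration only)
set.seed(42)
dat <- data.frame(
  conc = c(rnorm(20, mean = 5), rnorm(20, mean = 6)),
  Hit  = rep(c(FALSE, TRUE), each = 20)
)

# One-way ANOVA with Hit as the single factor
aov_tab <- summary(aov(conc ~ factor(Hit), data = dat))[[1]]
f_stat  <- aov_tab[["F value"]][1]
p_anova <- aov_tab[["Pr(>F)"]][1]

# Two-sided, equal-variance t-test on the same two groups
tt <- t.test(dat$conc[!dat$Hit], dat$conc[dat$Hit], var.equal = TRUE)
t_sq <- unname(tt$statistic)^2

c(F = f_stat, t_squared = t_sq, p_anova = p_anova, p_t = tt$p.value)
```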

Value

named logical vector
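A named logical vector of this shape can be used directly to subset the chemical names. The values below are made up for illustration and are not results from any dataset.

```r
# Hypothetical return value: names are chemicals, values are selection flags
sig <- c(Cd = TRUE, Cu = TRUE, Fe = FALSE, Mn = FALSE,
         Ni = TRUE, Pb = TRUE, Zn = TRUE)

# Names of the chemicals flagged for benchmark development
selected <- names(sig)[sig]
selected

# Number of chemicals selected
n_selected <- sum(sig)
```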

Examples

paramList <- c("Cd", "Cu", "Fe", "Mn", "Ni", "Pb", "Zn")
chemSig(h.tristate, paramList, "nonparametric")
chemSig(h.tristate, paramList, "parametric")

[Package RFPM version 1.1 Index]