tcplSubsetChid {tcpl} | R Documentation |
Subset level 5 data to a single sample per chemical
Description
tcplSubsetChid subsets level 5 data to a single tested sample per chemical. In other words, if a chemical is tested more than once (a chid has more than one spid) for a given assay endpoint, the function applies a series of rules to select a single "representative" sample.
Usage
tcplSubsetChid(dat, flag = TRUE, type = "mc", export_ready = TRUE)
Arguments
dat |
data.table, a data.table with level 5 data |
flag |
Integer or logical, the mc6_mthd_id values to include in the flag count, or TRUE/FALSE to count all flags or none; see Details for more information |
type |
Character of length 1, the data type, "sc" or "mc" |
export_ready |
Logical, default TRUE, whether only rows with export_ready equal to 1 should be included in the calculation |
Details
tcplSubsetChid is intended to work with level 5 data that has chemical and assay information mapped with tcplPrepOtpt.
To select a single sample, first a "consensus hit-call" is made by majority rule, with ties defaulting to active. After the chemical-wise hit-call is made, the samples that match the chemical-wise hit-call are ordered by fit category, then by number of flags, then by modl_ga, and the first sample for every chemical is selected.
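The selection logic described above can be sketched in base R on a toy table. This is only an illustration under stated assumptions, not the package's implementation: the data values are made up, hitc is assumed to be coded 0/1, the n_flags column is a hypothetical precomputed flag count, and sorting on the raw fitc value stands in for whatever fit-category ordering the function applies internally.

```r
# Toy level 5-like data; all values are made up for illustration.
toy <- data.frame(
  chid    = c(1, 1, 1, 2, 2),
  spid    = c("s1", "s2", "s3", "s4", "s5"),
  hitc    = c(1, 1, 0, 1, 0),
  fitc    = c(41, 41, 13, 42, 13),
  n_flags = c(2, 0, 0, 1, 0),          # hypothetical per-sample flag count
  modl_ga = c(0.5, 1.2, NA, 0.8, NA),
  stringsAsFactors = FALSE
)

# Step 1: consensus hit-call per chemical by majority rule.
# A tie (mean of exactly 0.5) counts as active.
toy$chit <- ave(toy$hitc, toy$chid,
                FUN = function(x) as.numeric(mean(x) >= 0.5))

# Step 2: keep only the samples that agree with the consensus hit-call.
agree <- toy[toy$hitc == toy$chit, ]

# Step 3: order within each chemical by fit category, flag count, modl_ga.
# (Assumption: raw fitc order is used here as a stand-in.)
agree <- agree[order(agree$chid, agree$fitc, agree$n_flags, agree$modl_ga), ]

# Step 4: keep the first sample for every chemical.
picked <- agree[!duplicated(agree$chid), ]
picked[, c("chid", "spid", "hitc", "modl_ga")]
```

In this toy data, chemical 2 has one active and one inactive sample, so the tie resolves to active and the active sample is kept. For chemical 1 the two active samples share a fit category, so the one with fewer flags is selected.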
The flag param can be used to specify a subset of flags to be used in the flag count. Leaving flag as TRUE uses all the available flags. Setting flag to FALSE does the subsetting without considering any flags.
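The three modes of the flag argument can be illustrated with a small base R sketch. The count_flags helper, the toy flag table, and the mc6_mthd_id values below are all hypothetical and exist only to show how the flag count might change; they are not part of tcpl.

```r
# Toy flag table: one row per (fit, flag) pair; ids are made up.
flags <- data.frame(
  m4id        = c(101, 101, 102, 103, 103, 103),
  mc6_mthd_id = c(6, 11, 6, 7, 11, 17)
)
m4ids <- c(101, 102, 103, 104)  # 104 has no flags at all

# Hypothetical helper mirroring the three modes of the `flag` argument.
count_flags <- function(flags, m4ids, flag = TRUE) {
  if (isFALSE(flag)) {
    keep <- flags[0, ]                            # ignore flags entirely
  } else if (isTRUE(flag)) {
    keep <- flags                                 # count every flag
  } else {
    keep <- flags[flags$mc6_mthd_id %in% flag, ]  # count only the listed ids
  }
  # Tabulate against all m4ids so unflagged fits get a count of zero.
  n <- table(factor(keep$m4id, levels = m4ids))
  setNames(as.integer(n), m4ids)
}

all_flags  <- count_flags(flags, m4ids)                   # flag = TRUE
some_flags <- count_flags(flags, m4ids, flag = c(6, 11))  # subset of ids
no_flags   <- count_flags(flags, m4ids, flag = FALSE)     # flags ignored
```

With flag = FALSE every count is zero, so the flag count cannot break ties and the ordering falls through to modl_ga.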
Value
A data.table with a single sample for every given chemical-assay pair.
Examples
## Store the current config settings, so they can be reloaded at the end
## of the examples
conf_store <- tcplConfList()
tcplConfExample()
## Load the example level 5 data
d1 <- tcplLoadData(lvl = 5, fld = "aeid", val = 797)
d1 <- tcplPrepOtpt(d1)
## Subset to an example of a duplicated chid
d2 <- d1[chid == 20182]
d2[, list(m4id, hitc, fitc, modl_ga)]
## Here the consensus hit-call is 1 (active), and the fit categories are
## all equal. Therefore, if the flags are ignored, the selected sample will
## be the sample with the lowest modl_ga.
tcplSubsetChid(dat = d2, flag = FALSE)[, list(m4id, modl_ga)]
## Reset configuration
options(conf_store)