caco {QSARdata}    R Documentation

Caco-2 Permeability Data

Description

These data were compiled and described by Pham-The et al. (2013). The data set consists of compounds that were designated as having high, medium or low permeability. The structures and outcomes were obtained from Table SI1 and Table SI4 of the supporting information at http://doi.wiley.com/10.1002/minf.201200166. Some compounds failed in descriptor calculations, so the total sample size here is 3796 compounds.

The package contains four sets of molecular descriptors: atom pair distances, Dragon descriptors (http://www.talete.mi.it/products/dragon_plus.htm), PipelinePilot fingerprints (http://accelrys.com/products/pipeline-pilot/) and QuickProp descriptors.

For fingerprints, the 1000 most variable bits were selected whenever possible.

Usage

data(caco)

Format

The data consist of several data frames. The first column of each descriptor data frame is called "Molecule" and identifies the compound. The original identifiers were corrupted during the descriptor calculations, so the compounds have been given unique but arbitrary identifiers that can be used to merge across descriptor sets.

caco_AtomPair

Atom pair descriptors

caco_Dragon

Dragon descriptors (http://www.talete.mi.it/products/dragon_plus.htm)

caco_PipelinePilot_FP

PipelinePilot fingerprints (http://accelrys.com/products/pipeline-pilot/)

caco_QuickProp

QuickProp descriptors

caco_Outcome

a data frame with columns for the molecule name and the outcome (for merging)

References

Pham-The, H., Gonzalez-Alvarez, I., Bermejo, M., Garrigues, T., Le-Thi-Thu, H., & Cabrera-Perez, M. A. (2013). The Use of Rule-Based and QSPR Approaches in ADME Profiling: A Case Study on Caco-2 Permeability. Molecular Informatics.

Examples

data(caco)
head(caco_Outcome)
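
Because the "Molecule" identifier exists for merging, a descriptor set can be joined to the outcomes before modeling. The following is a minimal sketch, not part of the package: merge_with_outcome is a hypothetical helper, and it assumes the identifier column in caco_Outcome is also named "Molecule" (the helper stops with an error if that is not the case).

```r
# Hypothetical helper (not part of QSARdata): join one descriptor set to the
# outcome table on the shared identifier column.
merge_with_outcome <- function(descriptors, outcome, key = "Molecule") {
  # Assumption: both data frames carry the identifier under the same name.
  stopifnot(key %in% names(descriptors), key %in% names(outcome))
  # Inner join: only compounds present in both data frames are kept.
  merge(outcome, descriptors, by = key)
}

# Run against the packaged data only if QSARdata is installed.
if (requireNamespace("QSARdata", quietly = TRUE)) {
  data(caco, package = "QSARdata")
  caco_qp <- merge_with_outcome(caco_QuickProp, caco_Outcome)
  print(dim(caco_qp))
}
```

The same call should work for caco_AtomPair, caco_Dragon and caco_PipelinePilot_FP, provided they follow the layout described under Format.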

[Package QSARdata version 1.3 Index]