bcancer {VIM} | R Documentation |
Breast cancer Wisconsin data set
Description
Dataset containing the original Wisconsin breast cancer data.
Format
A data frame with 699 observations on the following 11 variables.
- ID
Sample ID
- clump_thickness
as integer from 1 - 10
- uniformity_cellsize
as integer from 1 - 10
- uniformity_cellshape
as integer from 1 - 10
- adhesion
as integer from 1 - 10
- epithelial_cellsize
as integer from 1 - 10
- bare_nuclei
as integer from 1 - 10, includes 16 missing values
- chromatin
as integer from 1 - 10
- normal_nucleoli
as integer from 1 - 10
- mitoses
as integer from 1 - 10
- class
benign or malignant
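The format described above can be checked directly once the data are loaded. The following is a minimal sketch, assuming the VIM package is installed; `check_score_range` is a hypothetical helper written for this illustration and is not part of VIM.

```r
# Hypothetical helper (not part of VIM): for each named column, report
# whether it is numeric and all non-missing values lie in [lower, upper].
check_score_range <- function(df, cols, lower = 1, upper = 10) {
  vapply(df[cols], function(x) {
    is.numeric(x) && all(x >= lower & x <= upper, na.rm = TRUE)
  }, logical(1))
}

# Only run against the real data if VIM is available.
if (requireNamespace("VIM", quietly = TRUE)) {
  data(bcancer, package = "VIM")

  # Every variable except the sample ID and the class label is a score.
  score_cols <- setdiff(names(bcancer), c("ID", "class"))

  print(dim(bcancer))                              # observations x variables
  print(check_score_range(bcancer, score_cols))    # one TRUE/FALSE per score
  print(table(bcancer$class, useNA = "ifany"))     # benign vs. malignant
}
```

Missing values are ignored in the range check (`na.rm = TRUE`), so `bare_nuclei` is judged on its observed values only.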
References
The data were downloaded from the UCI Machine Learning Repository and conditioned for R; see https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Original)
This breast cancer database was obtained from Dr. William H. Wolberg, University of Wisconsin Hospitals, Madison. If you publish results obtained with this database, please include this information in your acknowledgements. Also, please cite one or more of:
- O. L. Mangasarian and W. H. Wolberg: "Cancer diagnosis via linear programming", SIAM News, Volume 23, Number 5, September 1990, pp 1 & 18.
- William H. Wolberg and O. L. Mangasarian: "Multisurface method of pattern separation for medical diagnosis applied to breast cytology", Proceedings of the National Academy of Sciences, U.S.A., Volume 87, December 1990, pp 9193-9196.
- O. L. Mangasarian, R. Setiono, and W. H. Wolberg: "Pattern recognition via linear programming: Theory and application to medical diagnosis", in: "Large-scale numerical optimization", Thomas F. Coleman and Yuying Li, editors, SIAM Publications, Philadelphia 1990, pp 22-30.
- K. P. Bennett and O. L. Mangasarian: "Robust linear programming discrimination of two linearly inseparable sets", Optimization Methods and Software 1, 1992, pp 23-34 (Gordon & Breach Science Publishers).
Examples
library(VIM)
data(bcancer)
# Plot the amount of missing values per variable and per combination of variables
aggr(bcancer)
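The `aggr` plot can be complemented by a plain numeric tabulation of the missing values. The sketch below uses base R only; `missing_summary` is a hypothetical helper introduced for this illustration, not a VIM function, and the call on `bcancer` assumes VIM is installed.

```r
# Hypothetical helper (not part of VIM): count and percentage of
# missing values for every column of a data frame.
missing_summary <- function(df) {
  n_missing <- colSums(is.na(df))
  data.frame(
    variable    = names(n_missing),
    n_missing   = as.integer(n_missing),
    pct_missing = round(100 * as.numeric(n_missing) / nrow(df), 2),
    row.names   = NULL,
    stringsAsFactors = FALSE
  )
}

# Only run against the real data if VIM is available.
if (requireNamespace("VIM", quietly = TRUE)) {
  data(bcancer, package = "VIM")
  print(missing_summary(bcancer))

  # Number of observations without any missing value
  print(sum(complete.cases(bcancer)))
}
```

According to the Format section, the missing values should be confined to `bare_nuclei`; the table makes that easy to confirm.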