| wdbc {mclust} | R Documentation |
UCI Wisconsin Diagnostic Breast Cancer Data
Description
The data set records, for 569 patients, 30 features of the cell nuclei obtained from a digitized image of a fine needle aspirate (FNA) of a breast mass. For each patient the cancer was diagnosed as malignant or benign.
Usage
data(wdbc)
Format
A data frame with 569 observations on the following variables:
ID: ID number
Diagnosis: cancer diagnosis (M = malignant, B = benign)
Radius_mean: a numeric vector
Texture_mean: a numeric vector
Perimeter_mean: a numeric vector
Area_mean: a numeric vector
Smoothness_mean: a numeric vector
Compactness_mean: a numeric vector
Concavity_mean: a numeric vector
Nconcave_mean: a numeric vector
Symmetry_mean: a numeric vector
Fractaldim_mean: a numeric vector
Radius_se: a numeric vector
Texture_se: a numeric vector
Perimeter_se: a numeric vector
Area_se: a numeric vector
Smoothness_se: a numeric vector
Compactness_se: a numeric vector
Concavity_se: a numeric vector
Nconcave_se: a numeric vector
Symmetry_se: a numeric vector
Fractaldim_se: a numeric vector
Radius_extreme: a numeric vector
Texture_extreme: a numeric vector
Perimeter_extreme: a numeric vector
Area_extreme: a numeric vector
Smoothness_extreme: a numeric vector
Compactness_extreme: a numeric vector
Concavity_extreme: a numeric vector
Nconcave_extreme: a numeric vector
Symmetry_extreme: a numeric vector
Fractaldim_extreme: a numeric vector
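The layout described above can be checked directly. The following is a minimal sketch, assuming the mclust package is installed; it only prints the dimensions, the diagnosis counts, and whether the feature columns are numeric, and is skipped if mclust is not available.

```r
# Run the checks only if mclust is available (assumption: it is installed).
if (requireNamespace("mclust", quietly = TRUE)) {
  data(wdbc, package = "mclust")

  # Number of observations and variables (ID, Diagnosis, and the features)
  print(dim(wdbc))

  # Counts of malignant (M) and benign (B) diagnoses
  print(table(wdbc$Diagnosis))

  # Every column after ID and Diagnosis is documented as numeric
  print(all(sapply(wdbc[, -(1:2)], is.numeric)))
}
```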
Details
The recorded features are:
- Radius as mean of distances from center to points on the perimeter
- Texture as standard deviation of gray-scale values
- Perimeter as cell nucleus perimeter
- Area as cell nucleus area
- Smoothness as local variation in radius lengths
- Compactness as cell nucleus compactness, perimeter^2 / area - 1
- Concavity as severity of concave portions of the contour
- Nconcave as number of concave portions of the contour
- Symmetry as cell nucleus shape
- Fractaldim as fractal dimension, "coastline approximation" - 1
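To illustrate the compactness formula given in the list, here is a small sketch applied to two ideal shapes. The function name compactness and the shapes are hypothetical and not part of mclust or of the data. A circle attains the smallest possible value, 4 * pi - 1, so less circular outlines give larger values.

```r
# Compactness as described above: perimeter^2 / area - 1
compactness <- function(perimeter, area) perimeter^2 / area - 1

# Hypothetical ideal shapes (not values taken from the data set)
r <- 2
circle <- compactness(2 * pi * r, pi * r^2)  # equals 4 * pi - 1 for any radius
square <- compactness(4 * r, r^2)            # equals 15 for any side length

isTRUE(all.equal(circle, 4 * pi - 1))
square > circle
```

This does not imply that Compactness_mean can be recovered from Perimeter_mean and Area_mean, since the recorded values are summaries over the nuclei in each image.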
For each feature the recorded values are computed from each image as <feature_name>_mean, <feature_name>_se, and <feature_name>_extreme, for the mean, the standard error, and the mean of the three largest values.
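The naming scheme and the three summaries can be sketched in base R as follows. The per-nucleus radius values are made up for illustration, and the standard error is assumed to be the usual sd / sqrt(n); this page does not state exactly how the original values were computed.

```r
features <- c("Radius", "Texture", "Perimeter", "Area", "Smoothness",
              "Compactness", "Concavity", "Nconcave", "Symmetry", "Fractaldim")
suffixes <- c("mean", "se", "extreme")

# All 30 feature column names, in the order listed under Format:
# every *_mean column first, then *_se, then *_extreme
feature_cols <- as.vector(outer(features, suffixes, paste, sep = "_"))

# Hypothetical per-nucleus radius values for one image (not real data)
radius <- c(11.2, 12.5, 13.1, 14.8, 15.0, 16.3)

summaries <- c(
  Radius_mean    = mean(radius),
  # Assumption: standard error computed as sd / sqrt(n)
  Radius_se      = sd(radius) / sqrt(length(radius)),
  # Mean of the three largest values
  Radius_extreme = mean(sort(radius, decreasing = TRUE)[1:3])
)

feature_cols
summaries
```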
Source
The Breast Cancer Wisconsin (Diagnostic) Data Set (wdbc.data, wdbc.names) from the UCI Machine Learning Repository
https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic). Please note the UCI conditions of use.
References
Mangasarian, O. L., Street, W. N., and Wolberg, W. H. (1995) Breast cancer diagnosis and prognosis via linear programming. Operations Research, 43(4), pp. 570-577.