hearth {ordinalForest} | R Documentation |
Data on Coronary Artery Disease
Description
This dataset contains records of 294 patients who underwent angiography at the Hungarian Institute of Cardiology in Budapest between 1983 and 1987.
Format
A data frame with 294 observations, ten covariates and one ordinal target variable
Details
The variables are as follows:
- age. numeric. Age in years
- sex. factor. Sex (1 = male; 0 = female)
- chest_pain. factor. Chest pain type (1 = typical angina; 2 = atypical angina; 3 = non-anginal pain; 4 = asymptomatic)
- trestbps. numeric. Resting blood pressure (in mm Hg on admission to the hospital)
- chol. numeric. Serum cholesterol in mg/dl
- fbs. factor. Fasting blood sugar > 120 mg/dl (1 = true; 0 = false)
- restecg. factor. Resting electrocardiographic results (1 = having ST-T wave abnormality (T wave inversions and/or ST elevation or depression of > 0.05 mV); 0 = normal)
- thalach. numeric. Maximum heart rate achieved
- exang. factor. Exercise induced angina (1 = yes; 0 = no)
- oldpeak. numeric. ST depression induced by exercise relative to rest
- Class. factor. Ordinal target variable: severity of coronary artery disease, determined using angiograms (1 = no disease; 2 = degree 1; 3 = degree 2; 4 = degree 3; 5 = degree 4)
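The ordering of the Class codes can be made explicit with an ordered factor. The following is a minimal sketch on made-up values, not on the packaged data; the object names cls and severity are hypothetical, and the labels are taken from the coding listed above.

```r
# Made-up Class values using the 1-5 coding described above.
cls <- factor(c(1, 3, 2, 1, 5), levels = 1:5)

# Attach the severity labels and declare the levels as ordered.
severity <- factor(
  cls,
  levels  = 1:5,
  labels  = c("no disease", "degree 1", "degree 2", "degree 3", "degree 4"),
  ordered = TRUE
)

# Ordered factors support comparisons between levels.
severity[2] > severity[3]
```

With an ordered factor, comparisons such as the last line respect the severity ranking instead of treating the levels as unordered categories.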
The original OpenML dataset was pre-processed in the following way:
1. The variables were renamed according to the description given on OpenML.
2. The missing values which were coded as "-9" were replaced by NA values.
3. The variables slope, ca, and thal were excluded because they had too many missing values.
4. The categorical covariates were transformed into factors.
5. There were 6 restecg values of "2", which were replaced by "1".
6. The missing values were imputed: The missing values of the numerical covariates were replaced by the means of the corresponding non-missing values. The missing values of the categorical covariates were replaced by the modes of the corresponding non-missing values.
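Steps 2, 4, 5, and 6 above can be sketched in base R as follows. This is only an illustration on a made-up toy data frame, not the script actually used to build the dataset: the values, the object name raw, and the helper mode_level are assumptions, and the source does not say how ties between modes were handled.

```r
# Toy stand-in for the raw data; the values are made up for illustration.
raw <- data.frame(
  chol    = c(230, -9, 198, 260, -9),
  restecg = c(0, 2, 1, -9, 1),
  fbs     = c(0, 0, -9, 1, 0)
)

# Step 2: recode the missing-value marker -9 as NA.
raw[raw == -9] <- NA

# Step 5: merge restecg value 2 into value 1.
raw$restecg[which(raw$restecg == 2)] <- 1

# Step 4: turn the categorical covariates into factors.
cat_vars <- c("restecg", "fbs")
raw[cat_vars] <- lapply(raw[cat_vars], factor)

# Hypothetical helper: most frequent non-missing level of a factor.
# Ties go to the first level in table() order (an assumption).
mode_level <- function(x) names(which.max(table(x)))

# Step 6: mean imputation for numeric covariates,
# mode imputation for categorical covariates.
for (v in names(raw)) {
  miss <- is.na(raw[[v]])
  if (is.numeric(raw[[v]])) {
    raw[[v]][miss] <- mean(raw[[v]], na.rm = TRUE)
  } else {
    raw[[v]][miss] <- mode_level(raw[[v]])
  }
}

raw
```

In this sketch the restecg recoding is done before the conversion to factors, so that the factor never carries a level "2"; the numbered list above gives the steps in a different order, and the order used for the packaged data is not stated.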
Source
OpenML: data.name: heart-h, data.id: 1565, link: https://www.openml.org/d/1565/
References
Detrano, R., Janosi, A., Steinbrunn, W., Pfisterer, M., Schmid, J.-J., Sandhu, S., Guppy, K. H., Lee, S., Froelicher, V. (1989) International application of a new probability algorithm for the diagnosis of coronary artery disease. The American Journal of Cardiology, 64, 304–310.
Vanschoren, J., van Rijn, J. N., Bischl, B., Torgo, L. (2013) OpenML: networked science in machine learning. SIGKDD Explorations, 15(2), 49–60.
Examples
data(hearth)
table(hearth$Class)
dim(hearth)
head(hearth)