| adult {liver} | R Documentation |
adult data set
Description
The adult dataset was collected from the US Census Bureau. The primary task is to predict whether a given adult makes more than $50K a year based on attributes such as education and hours of work per week. The target feature is income, a factor with levels "<=50K" and ">50K"; the remaining 14 variables are predictors.
Usage
data( adult )
Format
The adult dataset, as a data frame, contains 48598 rows and 15 columns (variables/features). The 15 variables are:
- age: age in years.
- workclass: a factor with 6 levels.
- demogweight: the demographics to describe a person.
- education: a factor with 16 levels.
- education.num: number of years of education.
- marital.status: a factor with 5 levels.
- occupation: a factor with 15 levels.
- relationship: a factor with 6 levels.
- race: a factor with 5 levels.
- gender: a factor with levels "Female" and "Male".
- capital.gain: capital gains.
- capital.loss: capital losses.
- hours.per.week: number of hours of work per week.
- native.country: a factor with 42 levels.
- income: yearly income as a factor with levels "<=50K" and ">50K".
Details
This dataset can be downloaded from the UCI machine learning repository:
http://www.cs.toronto.edu/~delve/data/adult/desc.html
A detailed description of the dataset can be found in the UCI documentation at:
http://www.cs.toronto.edu/~delve/data/adult/adultDetail.html
References
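Kohavi, R. (1996). Scaling up the accuracy of naive-Bayes classifiers: A decision-tree hybrid. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD-96), pp. 202-207.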
See Also
risk, churn, churnTel, bank, advertising, marketing, insurance, cereal, housePrice, house
Examples
data( adult )
str( adult )
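
# A minimal sketch of the prediction task described above, assuming the
# liver package is installed. The three predictors and the object names
# `fit` and `income_share` are illustrative choices, not a recommended model.
data( adult, package = "liver" )

# Class balance of the target feature
income_share = prop.table( table( adult $ income ) )
print( round( income_share, 3 ) )

# Logistic regression of income on a few numeric predictors;
# glm() models the probability of the second factor level of income
fit = glm( income ~ age + education.num + hours.per.week,
           data = adult, family = binomial )
summary( fit )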