drug.consumption {fairml}    R Documentation
Drug Consumption
Description
Predict drug consumption based on psychological scores and demographics.
Usage
data(drug.consumption)
Format
The data contains 1885 observations and 31 variables. See the UCI Machine Learning Repository for details.
Note
The data set has been minimally pre-processed following the instructions on the UCI Machine Learning Repository to re-encode the variables. Categorical variables are stored as factors and the psychological scores are stored as numeric variables on their original scales.
Any of the drug use variables can be used as the response variable (with 7 different levels); Age, Gender and Race are the sensitive attributes. The remaining variables are used as predictors.
The data contain the following variables:
- Age, a factor with 6 ten-year age brackets;
- Gender, a factor;
- Education, a factor with 9 levels from "Left school before 16" to "Doctorate degree";
- Country, a factor with 7 levels: "USA", "New Zealand", "Other", "Australia", "Republic of Ireland", "Canada" and "UK";
- Race, a factor with 7 levels, including mixed backgrounds;
- Nscore, Escore, Oscore, Ascore, Cscore, numeric scores from the five-factor model of personality traits;
- Impulsive, a numeric score for impulsivity;
- SS, a numeric score for sensation seeking;
- Alcohol, Amphet, Amyl, Benzos, Caff, Cannabis, Choc, Coke, Crack, Ecstasy, Heroin, Ketamine, Legalh, LSD, Meth, Mushrooms, Nicotine, Semer and VSA: factors with 7 levels ranging from "Never Used" to "Used in Last Day".
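As a quick sanity check of the variable list above, the following minimal sketch counts the levels of every factor column. The helper factor_level_counts() is hypothetical (it is not part of fairml), and the data are only loaded if the fairml package happens to be installed.

```r
# Hypothetical helper (not part of fairml): number of levels of each factor
# column in a data frame, returned as a named integer vector.
factor_level_counts = function(d) {
  is_fac = vapply(d, is.factor, logical(1))
  vapply(d[is_fac], nlevels, integer(1))
}

# Only touch the real data if the fairml package is available.
if (requireNamespace("fairml", quietly = TRUE)) {
  data(drug.consumption, package = "fairml")
  print(dim(drug.consumption))
  print(factor_level_counts(drug.consumption))
  # observed frequencies of one candidate response variable.
  print(table(drug.consumption$Cannabis))
}
```

Tabulating a drug use variable this way shows which of its 7 levels are sparsely populated before it is chosen as the response.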
References
UCI Machine Learning Repository.
https://archive-beta.ics.uci.edu/dataset/373/
Examples
data(drug.consumption)
# short-hand variable names.
r = drug.consumption[, "Meth"]
s = drug.consumption[, c("Age", "Gender", "Race")]
p = drug.consumption[, c("Education", "Nscore", "Escore", "Oscore", "Ascore",
                         "Cscore", "Impulsive", "SS")]
# collapse levels with low observed frequencies.
levels(p$Education) =
  c("at.most.18y", "at.most.18y", "at.most.18y", "at.most.18y", "university",
    "diploma", "bachelor", "master", "phd")
## Not run:
m = fgrrm(response = r, sensitive = s, predictors = p,
          family = "multinomial", unfairness = 0.05)
summary(m)
HH = drug.consumption$Heroin
levels(HH) = c("Never Used", "Used", "Used", "Used", "Used Recently",
               "Used Recently", "Used Recently")
m = fgrrm(response = HH, sensitive = s, predictors = p,
          family = "multinomial", unfairness = 0.05)
summary(m)
## End(Not run)
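The level-collapsing idiom used above for Education and Heroin can be tried on its own with base R. This is a small sketch on a toy factor: the labels "L2" to "L6" are placeholders standing in for the intermediate usage levels, not the actual level names in the data set.

```r
# Toy factor with seven ordered usage levels; "L2".."L6" are placeholder
# labels, not the level names stored in drug.consumption.
lv = c("Never Used", "L2", "L3", "L4", "L5", "L6", "Used in Last Day")
x = factor(c("Never Used", "L3", "L6", "Used in Last Day", "L2"), levels = lv)

# Assigning a vector with repeated names merges the corresponding levels,
# position by position.
levels(x) = c("Never Used", "Used", "Used", "Used", "Used Recently",
              "Used Recently", "Used Recently")

nlevels(x)  # 3
table(x)    # Never Used: 1, Used: 2, Used Recently: 2
```

The replacement vector must have one entry per original level, in the original level order, which is why the Education example lists 9 names and the Heroin example lists 7.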