health.retirement {fairml} | R Documentation
Health and Retirement Survey
Description
The University of Michigan Health and Retirement Study (HRS) longitudinal dataset.
Usage
data(health.retirement)
Format
The data contain 38653 observations and 27 variables.
Note
The data set has been minimally pre-processed: the redundant variables
HISPANIC and BITHYR were removed, along with the patient ID PID. A single
patient was recorded twice: the duplicate has been removed. However,
incomplete observations have been left in the data set.
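Since incomplete observations are kept, it can help to see how much is missing before deciding how to handle it. The following is a minimal sketch using base R only; `missing_summary` is a hypothetical helper name, and `toy` is a made-up stand-in so the snippet runs without the package installed.

```r
# Toy stand-in for health.retirement (hypothetical values, not real data).
toy = data.frame(
  score = c(0, 2, NA, 1),
  bmi = c(24.1, NA, 30.2, 27.5),
  gender = factor(c("Female", "Male", "Male", NA), levels = c("Female", "Male"))
)

# Hypothetical helper: count incomplete rows and missing values per column.
missing_summary = function(data) {
  list(
    incomplete_rows = sum(!complete.cases(data)),
    missing_per_column = colSums(is.na(data))
  )
}

res = missing_summary(toy)
print(res)
```

With the package loaded, the same helper could be applied as `missing_summary(health.retirement)`.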
The number of dependencies in daily activities, score, is the response
(count) variable; marriage, gender, race, race.ethnicity and age are the
sensitive attributes. The remaining variables are used as predictors.
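Because the response is described as a count, a quick sanity check before fitting a count model is to confirm that it holds only non-negative whole numbers. This is a rough sketch in base R; `is_count` is a hypothetical helper and `toy_score` is made-up data, not values from the survey.

```r
# Hypothetical helper: TRUE if x holds only non-negative whole numbers
# (missing values are ignored).
is_count = function(x) {
  x = x[!is.na(x)]
  is.numeric(x) && all(x >= 0) && all(x == round(x))
}

# Made-up stand-in for the score column.
toy_score = c(0, 1, 3, NA, 2)
print(is_count(toy_score))
```

With the package loaded, the check would read `is_count(health.retirement$score)`.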
The data contain the following variables:
- year, the year of retirement as a numeric variable;
- age, the age as a numeric variable;
- educa, the number of years in education as a numeric variable;
- networth, household net worth as a numeric variable;
- cognition_catnew, cognition assessment as a numeric variable;
- bmi, body mass index as a numeric variable;
- hlthrte, a numeric health rating;
- bloodp, blood pressure diagnosis as a numeric variable;
- diabetes, diabetes diagnosis as a numeric variable;
- cancer, cancer diagnosis as a numeric variable;
- lung, lung disease diagnosis as a numeric variable;
- heart, heart condition diagnosis as a numeric variable;
- stroke, stroke diagnosis as a numeric variable;
- pchiat, psychiatric condition diagnosis as a numeric variable;
- arthrit, arthritis diagnosis as a numeric variable;
- fall, recently falling as a numeric variable;
- pain, pain conditions as a numeric variable;
- A1c_adj, biomarker for hemoglobin A1C;
- CRP_adj, biomarker for C-reactive protein;
- CYSC_adj, biomarker for Cystatin C;
- HDL_adj, biomarker for HDL cholesterol;
- TC_adj, biomarker for total cholesterol;
- score, another numeric health rating;
- gender, a factor with levels "Female" and "Male";
- marriage, a factor with levels "Married/Partner" and "Not Married";
- race, a factor with levels "Black", "Other" and "White";
- race.ethnicity, a factor with levels "Hispanic", "NHB", "NHW" and "Other".
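The factor levels listed above can be checked programmatically after loading the data. The sketch below uses base R only; `check_levels` and `expected_levels` are hypothetical names, `toy` is a made-up stand-in, and the level order is assumed to match the order in which the levels are listed here.

```r
# Levels as listed in the variable description (order is an assumption).
expected_levels = list(
  gender = c("Female", "Male"),
  marriage = c("Married/Partner", "Not Married"),
  race = c("Black", "Other", "White"),
  race.ethnicity = c("Hispanic", "NHB", "NHW", "Other")
)

# Hypothetical helper: for each documented factor, TRUE if the column is a
# factor whose levels match the expected ones exactly.
check_levels = function(data, expected) {
  vapply(names(expected), function(v) {
    is.factor(data[[v]]) && identical(levels(data[[v]]), expected[[v]])
  }, logical(1))
}

# Made-up two-row stand-in for health.retirement.
toy = data.frame(
  gender = factor(c("Female", "Male")),
  marriage = factor(c("Not Married", "Married/Partner")),
  race = factor(c("White", "Black"), levels = c("Black", "Other", "White")),
  race.ethnicity = factor(c("NHW", "NHB"),
                          levels = c("Hispanic", "NHB", "NHW", "Other"))
)

ok = check_levels(toy, expected_levels)
print(ok)
```

With the package loaded, `check_levels(health.retirement, expected_levels)` would flag any factor whose levels differ from this listing.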
References
https://hrs.isr.umich.edu/about
Examples
library(fairml)
data(health.retirement)
# complete data analysis: drop the incomplete observations.
health.retirement = health.retirement[complete.cases(health.retirement), ]
# short-hand variable names.
r = health.retirement[, "score"]
s = health.retirement[, c("marriage", "gender", "race", "age")]
# r is a plain vector, so names(r) is NULL: exclude "score" by name instead.
p = health.retirement[, setdiff(names(health.retirement), c("score", names(s)))]
# drop the second race variable.
p = p[, colnames(p) != "race.ethnicity"]
## Not run:
# setting lambda = 0.1 helps the model estimation succeed.
m = fgrrm(response = r, sensitive = s, predictors = p,
          family = "poisson", unfairness = 0.05, lambda = 0.1)
summary(m)
## End(Not run)