tobacco {summarytools} | R Documentation
Tobacco Use and Health - Simulated Dataset
Description
A simulated dataset of 1,000 subjects. Its variables are described under Details.
Usage
data(tobacco)
Format
A data frame with 1000 rows and 9 variables
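As a quick check of the format stated above, the dataset can be loaded and inspected with base R. This is a minimal sketch that assumes the summarytools package is installed; the requireNamespace() guard is only there so the snippet does nothing when the package is missing.

```r
# Load the dataset without attaching the package
# (assumption: summarytools is installed).
if (requireNamespace("summarytools", quietly = TRUE)) {
  data("tobacco", package = "summarytools")

  # Documented format: a data frame with 1000 rows and 9 variables
  print(is.data.frame(tobacco))
  print(dim(tobacco))

  # Column names and types, to compare against the Details section
  str(tobacco)
}
```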
Details
gender Factor with 2 levels: “F” and “M”, having roughly 500 of each.
age Numerical.
age.gr Factor with 4 age categories.
BMI Body Mass Index (numerical).
smoker Factor (“Yes” / “No”).
cigs.per.day Number of cigarettes smoked per day (numerical).
diseased Factor (“Yes” / “No”).
disease Character.
samp.wgts Sampling weights (numerical).
A note on simulation: the probability that an individual falls into the “diseased” category is based on an arbitrary function of age, BMI, and number of cigarettes smoked per day.
A copy of this dataset is also available in French under the name “tabagisme”.
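The samp.wgts variable described in Details can be used to weight a categorical summary. The sketch below is illustrative only: weighted_share() is a hypothetical helper written for this example (it is not part of summarytools), and the snippet assumes the package is installed so that the data can be loaded.

```r
# Hypothetical helper (not part of summarytools):
# weighted share of each observed level of x, using weights w.
weighted_share <- function(x, w) {
  keep <- !is.na(x) & !is.na(w)                   # drop rows missing either value
  totals <- tapply(w[keep], factor(x[keep]), sum) # sum of weights per observed level
  totals / sum(totals)                            # normalise so shares sum to 1
}

if (requireNamespace("summarytools", quietly = TRUE)) {
  data("tobacco", package = "summarytools")

  # Weighted share of smokers vs. non-smokers
  print(weighted_share(tobacco$smoker, tobacco$samp.wgts))

  # Unweighted proportions, for comparison
  print(prop.table(table(tobacco$smoker)))
}
```

The freq() function in summarytools also accepts a weights argument, which may be more convenient than a hand-written helper when a formatted frequency table is wanted.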
[Package summarytools version 1.0.1 Index]