make_demo_data {labelr}    R Documentation

Construct a Fake Demographic Data Frame

Description

make_demo_data generates a data.frame of select (entirely fictional) "demographic" variables, purely for demonstrating or exploring common labelr behaviors and uses. It is not designed to accurately emulate or represent the frequencies of, or relationships among, demographic variables.

Usage

make_demo_data(
  n = 1000,
  age.mean = 43,
  age.sd = 15,
  gend.prob = c(0.45, 0.45, 0.045, 0.045, 0.01),
  raceth.prob = c(1/7, 1/7, 1/7, 1/7, 1/7, 1/7, 1/7),
  edu.prob = c(0.03, 0.32, 0.29, 0.24, 0.12),
  rownames = TRUE
)

Arguments

n

number of observations (rows) of hypothetical data set to create.

age.mean

mean value of (fictional) age variable (assuming a normal distribution) recorded in a hypothetical data set.

age.sd

standard deviation of (fictional) age variable (assuming a normal distribution) recorded in a hypothetical data set.

gend.prob

probabilities of the categories of a (fictional) gender identity variable recorded in a hypothetical data set (the default supplies five probabilities, one per category).

raceth.prob

probabilities of categories of a hypothetical race/ethnicity variable recorded in a hypothetical data set.

edu.prob

probabilities of categories of a hypothetical "highest level of education" variable recorded in a hypothetical data set.

rownames

if TRUE, create memorable but arbitrary row names for inspection.
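
To illustrate how arguments of this kind can drive a simulation, here is a minimal base R sketch. It is not labelr's implementation: the function name sketch_demo_data, the integer category codes, and all column names other than raceth (which appears in the Examples) are assumptions made for illustration only.

```r
# Hypothetical stand-in, NOT labelr's implementation: base R only.
sketch_demo_data <- function(n = 1000,
                             age.mean = 43,
                             age.sd = 15,
                             gend.prob = c(0.45, 0.45, 0.045, 0.045, 0.01),
                             raceth.prob = rep(1 / 7, 7),
                             edu.prob = c(0.03, 0.32, 0.29, 0.24, 0.12)) {
  # Age: draws from a normal distribution, rounded to whole years.
  age <- round(rnorm(n, mean = age.mean, sd = age.sd))

  # Categorical variables: integer codes 1..k sampled with the supplied
  # probabilities. The code values themselves are assumed for this sketch.
  draw <- function(prob) {
    sample(seq_along(prob), size = n, replace = TRUE, prob = prob)
  }

  data.frame(
    age = age,
    gender = draw(gend.prob),
    raceth = draw(raceth.prob),
    edu = draw(edu.prob)
  )
}

set.seed(555)
toy <- sketch_demo_data(n = 200)
str(toy)
# Observed shares should roughly track edu.prob
prop.table(table(toy$edu))
```

Each probability vector has one entry per category, so its length determines how many distinct codes can appear in the corresponding column.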

Value

a data.frame.

Examples

# make toy demographic (gender, race, etc.) data set
set.seed(555)
df <- make_demo_data(n = 1000)
df <- add_val_labs(df,
  vars = "raceth", vals = c(1:7),
  labs = c("White", "Black", "Hispanic", "Asian", "AIAN", "Multi", "Other"),
  max.unique.vals = 50
)
head(df)
summary(df)

[Package labelr version 0.1.5 Index]