dhs {childfree} | R Documentation |
Read and recode UN Demographic and Health Surveys (DHS) individual data
Description
Read and recode UN Demographic and Health Surveys (DHS) individual data
Usage
dhs(files, extra.vars = NULL, progress = TRUE)
Arguments
files |
vector: a character vector containing the paths for one or more Individual Recode DHS data files (see details) |
extra.vars |
vector: a character vector containing the names of variables to be retained from the raw data |
progress |
boolean: display a progress bar |
Details
Since 1984, the Demographic and Health Surveys (DHS) program has regularly collected health data from population-representative samples in many countries using standardized surveys. The "individual recode" data files contain women's responses, while the "men recode" files contain men's responses. These files are available in SPSS, SAS, and Stata formats from https://www.dhsprogram.com/; however, access requires a free application. The dhs() function reads one or more of these files, extracts and recodes selected variables useful for studying childfree adults and other family statuses, then returns a single data frame.
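The file names used on this page (for example GAIR41FL and GAMR41FL) suggest that the third and fourth characters of a DHS file name identify the recode type. The sketch below relies on that assumption to label each path before it is passed to dhs(); dhs_file_type() is a hypothetical helper for illustration and is not part of the childfree package.

```r
# Assumption: characters 3-4 of a DHS file name give the file type,
# "IR" = individual (women's) recode, "MR" = men recode.
# dhs_file_type() is a hypothetical helper, not part of childfree.
dhs_file_type <- function(files) {
  code <- toupper(substr(basename(files), 3, 4))
  type <- c(IR = "women", MR = "men")[code]
  unname(type)  # NA marks a file whose type code is neither IR nor MR
}

# The third name carries a different type code, so it is labelled NA
files <- c("ZZIR62FL.SAV", "GAMR41FL.SAS7BDAT", "ZZHR62FL.SAV")
print(dhs_file_type(files))
```

A check like this can catch a mislabelled or unrelated file before a long read begins.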
Although access to DHS data requires an application, the DHS program provides model datasets for practice. The example provided below uses the model data file "ZZIR62FL.SAV", which contains fictitious women's data, but has the same structure as real DHS data files. The example can be run without prior application for data access.
Known issues
The SPSS-formatted files containing data from Gabon Recode 4 (GAIR41FL.SAV, GAMR41FL.SAV) and Turkey Recode 4 (TRIR41FL.SAV, TRMR41FL.SAV) contain encoding errors. Use the SAS-formatted files (GAIR41FL.SAS7BDAT, GAMR41FL.SAS7BDAT, TRIR41FL.SAS7BDAT, TRMR41FL.SAS7BDAT) instead.
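One way to apply this workaround is to swap the affected SPSS paths for their SAS counterparts before calling dhs(). The following is a minimal sketch; use_sas_if_bad() is a hypothetical helper, and it assumes the SAS file sits in the same directory as the SPSS file and has the same base name.

```r
# SPSS files listed above as having encoding errors
bad_spss <- c("GAIR41FL.SAV", "GAMR41FL.SAV", "TRIR41FL.SAV", "TRMR41FL.SAV")

# Hypothetical helper (not part of childfree): replace the extension of any
# known-bad SPSS file with .SAS7BDAT and leave all other paths unchanged.
use_sas_if_bad <- function(files) {
  bad <- toupper(basename(files)) %in% bad_spss
  files[bad] <- sub("\\.SAV$", ".SAS7BDAT", files[bad], ignore.case = TRUE)
  files
}

# "data/" is a placeholder directory
paths <- use_sas_if_bad(c("data/GAIR41FL.SAV", "data/ZZIR62FL.SAV"))
print(paths)
```

The resulting vector can then be supplied as the files argument.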
In some cases, DHS makes available individual recode data files for specific regions. For example, women's data from individual states in India from 1999 are contained in files named XXIR42FL.SAV, where the "XX" is a two-letter state code. The dhs() function has only been tested using whole-country files, and may not perform as expected for regional files.
Variables containing women's responses in the individual recode files begin with v, while variables containing men's responses in the men recode files begin with mv. When applying dhs() to both female and male data, these are automatically harmonized. However, if extra variables are requested using the extra.vars option, be sure to specify both names (e.g. extra.vars = c("v201", "mv201")).
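When several extra variables are requested from both women's and men's files, the paired names can be generated rather than typed by hand. This is a small sketch under the naming rule described above (men's names are the women's names prefixed with "m"); both_sexes() is a hypothetical helper and is not provided by childfree.

```r
# Hypothetical helper (not part of childfree): given women's variable names
# such as "v201", return them together with the men's equivalents ("mv201").
both_sexes <- function(women_vars) {
  c(women_vars, paste0("m", women_vars))
}

extra <- both_sexes("v201")
print(extra)

# The result could then be passed on, for example (file paths are placeholders):
# dat <- dhs(files = c("path/to/women_file.SAV", "path/to/men_file.SAV"),
#            extra.vars = extra)
```

This assumes that every requested women's variable has a men's counterpart with the same number, which may not hold for all variables.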
Value
A data frame containing the variables described in the codebook, which is available using vignette("codebooks").
Examples
# Read the DHS model women's file (fictitious data) and also retain raw variable v201
data <- dhs(files = c("ZZIR62FL.SAV"), extra.vars = c("v201"))