NSFG_data {surveyCV} | R Documentation
Subset of the 2015-2017 National Survey of Family Growth (NSFG): one birth per respondent.
Description
We downloaded these data from the NSFG website and cleaned them following an approach posted to RPubs by Hunter Ratliff.
Usage
NSFG_data
Format
A data frame with 2801 rows and 17 variables:
- CASEID
Respondent ID number (per respondent, not per pregnancy)
- LBW
(originally LBW1) Low birthweight (TRUE/FALSE) for the 1st baby from this pregnancy
- PreMe
(recode of WKSGEST) Whether gestational age was premature (below 37 weeks) or full term
- gotPNcare
(recode of BGNPRENA) Whether or not respondent got prenatal care in first trimester (before 13 weeks)
- KnowPreg
(recode of KNEWPREG) Whether or not respondent learned she was pregnant by 6 weeks
- age
(originally AGECON) Age at time of conception
- income
(originally POVERTY) Income as percent of poverty level, so that 100 = income is at the poverty line; topcoded at 500
- YrEdu
(originally EDUCAT) Education (number of years of schooling)
- race
(originally HISPRACE) Race & Hispanic origin of respondent
- BMI
Body Mass Index
- PregNum
(originally PREGNUM) Respondent's total number of pregnancies
- eduCat
(originally HIEDUC) Highest completed year of school or highest degree received
- GA
(originally WKSGEST) Gestational length of completed pregnancy (in weeks)
- Wanted
(recode of NEWWANTR) Whether or not pregnancy came at right time according to respondent (rather than too soon, too late, or unwanted)
- wgt
(originally WGT2015_2017) Final weight for the 2015-2017 NSFG (at the respondent level, not pregnancy level)
- SECU
Randomized version of cluster ID, or "sampling error computational unit" – these are nested within strata
- strata
(originally SEST) Randomized version of stratum ID
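Several of the variables above are threshold recodes of raw NSFG fields. The following is a minimal sketch of how such recodes could be computed, not the code used to build `NSFG_data`. The toy values are made up, the cutoffs follow our reading of the descriptions above ("below 37 weeks", "before 13 weeks", "by 6 weeks"), and the sketch ignores any special missing-value codes in the raw files. The packaged columns may be stored differently (for example as factors).

```r
# Toy raw values (made up for illustration; not real NSFG records)
raw <- data.frame(
  WKSGEST  = c(35, 39, 41),  # gestational length in weeks
  BGNPRENA = c(8, 13, 20),   # weeks pregnant at first prenatal care
  KNEWPREG = c(4, 6, 9)      # weeks pregnant when respondent learned of pregnancy
)

# Threshold recodes, following the variable descriptions (assumed cutoffs)
recoded <- data.frame(
  GA        = raw$WKSGEST,
  PreMe     = raw$WKSGEST < 37,    # premature: below 37 weeks
  gotPNcare = raw$BGNPRENA < 13,   # prenatal care before 13 weeks
  KnowPreg  = raw$KNEWPREG <= 6    # learned of pregnancy by 6 weeks
)

print(recoded)
```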
Details
Note that these data were filtered down to include only:
- live births,
- with gestational ages below 45 weeks,
- born to mothers who were aged 20-40 years old at time of conception;
...then filtered further down to only the *first* such birth per respondent.
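The filtering steps above can be sketched in base R on a toy pregnancy-level table. This is only an illustration under stated assumptions: the columns `preg_order` and `live_birth` are hypothetical names, the rows are made up, and we assume "first" means the lowest pregnancy order among a respondent's eligible births. It is not the actual cleaning script.

```r
# Toy pregnancy-level records; preg_order and live_birth are hypothetical columns
preg <- data.frame(
  CASEID     = c(1, 1, 2, 2, 3, 4),
  preg_order = c(1, 2, 1, 2, 1, 1),
  live_birth = c(FALSE, TRUE, TRUE, TRUE, TRUE, TRUE),
  GA         = c(10, 39, 38, 40, 46, 36),  # gestational age in weeks
  age        = c(24, 26, 21, 23, 30, 41)   # age at conception
)

# Keep live births with GA below 45 weeks to mothers aged 20-40 at conception
keep <- preg$live_birth & preg$GA < 45 & preg$age >= 20 & preg$age <= 40
eligible <- preg[keep, ]

# Keep only the first eligible birth per respondent (assumed: lowest preg_order)
eligible <- eligible[order(eligible$CASEID, eligible$preg_order), ]
first_birth <- eligible[!duplicated(eligible$CASEID), ]

print(first_birth)
```

In this toy example, respondent 1's first pregnancy is dropped because it was not a live birth, so her second pregnancy is the one retained. Respondents 3 and 4 drop out entirely.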
Also note that SECUs (Sampling Error Computational Units) are effectively pseudo-PSUs (primary sampling units), nested within (pseudo-)strata. See page 35 of the NSFG 2011-2013 sample design documentation for details.
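As a sketch of how this nesting might be declared in practice, the design variables can be passed to `svydesign()` from the survey package. This assumes the survey and surveyCV packages are both installed. The object name `NSFG_design` is ours, and estimating mean `income` is just an arbitrary illustration.

```r
library(survey)
library(surveyCV)

# Stratified cluster design: SECUs act as pseudo-PSUs within pseudo-strata.
# nest = TRUE treats SECU IDs as nested within strata, so the same SECU
# label appearing in two strata counts as two distinct clusters.
NSFG_design <- svydesign(
  ids     = ~SECU,
  strata  = ~strata,
  weights = ~wgt,
  nest    = TRUE,
  data    = NSFG_data
)

# Design-based estimate of mean income (as percent of poverty level)
svymean(~income, NSFG_design)
```

Because `wgt` is a respondent-level weight and the data hold one birth per respondent, estimates from this design describe respondents rather than all pregnancies.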
Source
https://www.cdc.gov/nchs/nsfg/nsfg_2015_2017_puf.htm
https://rpubs.com/HunterRatliff1/NSFG_Wrangle
https://www.cdc.gov/nchs/data/nsfg/nsfg_2011_2013_sampledesign.pdf