generate.HHID {surveysd}    R Documentation

Generate new household ID for survey data with rotating panel design, taking into account split households

Description

Generate a new household ID for survey data using a household ID and a personal ID. In surveys with a rotating panel design that samples households, household members can move from an existing household to a new one that was not originally in the sample. This creates so-called split households. Using a personal ID (which stays fixed over the whole survey), an indicator for the different time steps, and a household ID, a new household ID is assigned to the original and the split household.
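To make the notion of a split household concrete, the following minimal sketch builds a toy panel in base R and flags persons who appear under more than one household ID. The data and the column names period, pid and hid are made up for illustration, and the check only mirrors the description above; it is not the package's implementation.

```r
# Toy panel: three persons observed in two periods (all values are made up).
# Person 2 leaves household 10 after 2010 and shows up in household 11.
toy <- data.frame(
  period = c(2010, 2010, 2010, 2011, 2011, 2011),
  pid    = c(1, 2, 3, 1, 2, 3),
  hid    = c(10, 10, 20, 10, 11, 20)
)

# Count the distinct household IDs each person is linked to over all periods.
n_hid <- tapply(toy$hid, toy$pid, function(x) length(unique(x)))

# Persons linked to more than one household ID point to a split household.
movers <- names(n_hid)[n_hid > 1]
print(movers)
```

Here only person 2 is flagged, because that person is recorded under two different household IDs.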

Usage

generate.HHID(dat, period = "RB010", pid = "RB030", hid = "DB030")

Arguments

dat

data table or data frame containing the survey data

period

column name of dat containing an indicator for the rotations, e.g. years, quarters, months, etc.

pid

column name of dat containing the personal identifier. This needs to be fixed for an individual throughout the whole survey

hid

column name of dat containing the household ID. This needs to be fixed for a household throughout the whole survey
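Since period, pid and hid are all column names of dat, a quick pre-check can catch a misspelled name before the call. The helper check_id_columns below is hypothetical and not part of surveysd; it only tests that the named columns exist and says nothing about whether the IDs are actually fixed over time.

```r
# Hypothetical helper (NOT part of surveysd): stop early if any of the
# columns named by period, pid or hid is missing from dat.
check_id_columns <- function(dat, period = "RB010", pid = "RB030",
                             hid = "DB030") {
  needed <- c(period, pid, hid)
  absent <- setdiff(needed, names(dat))
  if (length(absent) > 0) {
    stop("columns not found in dat: ", paste(absent, collapse = ", "))
  }
  invisible(TRUE)
}

# Toy data with lower-case column names (made up for illustration).
toy_cols <- data.frame(year = 2010, rb030 = 1, db030 = 10)

# Passes, because all three named columns exist.
ok <- check_id_columns(toy_cols, period = "year", pid = "rb030",
                       hid = "db030")

# Fails with the default names, which are not present in toy_cols.
res <- try(check_id_columns(toy_cols), silent = TRUE)
```

The defaults in the sketch simply copy those shown in the Usage section, so the failing call illustrates what happens when the data use different column names.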

Value

The survey data dat as a data.table object containing a new and an old household ID. The new household ID, which accounts for the split households, is now named hid, and the original household ID has a trailing "_orig".

Examples

## Not run: 
library(surveysd)
library(laeken)
library(data.table)

eusilc <- surveysd:::demo.eusilc(n=4)

# create split households
eusilc[,rb030split:=rb030]
year <- eusilc[,unique(year)]
year <- year[-1]
leaf_out <- c()
for(y in year) {
  split.person <- eusilc[year==(y-1)&!duplicated(db030)&!db030%in%leaf_out,
                         sample(rb030,20)]
  overwrite.person <- eusilc[year==(y)&!duplicated(db030)&!db030%in%leaf_out,
                             .(rb030=sample(rb030,20))]
  overwrite.person[,c("rb030split","year_curr"):=.(split.person,y)]

  eusilc[overwrite.person,
         rb030split:=i.rb030split,on=.(rb030,year>=year_curr)]
  leaf_out <- c(
    leaf_out,
    eusilc[rb030%in%c(overwrite.person$rb030,overwrite.person$rb030split),
    unique(db030)])
}

# pid which are in split households
eusilc[,.(uniqueN(db030)),by=list(rb030split)][V1>1]

eusilc.new <- generate.HHID(eusilc, period = "year", pid = "rb030split",
                            hid = "db030")

# no longer any split households in the data
eusilc.new[,.(uniqueN(db030)),by=list(rb030split)][V1>1]

## End(Not run)


[Package surveysd version 1.3.1 Index]