smdi_na_indicator {smdi}    R Documentation

Create binary missing indicator variables by two different strategies

Description

This function takes a data frame and creates binary missing indicator variables. This can be done with one of two approaches:

Approach 1 (drop_NA_col = FALSE): creates a binary missing indicator variable for partially observed variables and retains both original and indicator variables.

Approach 2 (drop_NA_col = TRUE): creates a binary missing indicator variable for partially observed variables and only retains indicator variables (and drops the original variables).
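
The difference between the two approaches can be illustrated with a minimal base R sketch. This is not the package's implementation: `na_indicator_sketch()` and the `toy` data frame are hypothetical, and the sketch assumes that indicators are coded 1 = missing and 0 = observed and that fully observed columns are kept in both approaches.

```r
# Toy data: `biomarker` and `stage` are partially observed, `age` is complete.
toy <- data.frame(
  age       = c(54, 61, 47, 70),
  biomarker = c(1.2, NA, 0.8, NA),
  stage     = c("I", "II", NA, "IV")
)

# Hypothetical helper (not part of smdi) that mimics the two approaches.
na_indicator_sketch <- function(data, drop_NA_col = TRUE) {
  # Columns with at least one missing observation
  miss_cols <- names(data)[colSums(is.na(data)) > 0]

  # One integer indicator per partially observed column (assumed 1 = missing),
  # named with the "_NA" suffix
  indicators <- as.data.frame(
    lapply(data[miss_cols], function(x) as.integer(is.na(x)))
  )
  names(indicators) <- paste0(miss_cols, "_NA")

  out <- cbind(data, indicators)
  if (drop_NA_col) {
    # Approach 2: drop the original partially observed columns
    out <- out[, setdiff(names(out), miss_cols), drop = FALSE]
  }
  # Approach 1 (drop_NA_col = FALSE) returns originals plus indicators
  out
}

approach_1 <- na_indicator_sketch(toy, drop_NA_col = FALSE)
approach_2 <- na_indicator_sketch(toy, drop_NA_col = TRUE)

names(approach_1)
names(approach_2)
```

Approach 1 is useful when the original values are still needed, for example for imputation, while Approach 2 yields a data frame without NA in the affected columns.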

Important: Make sure your variables are formatted correctly and avoid including variables such as ID variables, ZIP codes, dates, etc.
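
One way to prepare a dataset along these lines is sketched below with dplyr. The `raw` tibble and all of its column names are made up for illustration; which columns to exclude and which types to convert depends on your own data.

```r
library(dplyr)

# Hypothetical raw data containing identifier-like columns
raw <- tibble(
  patient_id = c("A01", "A02", "A03"),
  zip_code   = c("02115", "10001", "94103"),
  index_date = as.Date(c("2020-01-05", "2020-02-11", "2020-03-20")),
  sex        = c("F", "M", NA),
  egfr       = c(88, NA, 61)
)

analytic <- raw %>%
  # Remove the ID, ZIP code and date columns before creating indicators
  select(-patient_id, -zip_code, -index_date) %>%
  # Store the categorical variable as a factor rather than as character
  mutate(sex = factor(sex))

glimpse(analytic)
```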

Usage

smdi_na_indicator(data = NULL, covar = NULL, drop_NA_col = TRUE)

Arguments

data

dataframe or tibble object with partially observed/missing variables

covar

character vector with the name(s) of one or more partially observed variables/columns to investigate. If NULL, the function automatically includes all columns with at least one missing observation.
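
A short sketch of restricting the function to selected columns via covar, based on the usage shown on this page. The `toy` tibble and its column names are hypothetical; only `biomarker` is passed to covar, so under the description above `stage` would not be investigated.

```r
library(smdi)
library(dplyr)

# Hypothetical data with two partially observed columns
toy <- tibble(
  age       = c(54, 61, 47, 70),
  biomarker = c(1.2, NA, 0.8, NA),
  stage     = factor(c("I", "II", NA, "IV"))
)

# Create an indicator for `biomarker` only and keep the original column
out <- toy %>%
  smdi_na_indicator(covar = "biomarker", drop_NA_col = FALSE)

glimpse(out)
```

Leaving covar at its default of NULL would instead consider every column that has at least one missing observation.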

drop_NA_col

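logical; if TRUE (default), the specified columns with NA are dropped and only their indicator variables are kept; if FALSE, those columns are retained alongside the indicators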

Value

Returns the data frame with missing indicator variables (indicator column names end in "_NA").

Examples

library(smdi)
library(dplyr)

smdi_data %>%
  smdi_na_indicator(drop_NA_col = FALSE) %>%
  glimpse()

smdi_data %>%
  smdi_na_indicator(drop_NA_col = TRUE) %>%
  glimpse()


[Package smdi version 0.3.0 Index]