
doLTCF {simcausal}

R Documentation

Missing Variable Imputation with Last Time Point Value Carried Forward (LTCF)

Description

Forward imputation of missing variable values in simulated data after a particular end-of-follow-up event. The end-of-follow-up event is defined by a node declared with EFU=TRUE being equal to 1.

Usage

doLTCF(data, LTCF)

Arguments

`data`	Simulated `data.frame` in wide format
`LTCF`	Character string naming the outcome node that indicates the end of follow-up (a value of 1 for this outcome means that the end of follow-up has been reached). The outcome must be a binary node that was declared with `EFU=TRUE`.

Value

A modified data.frame in which all missing values of time-varying variables that follow the EFU outcome specified in LTCF are forward imputed with their last available non-missing value.
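
One way to see what an imputation step changed is to compare the data before and after it. The sketch below uses two small hand-built data frames as stand-ins for the input and the returned value; the column names (`L_1`, `L_2`, `L_3`) and the values are hypothetical and are not output of sim or doLTCF.

```r
# Hypothetical stand-ins for a data set before and after forward imputation.
before <- data.frame(ID = 1:2, L_1 = c(1, 0), L_2 = c(NA, 1), L_3 = c(NA, NA))
after  <- data.frame(ID = 1:2, L_1 = c(1, 0), L_2 = c(1, 1),  L_3 = c(1, 1))

# A cell counts as imputed if it was missing before and is observed afterwards.
filled <- is.na(before) & !is.na(after)

# Number of imputed cells per column.
n_imputed <- colSums(filled)
n_imputed
```

The same comparison should apply to any pair of data frames with identical dimensions and column order.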

Details

By default, the sim function sets to missing (i.e., NA) every node that temporally follows an EFU node whose simulated value is 1. The LTCF argument of the sim function changes this default by imputing some of these missing values with the last time point value carried forward (LTCF). More specifically, only the missing values of time-varying nodes (i.e., those with a non-missing t argument) that follow the end-of-follow-up event encoded by the EFU node named in the LTCF argument are imputed. The function doLTCF applies the same LTCF imputation to an existing dataset that was simulated by sim with its default setting (i.e., without the LTCF argument). Illustrations of the LTCF imputation functionality are provided in the package vignette.
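
The carry-forward idea described above can be sketched in base R on a toy wide-format data frame. This is only an illustration of the concept under stated assumptions, not the package's implementation: the helper `ltcf_node`, the `<node>_<t>` column naming, and the toy values are all hypothetical.

```r
# Toy wide-format data: two time-varying nodes (L and Y) at t = 0..3.
# Values after the first Y == 1 are NA, mimicking censoring after the event.
toy <- data.frame(
  ID  = 1:3,
  L_0 = c(0, 1, 0),   Y_0 = c(0, 0, 0),
  L_1 = c(1, 1, 0),   Y_1 = c(1, 0, 0),
  L_2 = c(NA, 0, 1),  Y_2 = c(NA, 1, 0),
  L_3 = c(NA, NA, 1), Y_3 = c(NA, NA, 0)
)

# Hypothetical helper: for one node, walk through its time-ordered columns and
# fill each missing value with the value from the previous time point, but only
# in rows where the end-of-follow-up event occurred.
ltcf_node <- function(data, node, times, event_rows) {
  cols <- paste(node, times, sep = "_")
  for (k in seq_along(cols)[-1]) {
    fill <- event_rows & is.na(data[[cols[k]]])
    data[[cols[k]]][fill] <- data[[cols[k - 1]]][fill]
  }
  data
}

times  <- 0:3
y_cols <- paste("Y", times, sep = "_")

# A row has reached the end of follow-up if any observed Y_t equals 1.
event_rows <- rowSums(toy[y_cols] == 1, na.rm = TRUE) > 0

imputed <- toy
for (node in c("L", "Y")) {
  imputed <- ltcf_node(imputed, node, times, event_rows)
}
imputed
```

Because the columns are processed in time order, a value filled at one time point is itself available to be carried into the next one, which is what makes the last observed value propagate to the end of follow-up.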

The first example below shows the default data format returned by the sim function after an end-of-follow-up event, and how to obtain LTCF-imputed data instead by using the LTCF argument of the sim function. The second example demonstrates how to use the doLTCF function to perform LTCF imputation on existing data that were simulated with the sim function under its default non-imputation behavior.

Examples

library(simcausal)

t_end <- 10
lDAG <- DAG.empty()
lDAG <- lDAG +
  # Baseline (t = 0) covariate and treatment, both fixed at 0
  node(name = "L2", t = 0, distr = "rconst", const = 0) +
  node(name = "A1", t = 0, distr = "rconst", const = 0) +
  # Time-varying covariate
  node(name = "L2", t = 1:t_end, distr = "rbern",
       prob = ifelse(A1[t - 1] == 1, 0.1,
              ifelse(L2[t - 1] == 1, 0.9,
                     min(1, 0.1 + t / t_end)))) +
  # Time-varying treatment; once treated, always treated
  node(name = "A1", t = 1:t_end, distr = "rbern",
       prob = ifelse(A1[t - 1] == 1, 1,
              ifelse(L2[0] == 0, 0.3,
              # NOTE: repeats the condition above, so this branch is never reached
              ifelse(L2[0] == 0, 0.1,
              ifelse(L2[0] == 1, 0.7, 0.5))))) +
  # Binary outcome that marks the end of follow-up (EFU) when equal to 1
  node(name = "Y", t = 1:t_end, distr = "rbern",
       prob = plogis(-6.5 + 4 * L2[t] + 0.05 * sum(I(L2[0:t] == rep(0, (t + 1))))),
       EFU = TRUE)
lDAG <- set.DAG(lDAG)
#---------------------------------------------------------------------------------------
# EXAMPLE 1. Default behavior (no forward imputation) vs. LTCF imputation within sim.
#---------------------------------------------------------------------------------------
# Default: nodes that follow Y = 1 are set to NA
Odat.wide <- sim(DAG = lDAG, n = 1000, rndseed = 123)
Odat.wide[c(21,47), 1:18]
# Same seed, with LTCF imputation requested through the LTCF argument
Odat.wideLTCF <- sim(DAG = lDAG, n = 1000, LTCF = "Y", rndseed = 123)
Odat.wideLTCF[c(21,47), 1:18]
#---------------------------------------------------------------------------------------
# EXAMPLE 2. Forward imputation of already simulated data with doLTCF.
#---------------------------------------------------------------------------------------
# Apply LTCF imputation to the non-imputed data from Example 1
Odat.wideLTCF2 <- doLTCF(data = Odat.wide, LTCF = "Y")
Odat.wideLTCF2[c(21,47), 1:18]
# Optional comparison with the data imputed directly by sim:
# all.equal(Odat.wideLTCF, Odat.wideLTCF2)

[Package simcausal version 0.5.6 Index]