.make_long_data_nested_cv {nlpred}

R Documentation

Worker function to make long form data set needed for CVTMLE targeting step when nested cv is used

Description

Worker function that builds the long-form data set needed for the CVTMLE targeting step when nested cross-validation is used.

Usage

.make_long_data_nested_cv(
  x,
  prediction_list,
  folds,
  gn,
  update = FALSE,
  epsilon_0 = 0,
  epsilon_1 = 0,
  tol = 0.001
)

Arguments

`x`	The outer validation fold
`prediction_list`	The full prediction list
`folds`	Vector of CV folds
`gn`	An estimate of the marginal distribution of Y
`update`	Logical indicating whether this is called for the initial construction of the long data set or as part of the targeting loop. If the former, cross-validated empirical "density" estimates are used. If the latter, these are derived from the targeted CDF.
`epsilon_0`	If `update = TRUE`, a vector of TMLE fluctuation parameter estimates used to add the CDF and PDF of Psi(X) to the data set
`epsilon_1`	As for `epsilon_0`: if `update = TRUE`, a vector of TMLE fluctuation parameter estimates used to add the CDF and PDF of Psi(X) to the data set
`tol`	A truncation level when taking logit transformations.
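
The role of `tol` can be illustrated with a minimal sketch. It assumes that truncation means clamping probabilities to the interval [tol, 1 - tol] before the logit is applied; the page does not say how nlpred does this internally. The helper name `trunc_logit` is hypothetical and not part of the package.

```r
# Hypothetical helper (not part of nlpred): clamp p to [tol, 1 - tol],
# then apply the logit transformation so the result is always finite.
trunc_logit <- function(p, tol = 0.001) {
  p_trunc <- pmin(pmax(p, tol), 1 - tol)
  stats::qlogis(p_trunc)
}

# Without truncation, qlogis(0) and qlogis(1) would be -Inf and Inf.
trunc_logit(c(0, 0.5, 1))
```

Under this assumption, CDF estimates of exactly 0 or 1 still yield a finite offset on the logit scale.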

Value

A long-form data list with a particular structure. Columns are named:

`id`	Observation identifier (multiple rows per observation in the validation sample)
`u`	If Yi = 0, the unique values of psi(x) in the inner validation samples, for psi fit on the inner training samples, for observations with Y = 1. If Yi = 1, the values of psi(x) in the inner validation samples, for psi fit on the inner training samples, for observations with Y = 0
`Yi`	This id's value of Y
`Fn`	Cross-validation estimated value of the CDF of psi(X) given Y = Yi in the training sample
`dFn`	Cross-validated estimate of the density of psi(X) given Y = (1-Yi) in the training sample
`psi`	The value of this observation's Psihat(P_n,B_n^0)
`gn`	Estimate of the marginal distribution of Y, e.g., computed in the whole sample
`outcome`	Indicator that psi(x) <= u
`logit_Fn`	The CDF estimate on the logit scale, needed as the offset in the targeting model
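
The long-form layout can be sketched for a single validation observation. This is an illustration only, not the package's implementation: the input values are made up, the variable names `psi_i` and `psi_other_class` are hypothetical, and only the `id`, `u`, `Yi`, `psi`, and `outcome` columns are constructed.

```r
# Toy inputs, made up for illustration only.
psi_i <- 0.4   # prediction for one validation observation
Yi    <- 1     # that observation's value of Y
# Hypothetical inner-validation predictions for observations with Y = 0.
psi_other_class <- c(0.2, 0.5, 0.5, 0.7)

# One long-form row per unique value of u.
u <- sort(unique(psi_other_class))

long_i <- data.frame(
  id      = 1L,
  u       = u,
  Yi      = Yi,
  psi     = psi_i,
  outcome = as.numeric(psi_i <= u)   # indicator that psi(x) <= u
)
long_i
```

Each observation contributes several rows sharing the same `id`, which is why the Value section describes `id` as appearing multiple times per observation.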

[Package nlpred version 1.0.1 Index]