R: Imputation Training Data

FI_train {FastImputation}

R Documentation

Imputation Training Data

Description

A larger simulated dataset, drawn from the same distribution as FI_test and FI_true, used to train the imputation algorithm. 5% of the values are missing. Intended for use with TrainFastImputation.
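
The size and missing share stated here can be checked after loading. The following is a minimal sketch that assumes the FastImputation package is installed; the block does nothing otherwise.

```r
# Assumption: FastImputation is installed; skipped if it is not.
if (requireNamespace("FastImputation", quietly = TRUE)) {
  data(FI_train, package = "FastImputation")

  # The Format section states 10000 observations of 9 variables.
  print(dim(FI_train))

  # Share of missing values per column, then over the whole data frame.
  print(round(colMeans(is.na(FI_train)), 3))
  print(mean(is.na(FI_train)))
}
```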

Usage

data(FI_train)

Format

A data frame with 9 variables and 10000 observations.

user_id_1: Sequential user ids
bounded_below_2: Multivariate normal, transformed using exp(x)
unbounded_3: Multivariate normal
unbounded_4: Multivariate normal
bounded_above_5: Multivariate normal, transformed using -exp(x)
bounded_above_and_below_6: Multivariate normal, transformed using pnorm(x)
unbounded_7: Multivariate normal
unbounded_8: Multivariate normal
categorical_9: "A" if the first of three multivariate normal draws is greatest; "B" if the second is greatest; "C" if the third is greatest
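
The bounds implied by the column names can be passed to TrainFastImputation as constraints. The sketch below assumes the FastImputation package is installed and that the argument names (constraints, idvars, categorical) match the installed version; check ?TrainFastImputation before relying on them.

```r
# Assumption: FastImputation is installed and TrainFastImputation accepts
# constraints, idvars, and categorical as named here.
if (requireNamespace("FastImputation", quietly = TRUE)) {
  data(FI_train, package = "FastImputation")

  patterns <- FastImputation::TrainFastImputation(
    FI_train,
    # Bounds follow the transformations listed above:
    # exp(x) > 0, -exp(x) < 0, and 0 < pnorm(x) < 1.
    constraints = list(
      list("bounded_below_2", list(lower = 0)),
      list("bounded_above_5", list(upper = 0)),
      list("bounded_above_and_below_6", list(lower = 0, upper = 1))
    ),
    idvars = "user_id_1",          # id column, not imputed
    categorical = "categorical_9"  # three-level categorical column
  )
}
```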

Author(s)

Stephen R. Haptonstahl <srh@haptonstahl.org>

Source

All columns other than user_id_1 start as multivariate normal draws. Columns 2, 5, and 6 are then transformed as listed under Format. Column 9 is built from three multivariate normal columns treated as a one-hot encoding of a three-valued categorical variable: the category corresponds to the largest of the three draws.
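
The construction described above can be imitated in base R. This is an illustrative sketch only, not the script that produced FI_train: the seed, the covariance matrix Sigma, the sample size, and the missing-completely-at-random mechanism applied to the non-id columns are all assumptions.

```r
# Hypothetical re-creation of the generating process. The seed, covariance
# matrix, and missingness mechanism are assumptions, not the package's values.
set.seed(1)
n <- 1000  # FI_train itself has 10000 rows

# Ten latent normal columns: seven for columns 2-8 and three for column 9.
k <- 10
Sigma <- 0.5 ^ abs(outer(seq_len(k), seq_len(k), "-"))  # assumed covariance
z <- matrix(rnorm(n * k), nrow = n) %*% chol(Sigma)     # rows ~ N(0, Sigma)

sim <- data.frame(
  user_id_1                 = seq_len(n),
  bounded_below_2           = exp(z[, 1]),    # strictly positive
  unbounded_3               = z[, 2],
  unbounded_4               = z[, 3],
  bounded_above_5           = -exp(z[, 4]),   # strictly negative
  bounded_above_and_below_6 = pnorm(z[, 5]),  # between 0 and 1
  unbounded_7               = z[, 6],
  unbounded_8               = z[, 7],
  # Category is the position of the largest of three latent draws.
  categorical_9 = factor(
    c("A", "B", "C")[max.col(z[, 8:10], ties.method = "first")],
    levels = c("A", "B", "C")
  )
)

# Blank out 5% of the non-id cells completely at random (assumed mechanism).
cells <- expand.grid(row = seq_len(n), col = 2:9)
drop <- cells[sample(nrow(cells), size = round(0.05 * nrow(cells))), ]
for (j in unique(drop$col)) {
  sim[drop$row[drop$col == j], j] <- NA
}
```

Drawing all latent columns jointly before transforming them keeps the correlation structure that an imputation method can exploit; independent draws would leave nothing for the other columns to predict.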

[Package FastImputation version 2.2.1 Index]