readData {synMicrodata}  R Documentation

Read the original datasets

Description

Reads the original input datasets from which the synthetic data model is learned. The input data may contain missing values; these are imputed from the posterior predictive distribution, so the synthetic data output contains no missing values.

Usage

readData(Y_input, X_input, RandomSeed = 99)

Arguments

Y_input

data.frame of the continuous variables of the original data. All columns should be numeric.

X_input

data.frame of the categorical variables of the original data. All columns should be factors.

RandomSeed

random seed number (default 99).

Value

readData returns an object of "readData_passed" class.

An object of class "readData_passed" is a list containing the following components:

n_sample

number of records in the input dataset.

p_Y

number of continuous variables.

Y_mat_std

matrix of standardized values of Y_input, with each variable scaled to mean 0 and standard deviation 1.

mean_Y_input

vector of means of the variables in the original Y_input.

sd_Y_input

vector of standard deviations of the variables in the original Y_input.

NA_Y_mat

matrix indicating missing values in Y_input.

p_X

number of categorical variables.

D_l_vec

number of levels of each categorical variable.

X_mat_std

matrix with the numeric-transformed values of X_input.

levels_X_input

list of levels of each categorical variable.

NA_X_mat

matrix indicating missing values in X_input.

var_names

list containing variable names of X_input and Y_input.

orig_data

original dataset.
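
To make the components above more concrete, the following base-R sketch computes rough analogues of several of them on a toy dataset. It is an illustration only, not the package's internal code: the toy data, the object names (mean_Y, Y_std, X_num, and so on) and the use of the sample standard deviation are assumptions made for the example.

```r
# Illustrative only: base-R analogues of some returned components.
# Toy inputs (hypothetical); one missing value in each data.frame.
Y_input <- data.frame(a = c(1, 2, NA, 4), b = c(10, 20, 30, 40))
X_input <- data.frame(g = factor(c("u", "v", NA, "u")))

# Continuous part: per-variable means and SDs, ignoring missing values.
mean_Y <- colMeans(Y_input, na.rm = TRUE)
sd_Y   <- apply(Y_input, 2, sd, na.rm = TRUE)

# Standardized matrix (mean 0, SD 1 per column) and missing-value indicator.
Y_std <- scale(Y_input, center = mean_Y, scale = sd_Y)
NA_Y  <- is.na(Y_input)

# Stored means and SDs allow mapping back to the original scale.
Y_back <- sweep(sweep(Y_std, 2, sd_Y, "*"), 2, mean_Y, "+")

# Categorical part: number of levels, integer codes, level labels,
# and missing-value indicator.
D_l   <- sapply(X_input, nlevels)
X_num <- sapply(X_input, as.integer)
lev_X <- lapply(X_input, levels)
NA_X  <- is.na(X_input)
```

Keeping the means, standard deviations and level labels alongside the transformed matrices is what makes it possible to report synthetic values on the original scale and with the original category labels.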

See Also

multipleSyn, createModel
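
Examples

A minimal sketch of a call, using the built-in iris data as a stand-in for an original dataset (the choice of iris and the object names are assumptions made for illustration). The inputs are split into a numeric data.frame and a factor data.frame as the Arguments section requires, and the call itself runs only if the package is installed.

```r
# Split the data into continuous and categorical parts.
Y_input <- iris[, 1:4]                       # numeric columns
X_input <- data.frame(Species = iris[, 5])   # factor column

# Check the documented input requirements before calling readData.
stopifnot(all(sapply(Y_input, is.numeric)),
          all(sapply(X_input, is.factor)))

# Run the call only when synMicrodata is available.
if (requireNamespace("synMicrodata", quietly = TRUE)) {
  dat_obj <- synMicrodata::readData(Y_input, X_input, RandomSeed = 99)
  print(class(dat_obj))   # class of the returned object
  print(names(dat_obj))   # components listed in the Value section
}
```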


[Package synMicrodata version 2.0.0 Index]