simulate {missCompare}    R Documentation

Simulation of matrix with no missingness

Description

simulate generates a clean matrix with no missingness that mirrors the structure of the original data. All simulated variables are normally distributed and share the same mean and standard deviation.

Usage

simulate(rownum, colnum, cormat, meanval = 0, sdval = 1)

Arguments

rownum

Number of rows (samples) in the original dataframe (Rows output from the get_data function)

colnum

Number of columns (variables) in the original dataframe (Columns output from the get_data function)

cormat

Correlation matrix of the original dataframe (Corr_matrix output from the get_data function)

meanval

Desired mean value for the simulated variables, default = 0

sdval

Desired standard deviation value for the simulated variables, default = 1

Details

This function requires the metadata from the original dataframe and simulates a matrix with no missingness that has the same number of rows and columns and the same, or a very similar, correlation matrix as the one observed in the original dataframe. When the correlation matrix is not positive definite, the nearPD function estimates the nearest positive definite matrix. The outputs of the function make it easy to compare the original correlation matrix with the nearPD correlation matrix. In the simulated matrix, all variables are normally distributed with a fixed mean and standard deviation. This matrix is subsequently used for spiking in missing values and for testing various missing data imputation algorithms.
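The procedure described above can be sketched in a few lines of R. This is a minimal illustration, not the package's actual implementation: the helper name simulate_sketch and the element names Original_correlation and NearPD_correlation are hypothetical, and it assumes that nearPD comes from the Matrix package and that multivariate normal draws are made with MASS::mvrnorm.

```r
library(Matrix)  # nearPD()
library(MASS)    # mvrnorm()

# Hypothetical helper that mirrors the documented arguments of simulate().
# It is an assumption about how such a simulation could be done.
simulate_sketch <- function(rownum, colnum, cormat, meanval = 0, sdval = 1) {
  stopifnot(nrow(cormat) == colnum, ncol(cormat) == colnum)

  # Project the input onto the nearest positive definite correlation matrix.
  pd_cor <- as.matrix(Matrix::nearPD(cormat, corr = TRUE)$mat)

  # With a common SD, the covariance is the correlation scaled by sdval^2.
  sigma <- pd_cor * sdval^2

  # Draw rownum observations of colnum jointly normal variables.
  sim <- MASS::mvrnorm(n = rownum, mu = rep(meanval, colnum), Sigma = sigma)

  list(Simulated_matrix = sim,
       Original_correlation = cormat,
       NearPD_correlation = pd_cor)
}

# Illustrative input: symmetric with unit diagonal, but not positive definite.
bad_cor <- matrix(c( 1.0, 0.9, -0.9,
                     0.9, 1.0,  0.9,
                    -0.9, 0.9,  1.0), nrow = 3, byrow = TRUE)

set.seed(1)
res <- simulate_sketch(rownum = 500, colnum = 3, cormat = bad_cor,
                       meanval = 10, sdval = 2)
dim(res$Simulated_matrix)
```

Passing corr = TRUE asks nearPD to return a matrix with a unit diagonal, so the adjusted result can still be read as a correlation matrix.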

Value

Simulated_matrix

Simulated matrix with no missingness. The simulated matrix resembles the original dataframe in size and correlation structure, but its variables are normally distributed with fixed means and SDs.

Original_correlation_sample

Sample of the original correlation structure (for comparison)

NearPD_correlation_sample

Sample of the nearPD (nearest positive definite matrix) correlation structure of the simulated matrix (for comparison)
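To show what a comparison between an original and a nearPD-adjusted correlation matrix might look like, here is a small self-contained sketch. The input matrix is made up for illustration, the variable names are hypothetical, and nearPD is assumed to be the function from the Matrix package. It does not use the return values of simulate itself.

```r
library(Matrix)  # nearPD()

# Illustrative symmetric matrix with unit diagonal that is not positive definite.
original <- matrix(c( 1.0, 0.9, -0.9,
                      0.9, 1.0,  0.9,
                     -0.9, 0.9,  1.0), nrow = 3, byrow = TRUE)

# Nearest positive definite correlation matrix.
adjusted <- as.matrix(Matrix::nearPD(original, corr = TRUE)$mat)

# Smallest eigenvalue before and after; a negative value means the
# matrix is not positive definite.
min_eig_original <- min(eigen(original, symmetric = TRUE)$values)
min_eig_adjusted <- min(eigen(adjusted, symmetric = TRUE)$values)

# Largest element-wise change introduced by the adjustment.
max_abs_change <- max(abs(original - adjusted))

# Print both matrices side by side for a visual check.
print(round(original, 2))
print(round(adjusted, 2))
```

A small max_abs_change suggests that the adjusted matrix stays close to the observed correlation structure. A large value indicates that the simulation is based on noticeably different correlations.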

Examples

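library(missCompare)
# Clean the example data, where -9 codes missing values
cleaned <- clean(clindata_miss, missingness_coding = -9)
# Extract the row and column counts and the correlation matrix
metadata <- get_data(cleaned)
# Simulate a complete matrix with the same size and correlation structure
simulated <- simulate(rownum = metadata$Rows, colnum = metadata$Columns,
                      cormat = metadata$Corr_matrix)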


[Package missCompare version 1.0.3 Index]