simulate {missCompare} | R Documentation |
Simulation of matrix with no missingness
Description
simulate
simulates a clean matrix with no missingness, based on the structure of the original data,
in which all variables are normally distributed with the same mean and standard deviation.
Usage
simulate(rownum, colnum, cormat, meanval = 0, sdval = 1)
Arguments
rownum |
Number of rows (samples) in the original dataframe (Rows output from the |
colnum |
Number of columns (variables) in the original dataframe (Columns output from the |
cormat |
Correlation matrix of the original dataframe (Corr_matrix output from the |
meanval |
Desired mean value for the simulated variables, default = 0 |
sdval |
Desired standard deviation value for the simulated variables, default = 1 |
Details
This function takes the metadata of the original dataframe and simulates a matrix with no missingness that has the same number of rows and columns and the same, or a very similar, correlation matrix as the original dataframe. When the correlation matrix is not positive definite, the nearPD function estimates the closest positive definite matrix. The function's outputs make it easy to compare the original correlation matrix with the nearPD correlation matrix. In the simulated matrix, all variables are normally distributed with a fixed mean and standard deviation. The simulated matrix is subsequently used for spiking in missing values and for testing missing data imputation algorithms.
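The steps described above can be sketched in a few lines of R. This is an illustrative sketch, not the package's actual source: it assumes that nearPD comes from the Matrix package and that the multivariate normal draw is done with MASS::mvrnorm, and the input values (rownum, colnum, cormat) are hypothetical stand-ins for the metadata of a real dataframe.

```r
library(Matrix)  # nearPD()
library(MASS)    # mvrnorm()

set.seed(1)

# Hypothetical metadata; in practice these would come from the original dataframe
rownum <- 200
colnum <- 3
cormat <- matrix(c(1.0, 0.9, 0.2,
                   0.9, 1.0, 0.9,
                   0.2, 0.9, 1.0), nrow = 3)  # deliberately not positive definite

meanval <- 0
sdval <- 1

# Estimate the nearest positive definite correlation matrix
pd_cormat <- as.matrix(nearPD(cormat, corr = TRUE)$mat)

# Scale the correlation matrix to a covariance matrix with a common SD
sigma <- pd_cormat * sdval^2

# Draw normally distributed variables with the requested mean and covariance
sim <- mvrnorm(n = rownum, mu = rep(meanval, colnum), Sigma = sigma)

# Compare the original and the nearPD correlation structures
print(round(cormat, 2))
print(round(pd_cormat, 2))
```

Because the draw is random, the sample correlations of sim only approximate pd_cormat, and the approximation improves as rownum grows.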
Value
Simulated_matrix |
Simulated matrix with no missingness. The simulated matrix resembles the original dataframe in size and correlation structure, but has normally distributed variables with fixed means and SDs |
Original_correlation_sample |
Sample of the original correlation structure (for comparison) |
NearPD_correlation_sample |
Sample of the nearPD (nearest positive definite matrix) correlation structure of the simulated matrix (for comparison) |
Examples
cleaned <- clean(clindata_miss, missingness_coding = -9)
metadata <- get_data(cleaned)
simulated <- simulate(rownum = metadata$Rows,
                      colnum = metadata$Columns,
                      cormat = metadata$Corr_matrix)