MCAR {missCompare}  R Documentation

Missing data spike-in in MCAR pattern

Description

MCAR spikes in missingness using a missing-completely-at-random (MCAR) pattern.

Usage

MCAR(X_hat, MD_pattern, NA_fraction, min_PDM = 10)

Arguments

X_hat

Simulated matrix with no missingness (Simulated_matrix output from the simulate function)

MD_pattern

Missing data pattern in the original dataset (MD_Pattern output from the get_data function)

NA_fraction

Fraction of missingness in the original dataset (Fraction_missingness output from the get_data function)

min_PDM

Patterns with fewer observations than this number are excluded from missing data generation. Set this argument carefully: with very complicated missing data patterns, the function may fail or generate erroneous missing data patterns. The default is 10, but for large datasets the value needs to be set higher to avoid errors. Select a value based on the min_PDM_thresholds output from the get_data function.
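As a rough illustration of what a pattern-count threshold does, the base R sketch below counts how many rows share each missingness pattern and keeps only the patterns seen at least min_PDM times. It is not the missCompare implementation; the data frame dat and the helper names pattern, pattern_counts and kept_patterns are hypothetical.

```r
# Illustrative sketch only; not the missCompare source code.
# Hypothetical data with four distinct missingness patterns.
dat <- data.frame(
  x = c(1, NA, 3, 4, NA, 6, 7, 8),
  y = c(1, 2, NA, 4, 5, 6, NA, 8),
  z = c(1, 2, 3, 4, 5, 6, 7, NA)
)

# Encode each row's pattern as a string: "1" = observed, "0" = missing.
pattern <- apply(!is.na(dat), 1,
                 function(r) paste(as.integer(r), collapse = ""))

# Count how many rows share each pattern.
pattern_counts <- table(pattern)

# Keep only patterns observed at least min_PDM times.
min_PDM <- 2
kept_patterns <- names(pattern_counts)[pattern_counts >= min_PDM]
```

Here the pattern "110" occurs in a single row, so it falls below the threshold of 2 and is dropped, while the three more frequent patterns are kept.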

Details

This function takes the simulated matrix and, for each variable, generates missing data points completely at random. The fraction of missingness is set per variable, so any imbalance in missing data fractions between variables in the original data is retained. Note that after the missing data spike-in, the function removes rows with 100% missing data.
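The per-variable spike-in described above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's actual code: the function spike_in_mcar, the example matrix, and the fractions are all hypothetical stand-ins for X_hat and NA_fraction.

```r
# Illustrative sketch only; not the missCompare source code.
set.seed(42)

# Hypothetical complete matrix standing in for X_hat.
X_hat <- matrix(rnorm(200 * 3), nrow = 200, ncol = 3,
                dimnames = list(NULL, c("a", "b", "c")))

# Hypothetical per-variable missingness fractions standing in for NA_fraction.
NA_fraction <- c(a = 0.10, b = 0.30, c = 0.00)

spike_in_mcar <- function(X, fractions) {
  for (j in seq_len(ncol(X))) {
    # Number of cells to blank in column j; rows are chosen uniformly at random.
    n_missing <- round(fractions[j] * nrow(X))
    X[sample(nrow(X), n_missing), j] <- NA
  }
  # Drop rows in which every value is missing.
  X[rowSums(is.na(X)) < ncol(X), , drop = FALSE]
}

spiked <- spike_in_mcar(X_hat, NA_fraction)
missing_per_variable <- colSums(is.na(spiked))
```

Because each column is sampled independently with its own fraction, the imbalance between variables (10%, 30% and 0% in this example) carries over to the spiked matrix.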

Value

MCAR_matrix

Matrix with the pre-defined MCAR missingness pattern

Summary

Summary of MCAR_matrix, including the number of missing values per variable

Examples

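library(missCompare)

# Convert the -9 missingness code to NA and collect dataset metadata
cleaned <- clean(clindata_miss, missingness_coding = -9)
metadata <- get_data(cleaned)

# Simulate a complete matrix with the same dimensions and correlation structure
simulated <- simulate(rownum = metadata$Rows, colnum = metadata$Columns,
                      cormat = metadata$Corr_matrix)

# Spike in MCAR missingness
MCAR(simulated$Simulated_matrix,
     MD_pattern = metadata$MD_Pattern,
     NA_fraction = metadata$Fraction_missingness,
     min_PDM = 10)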


[Package missCompare version 1.0.3 Index]