MAR.data {MonteCarloSEM}    R Documentation

This function inserts missingness (Missing at Random - MAR) into the given data sets.

Description

Missing values (MAR) are added to the generated data sets (generated by the sim.skewed() or sim.normal() functions). Under MAR, the missingness on a variable is associated with the values of other variables in the data set, not with the variable itself. If the baseV parameter is not given, two different variables are selected at random from the data set, and the missing values on the selected item are assigned based on the mean of those two variables. For example, if the data has 8 items and the second item is to be assigned MAR values, two items among items 1, 3, 4, 5, 6, 7, and 8 are selected randomly, say items 5 and 7. The mean of these items is then calculated and the values are sorted. Then, based on the given percent of missingness, 90 percent of the missing values are selected from the top. The remaining 10 percent of the missing values are assigned from the rest of the variable. For example, if the sample size is 300 and 20 percent missingness is requested, the missing count is 300 x 20% = 60; with the 90/10 split, 54 of the missing values come from the top of the sorted values and 6 from the rest.

The missing values are shown as "NA" in the data files. The new data sets that contain missing values are saved as separate data files. In each data file, the first column shows the sample numbers. The second and following columns hold the actual data for each item. A file named "MAR_List.dat" is also created; it lists the names of the data sets that contain missing values. In addition, a file named "Model_MAR_relations.dat" shows which item was associated with which random items in the MAR calculation.
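The selection rule described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: it assumes "the top" means the rows with the largest base-item means, and that the remaining 10 percent is drawn at random from the other rows. All object names (dat, target, base, and so on) are hypothetical.

```r
# Hypothetical sketch of the MAR rule; NOT the code used by MAR.data().
set.seed(1)
n     <- 300                      # sample size
perct <- 20                       # percent of missingness
dat   <- as.data.frame(matrix(rnorm(n * 8), nrow = n))
names(dat) <- paste0("item", 1:8)

target <- 2                       # item that receives the missing values
base   <- c(5, 7)                 # items that drive the missingness

n.miss <- round(n * perct / 100)  # total number of missing values
n.top  <- round(n.miss * 0.9)     # 90 percent taken from the top
n.rest <- n.miss - n.top          # remaining 10 percent

# Sort rows by the mean of the base items (assumption: highest first).
base.mean <- rowMeans(dat[, base])
ord       <- order(base.mean, decreasing = TRUE)

top.rows  <- ord[seq_len(n.top)]
# Assumption: the rest is sampled at random from the remaining rows.
rest.rows <- sample(ord[-seq_len(n.top)], n.rest)

dat[c(top.rows, rest.rows), target] <- NA
```

Because most of the missing values fall on rows with high means on items 5 and 7, the missingness on item 2 depends on other observed variables rather than on item 2 itself.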

Usage

MAR.data(
  misg = NULL,
  baseV = NULL,
  perct = 10,
  dataList = "Data_List.dat",
  f.loc
)

Arguments

misg

A vector of 0s and 1s, one for each item. 0 indicates an item without missing values and 1 indicates an item that will have missing values. If misg is not given, all items are treated as having missing values.

baseV

A list of the items on which the MAR calculation will be based. It has to match the misg parameter. If it is not given, two random items (excluding the variable itself) will be selected and used to get MAR values for the given items.
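As an illustration of how baseV can be kept in step with misg, the sketch below builds the list from a more readable specification. The layout (one entry per item, NA for items without missingness, a 0/1 indicator vector otherwise) is inferred from the Examples section of this page; the helper object base.items is hypothetical and not part of the package.

```r
# Hypothetical helper objects; the baseV layout is inferred from the Examples.
misg <- c(1, 0, 1, 1, 0, 0, 0, 0)          # items 1, 3 and 4 get missing values

# For each item with missingness, the items its MAR values are based on.
base.items <- list(`1` = c(7, 8), `3` = c(6, 7), `4` = c(6, 7, 8))

n.items <- length(misg)
baseV   <- rep(list(NA), n.items)          # NA for items without missingness
for (i in names(base.items)) {
  ind <- rep(0, n.items)
  ind[base.items[[i]]] <- 1                # mark the base items with 1
  baseV[[as.integer(i)]] <- ind
}

# Consistency checks: one entry per item, and no item is based on itself.
stopifnot(length(baseV) == length(misg))
stopifnot(all(sapply(which(misg == 1), function(i) baseV[[i]][i] == 0)))
```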

perct

The percent of missingness. The default is 10 percent.

dataList

The name of the file that lists the data sets generated earlier, either with the package functions or with any other software. The default is "Data_List.dat".

f.loc

File location. It indicates where the simulated data sets and "dataList" are located.

Author(s)

Fatih Orcan

Examples


# Step 1: generate the data sets.

fc<-fcors.value(nf=3, cors=c(1,.5,.6,.5,1,.4,.6,.4,1))
fl<-loading.value(nf=3, fl.loads=c(.5,.5,.5,0,0,0,0,0,0,0,0,.6,.6,.6,0,0,0,0,0,0,0,0,.4,.4))
floc<-tempdir()
sim.normal(nd=10, ss=100, fcors=fc, loading=fl,  f.loc=floc)

# Step 2: add missing values to the generated data sets.

mis.items<-c(1,0,1,1,0,0,0,0)  # items 1, 3 and 4 receive missing values
bV<-list(c(0,0,0,0,0,0,1,1),NA,c(0,0,0,0,0,1,1,0),c(0,0,0,0,0,1,1,1), NA,NA,NA,NA)
dl<-"Data_List.dat"  # per the f.loc argument, expected in the folder given by f.loc
MAR.data(misg = mis.items, baseV=bV, perct = 20, dataList = dl, f.loc=floc )
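After the call, the two summary files named in the Description can be inspected with base R. The sketch below assumes they are written to the folder passed as f.loc, which this page does not state explicitly; it only prints the raw lines and makes no assumption about their internal format.

```r
# Assumption: the output files are written to the folder given by f.loc.
floc      <- tempdir()
mar.list  <- file.path(floc, "MAR_List.dat")
relations <- file.path(floc, "Model_MAR_relations.dat")

for (f in c(mar.list, relations)) {
  if (file.exists(f)) {
    cat("==", basename(f), "==\n")
    print(readLines(f))               # show the raw contents
  } else {
    cat(basename(f), "not found in", floc, "\n")
  }
}
```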

[Package MonteCarloSEM version 0.0.8 Index]