na.test {misty} | R Documentation |
Little's Missing Completely at Random (MCAR) Test
Description
This function performs Little's Missing Completely at Random (MCAR) test.
Usage
na.test(..., data = NULL, digits = 2, p.digits = 3, as.na = NULL, write = NULL,
        append = TRUE, check = TRUE, output = TRUE)
Arguments
... |
a matrix or data frame with incomplete data, where missing
values are coded as NA. |
data |
a data frame when specifying one or more variables in the
argument '...'. |
digits |
an integer value indicating the number of decimal places to be used for displaying results. |
p.digits |
an integer value indicating the number of decimal places to be used for displaying the p-value. |
as.na |
a numeric vector indicating user-defined missing values, i.e. these values are converted to NA before conducting the analysis. |
write |
a character string naming a text file with file extension
|
append |
logical: if |
check |
logical: if |
output |
logical: if |
Details
Little (1988) proposed a multivariate test of Missing Completely at Random (MCAR)
that tests for mean differences on every variable in the data set across subgroups
that share the same missing data pattern by comparing the observed variable means
for each pattern of missing data with the expected population means estimated using
the expectation-maximization (EM) algorithm (i.e., EM maximum likelihood estimates).
The test statistic is the sum of the squared standardized differences between the
subsample means and the expected population means weighted by the estimated
variance-covariance matrix and the number of observations within each subgroup
(Enders, 2010). Under the null hypothesis that data are MCAR, the test statistic
asymptotically follows a chi-square distribution with \sum_j k_j - k degrees of
freedom, where k_j is the number of complete variables for missing data pattern j,
and k is the total number of variables. A statistically significant result provides
evidence against MCAR.
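As a rough illustration of the degrees of freedom described above, the following
base R sketch counts the missing data patterns in the built-in airquality data and
computes the sum of k_j minus k by hand. It does not call na.test and is not the
function's internal code; the object names (obs, pattern, k_j, df) are illustrative
only, and k_j is taken here to be the number of observed variables in pattern j.

```r
# Indicator matrix: TRUE where a value is observed, FALSE where it is NA
obs <- !is.na(airquality)

# One string per row identifying its missing data pattern, e.g. "011111"
pattern <- apply(obs, 1, function(r) paste(as.integer(r), collapse = ""))

# Number of observed variables k_j for each distinct pattern j
k_j <- tapply(rowSums(obs), pattern, unique)

# Degrees of freedom: sum of k_j over all patterns minus the number of variables k
df <- sum(k_j) - ncol(airquality)

length(k_j)  # number of distinct missing data patterns
df           # 14 for airquality (patterns with 6, 5, 5, and 4 observed variables)
```

A data set with a single missing data pattern, including fully complete data,
yields zero degrees of freedom under this formula, so there is nothing to test.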
Note that Little's MCAR test has a number of problems (see Enders, 2010). First,
the test does not identify the specific variables that violate MCAR, i.e., the test
does not identify potential correlates of missingness (i.e., auxiliary variables).
Second, the test assumes multivariate normality: under departures from normality
the test might be unreliable unless the sample size is large, and it is not suitable
for categorical variables. Third, the test investigates mean differences assuming
that the missing data patterns share a common covariance matrix, i.e., the test
cannot detect covariance-based deviations from MCAR stemming from a Missing at
Random (MAR) or Missing Not at Random (MNAR) mechanism because MAR and MNAR
mechanisms can also produce missing data subgroups with equal means. Fourth,
simulation studies suggest that Little's MCAR test suffers from low statistical
power, particularly when the number of variables that violate MCAR is small, the
relationship between the data and missingness is weak, or the data are MNAR
(Thoemmes & Enders, 2007). Fifth, the test can only reject, but cannot prove, the
MCAR assumption, i.e., a statistically non-significant result (failing to reject
the null hypothesis) does not prove that the data are MCAR. Finally, under the null
hypothesis the data are actually MCAR or MNAR, while a statistically significant
result indicates that missing data are MAR or MNAR, i.e., MNAR cannot be ruled out
regardless of the result of the test.
This function is based on the prelim.norm function in the norm package, which can
handle about 30 variables. With more than 30 variables specified in the argument
'...', the prelim.norm function might run into numerical problems, leading to
results that are not trustworthy. In this case it is recommended to reduce the
number of variables specified in the argument '...'. If the number of variables
cannot be reduced, it is recommended to use the LittleMCAR function in the
BaylorEdPsych package, which can deal with up to 50 variables. However, this
package was removed from the CRAN repository and needs to be obtained from the
archive along with the mvnmle package, which is needed for using the LittleMCAR
function. Note that the mcar_test function in the naniar package is also based on
the prelim.norm function, so its results are not trustworthy whenever the warning
message "In norm::prelim.norm(data) : NAs introduced by coercion to integer range"
is printed on the console.
Value
Returns an object of class misty.object, which is a list with the following
entries:
call |
function call |
type |
type of analysis |
data |
matrix or data frame specified in |
args |
specification of function arguments |
result |
result table |
Note
Code is adapted from the R function by Eric Stemmler: tinyurl.com/r-function-for-MCAR-test
Author(s)
Takuya Yanagida takuya.yanagida@univie.ac.at
References
Enders, C. K. (2010). Applied missing data analysis. Guilford Press.
Thoemmes, F., & Enders, C. K. (2007, April). A structural equation model for testing whether data are missing completely at random. Paper presented at the annual meeting of the American Educational Research Association, Chicago, IL.
Little, R. J. A. (1988). A test of Missing Completely at Random for multivariate data with missing values. Journal of the American Statistical Association, 83, 1198-1202. https://doi.org/10.2307/2290157
See Also
as.na, na.as, na.auxiliary, na.coverage, na.descript, na.indicator, na.pattern,
na.prop.
Examples
# Example 1a: Conduct Little's MCAR test
na.test(airquality)
# Example 1b: Alternative specification using the 'data' argument
na.test(., data = airquality)
## Not run:
# Example 2: Write results into a text file
na.test(airquality, write = "NA_Test.txt")
## End(Not run)