com_item_missingness {dataquieR} | R Documentation |
Summarize missingness columnwise (in variable)
Description
Item-Missingness (also referred to as item nonresponse (De Leeuw et al. 2003)) describes the missingness of single values, e.g. blanks or empty data cells in a data set. Item-Missingness occurs, for example, when a respondent does not provide information for a certain question, a question is overlooked by accident, a programming failure occurs, or a provided answer was missed while entering the data.
Usage
com_item_missingness(
study_data,
meta_data,
resp_vars = NULL,
label_col,
show_causes = TRUE,
cause_label_df,
include_sysmiss = TRUE,
threshold_value,
suppressWarnings = FALSE,
assume_consistent_codes = TRUE,
expand_codes = assume_consistent_codes,
drop_levels = TRUE,
expected_observations = c("HIERARCHY", "ALL", "SEGMENT"),
pretty_print = lifecycle::deprecated()
)
Arguments
study_data |
data.frame the data frame that contains the measurements |
meta_data |
data.frame the data frame that contains metadata attributes of study data |
resp_vars |
variable list the name of the measurement variables |
label_col |
variable attribute the name of the column in the metadata with labels of variables |
show_causes |
logical if TRUE, then the distribution of missing codes is shown |
cause_label_df |
data.frame missing code table. If missing codes have labels the respective data frame can be specified here or in the metadata as assignments, see cause_label_df |
include_sysmiss |
logical Optional, if TRUE system missingness (NAs) is evaluated in the summary plot |
threshold_value |
numeric from=0 to=100. A numerical value ranging from 0 to 100 |
suppressWarnings |
logical warn about consistency issues with missing and jump lists |
assume_consistent_codes |
logical if TRUE and no labels are given and the same missing/jump code is used for more than one variable, the labels assigned for this code are treated as being the same for all variables. |
expand_codes |
logical if TRUE, code labels are copied from other variables, if the code is the same and the label is set somewhere |
drop_levels |
logical if TRUE, do not display unused missing codes in the figure legend. |
expected_observations |
enum HIERARCHY | ALL | SEGMENT. If ALL, all
observations are expected to comprise
all study segments. If SEGMENT, the
|
pretty_print |
logical deprecated. If you want to have a human
readable output, use |
Value
a list with:
- SummaryTable: data frame about item missingness per response variable
- SummaryData: data frame about item missingness per response variable formatted for user
- SummaryPlot: ggplot2 heatmap plot, if show_causes was TRUE
- ReportSummaryTable: data frame underlying SummaryPlot
ALGORITHM OF THIS IMPLEMENTATION:
- Lists of missing codes and, if applicable, jump codes are selected from the metadata
- The no. of system missings (NA) in each variable is calculated
- The no. of used missing codes is calculated for each variable
- The no. of used jump codes is calculated for each variable
- Two result dataframes (1: on the level of observations, 2: a summary for each variable) are generated
- OPTIONAL: if show_causes is selected, one summary plot for all resp_vars is provided
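The counting steps listed above can be illustrated with a minimal base R sketch. It is not the code used by dataquieR. The data frame, the variable names (sbp, dbp), the code values and the helper count_missingness are all hypothetical, and the code lists are written out by hand instead of being read from the metadata.

```r
# Hypothetical study data: NA stands for system missingness, the large
# numbers stand for missing codes and jump codes (assumed values).
study_data <- data.frame(
  sbp = c(120, NA, 99980, 135, 99990, 99980),
  dbp = c(80, 85, NA, NA, 99990, 78)
)

# Hypothetical code lists; in a real analysis these come from the metadata.
missing_codes <- c(99980, 99981)
jump_codes <- c(99990)

# Count system missings, used missing codes and used jump codes
# for a single variable. NA never matches a code in %in%.
count_missingness <- function(x) {
  c(
    sysmiss = sum(is.na(x)),
    missing_codes = sum(x %in% missing_codes),
    jump_codes = sum(x %in% jump_codes)
  )
}

# Apply the helper to every column and arrange one row per variable.
counts <- t(vapply(study_data, count_missingness, integer(3)))
summary_table <- data.frame(
  variable = rownames(counts),
  counts,
  n_observations = nrow(study_data),
  row.names = NULL
)

print(summary_table)
```

The sketch covers only the per-variable summary. The observation-level result and the optional summary plot are left out.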