compare_info_content {eHDPrep} | R Documentation |
Quantifies the information loss, if any, that occurred when two discrete variables were merged into a single composite variable.
compare_info_content(input1, input2, composite)
input1 | Character vector. First variable to compare. |
input2 | Character vector. Second variable to compare. |
composite | Character vector. Composite variable resulting from merging input1 and input2. |
The function requires the two discrete variables which have been merged (input1 and input2) and the composite variable (composite). For each input, information content is calculated using information_content_discrete, along with each input's mutual information content with the composite variable using mi_content_discrete. The function returns a table describing these measures.
If the mutual information content between an input variable and the composite variable equals the information content of that input variable, all information in the input variable has been incorporated into the composite variable. However, if the mutual information of either input variable with the composite variable is lower than that input's own information content, information loss has occurred.
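The comparison described above can be illustrated with a minimal base-R sketch. It is not the eHDPrep implementation: the helper names entropy_bits and mutual_info_bits are hypothetical, and the choices to measure in bits and to treat NA as its own level are assumptions made for this illustration. The sketch uses Shannon entropy as the information content and the identity I(X; Y) = H(X) + H(Y) - H(X, Y) for mutual information.

```r
# Hypothetical helper: Shannon entropy (in bits) of a discrete vector.
# Assumption of this sketch: NA counts as its own level.
entropy_bits <- function(x) {
  p <- table(x, useNA = "ifany") / length(x)
  p <- p[p > 0]  # drop empty levels so 0 * log2(0) never occurs
  -sum(p * log2(p))
}

# Hypothetical helper: mutual information I(X; Y) = H(X) + H(Y) - H(X, Y).
# The joint variable is built by pasting values together, which assumes
# the separator "|" does not appear in the data.
mutual_info_bits <- function(x, y) {
  joint <- paste(x, y, sep = "|")
  entropy_bits(x) + entropy_bits(y) - entropy_bits(joint)
}

# Toy inputs (invented for illustration, not taken from example_data)
x1 <- c("yes", "yes", "no", "no", NA, NA)
x2 <- c("type1", "type2", NA, NA, NA, NA)

# Merge that prefers x2 and falls back to x1: every input value is recoverable
composite <- ifelse(is.na(x2), x1, x2)

# A merge that discards x2 entirely, so the type detail is lost
lossy <- x1

# No loss: mutual information with the composite equals each input's entropy
no_loss_x1 <- isTRUE(all.equal(mutual_info_bits(x1, composite), entropy_bits(x1)))
no_loss_x2 <- isTRUE(all.equal(mutual_info_bits(x2, composite), entropy_bits(x2)))

# Loss: mutual information with the lossy composite falls below x2's entropy
loss_x2 <- mutual_info_bits(x2, lossy) < entropy_bits(x2)

print(c(no_loss_x1 = no_loss_x1, no_loss_x2 = no_loss_x2, loss_x2 = loss_x2))
```

Because mutual information can never exceed an input's own entropy, any shortfall between the two values indicates that the composite no longer determines that input.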
Table containing information content for input1 and input2 and their mutual information content with composite.
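library(eHDPrep)
require(dplyr)
require(magrittr)

data(example_data)

# Merge the two diabetes variables, preferring diabetes_type where present
example_data %>%
  mutate(diabetes_merged = coalesce(diabetes_type, diabetes)) %>%
  select(starts_with("diabetes")) ->
  merged_data

# Compare each input's information content with its mutual information
# with the merged variable
compare_info_content(merged_data$diabetes,
                     merged_data$diabetes_type,
                     merged_data$diabetes_merged)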