R: Completeness Heatmap

completeness_heatmap {eHDPrep}

R Documentation

Completeness Heatmap

Description

Produces a heatmap visualising completeness across a dataset.

Usage

completeness_heatmap(
  data,
  id_var,
  annotation_tbl = NULL,
  method = 1,
  show_rownames = FALSE,
  ...
)

Arguments

`data`	Data frame to be analysed.
`id_var`	Name of the row identifier variable (column) in `data`. It is passed unquoted in the Examples (for instance `patient_id`).
`annotation_tbl`	Data frame containing variable annotation data. Column 1 should contain variable names, column 2 should contain an annotation label.
`method`	Integer: 1, 2, or 3. Default: 1. See Details for more information.
`show_rownames`	Logical. Should row names be shown? Default: `FALSE`.
`...`	Parameters to be passed to `pheatmap`.

Details

Method 1: Missing values are numerically encoded with a highly negative number, numerically distant from all values in `data`, using `distant_neg_val`. Values in categorical variables are replaced with the number of unique values in the variable. Clustering uses these values. Cells are coloured by presence (yellow = missing; blue = present).
Method 2: Same as Method 1, but cells are coloured by the values used for clustering.
Method 3: Values in `data` are encoded as Boolean values for clustering (present values = 1; missing values = 0). Cells are coloured by presence (yellow = missing; blue = present).
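
The encodings described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the toy data frame, the `far_neg` formula (a hypothetical stand-in for `distant_neg_val`, whose actual formula is not given here), and the choice to count only non-missing unique values are all assumptions made for illustration.

```r
# Toy data; the column names are illustrative and not taken from eHDPrep.
toy <- data.frame(
  id    = c("a", "b", "c"),
  size  = c(2.5, NA, 4.0),
  stage = c("T1", "T2", NA),
  stringsAsFactors = FALSE
)
vals <- toy[, c("size", "stage")]
rownames(vals) <- toy$id

# Method 3 style: present values = 1, missing values = 0.
presence <- (!is.na(vals)) * 1L

# Method 1 style, step 1: replace categorical values with the number of
# unique values in the variable (assumption: missing values are not counted).
encoded <- vals
n_levels <- length(unique(stats::na.omit(vals$stage)))
encoded$stage <- ifelse(is.na(vals$stage), NA_real_, n_levels)
m <- as.matrix(encoded)

# Method 1 style, step 2: replace missing values with a number far below
# every observed value. This formula is an assumption, not distant_neg_val.
far_neg <- min(m, na.rm = TRUE) - 10 * diff(range(m, na.rm = TRUE))
m[is.na(m)] <- far_neg

# Either matrix can then be clustered, for example by rows.
row_clusters <- stats::hclust(stats::dist(m))
```

In this sketch `presence` is what a Method 3 style clustering would see, while `m` is the Method 1 and 2 style input, where missing cells sit far from all observed values and therefore dominate the distance between rows.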

Value

A completeness heatmap.

Note

See Examples for how to plot using `plot.new()`. This ensures a new plot is created for the heatmap.

References

Kolde R (2019). _pheatmap: Pretty Heatmaps_. R package version 1.0.12, <https://CRAN.R-project.org/package=pheatmap>.

Examples

library(eHDPrep)
data(example_data)

# heatmap without variable category annotations:
hm <- completeness_heatmap(example_data, patient_id)
plot.new() # ensure new plot is created
hm


# heatmap with variable category annotations:
## create a dataframe containing variable annotations
data_types <- tibble::tribble(
  ~var,             ~datatype,
  "patient_id",     "id",
  "tumoursize",     "numeric",
  "t_stage",        "ordinal_tstage",
  "n_stage",        "ordinal_nstage",
  "diabetes",       "factor",
  "diabetes_type",  "ordinal",
  "hypertension",   "factor",
  "rural_urban",    "factor",
  "marital_status", "factor",
  "SNP_a",          "genotype",
  "SNP_b",          "genotype",
  "free_text",      "freetext"
)

hm <- completeness_heatmap(example_data, patient_id, annotation_tbl = data_types)
plot.new() # ensure new plot is created
hm

[Package eHDPrep version 1.3.3 Index]