ggmca {ggfacto}    R Documentation

Readable and Interactive graph for multiple correspondence analysis

Description

A readable, complete and beautiful graph for multiple correspondence analysis made with FactoMineR::MCA. Interactive tooltips, which appear when hovering near points with the mouse, keep many important data in view (tables of active variables, and additional chosen variables) while reading the graph. Profiles of answers (from the graph of "individuals") are drawn in the background, and can be linked to FactoMineR::HCPC classes. Since it is made in the spirit of ggplot2, it is possible to change the theme or add other plot elements with +. Interactive tooltips won't appear until you pass the result through ggi. Step-by-step functions: use ggmca_data to get the data frames with every parameter of an MCA plot, modify them, then pass them to ggmca_plot to draw the graph.

Usage

ggmca(
  res.mca,
  dat,
  sup_vars,
  active_tables,
  tooltip_vars_1lv,
  tooltip_vars,
  axes = c(1, 2),
  axes_names = NULL,
  axes_reverse = NULL,
  type = c("text", "labels", "points", "numbers", "facets"),
  color_groups = "^.{0}",
  cah_color_groups = "^.+$",
  keep_levels,
  discard_levels,
  cleannames = TRUE,
  profiles = FALSE,
  profiles_tooltip_discard = "^Not |^No |^Pas |^Non ",
  cah,
  max_profiles = 5000,
  alpha_profiles = 0.7,
  color_profiles = TRUE,
  base_profiles_color = "#aaaaaa",
  text_repel = FALSE,
  title,
  actives_in_bold = NULL,
  sup_in_italic = FALSE,
  ellipses = NULL,
  xlim,
  ylim,
  out_lims_move = FALSE,
  shift_colors = 0,
  colornames_recode,
  scale_color_light = material_colors_light(),
  scale_color_dark = material_colors_dark(),
  text_size = 3.5,
  size_scale_max = 4,
  dist_labels = c("auto", 0.04),
  right_margin = 0,
  use_theme = TRUE,
  get_data = FALSE
)

ggmca_data(
  res.mca,
  dat,
  sup_vars,
  active_tables,
  tooltip_vars_1lv,
  tooltip_vars,
  color_groups = "^.{0}",
  cah_color_groups = "^.+$",
  keep_levels,
  discard_levels,
  cleannames = TRUE,
  profiles = FALSE,
  profiles_tooltip_discard = "^Pas |^Non |^Not |^No ",
  cah,
  max_profiles = 5000
)

ggmca_plot(
  data,
  axes = c(1, 2),
  axes_names = NULL,
  axes_reverse = NULL,
  type = c("text", "points", "labels", "active_vars_only", "numbers", "facets"),
  text_repel = FALSE,
  title,
  ellipses = NULL,
  actives_in_bold = NULL,
  sup_in_italic = FALSE,
  xlim,
  ylim,
  out_lims_move = FALSE,
  color_profiles = TRUE,
  base_profiles_color = "#aaaaaa",
  alpha_profiles = 0.7,
  shift_colors = 0,
  colornames_recode,
  scale_color_light = material_colors_light(),
  scale_color_dark = material_colors_dark(),
  text_size = 3.5,
  size_scale_max = 4,
  dist_labels = c("auto", 0.04),
  right_margin = 0,
  use_theme = TRUE,
  get_data = FALSE
)

Arguments

res.mca

An object created with FactoMineR::MCA.

dat

The data in which to find the supplementary variables, etc.

sup_vars

A character vector of supplementary qualitative variables to print (they don't need to be passed to MCA beforehand).

active_tables

Should colored crosstables be added in interactive tooltips? 'active_tables = "sup"' crosses each 'sup_vars' with active variables. 'active_tables = "active"' crosses each active variable with the other ones, giving results closely related to the Burt table used to calculate the multiple correspondence analysis. It may take time to calculate with many variables. 'active_tables = c("active", "sup")' does both. In tooltips, percentages are colored in blue when the spread from the mean is positive (over-representation), and in red when it is negative (under-representation), like in tab with 'color = "diff"'.

tooltip_vars_1lv

A character vector of variables, whose first level (if character/factor) or weighted mean (if numeric) will be added at the top of interactive tooltips.

tooltip_vars

A character vector of variables (character/factors), whose complete levels will be added at the bottom of interactive tooltips.

axes

The axes to print, as a numeric vector of length 2.

axes_names

Names of all the axes (not just the two selected ones), as a character vector.

axes_reverse

Reverses the coordinates of the axes, given as a numeric vector: '1' to invert left and right; '2' to invert up and down; '1:2' to invert both.

type

Determines the way sup_vars are printed.

  • "text" : colored text

  • "points" : colored points with text legends

  • "labels" : colored labels

  • "active_vars_only" : no sup_vars

  • "numbers" : colored labels of prefix numbers, with small names

  • "facets" : one graph of profiles of answers for each level of the first sup_vars. A different color is used for each.

color_groups

By default, there is one color group for all the levels of each 'sup_vars'. It is possible to color 'sup_vars' with groups built from their levels with str_extract and regexes. For example, 'color_groups = "^."' builds the groups from the first character of each level (useful when they begin with numbers). 'color_groups = "^.{3}"' uses the first three characters. 'color_groups = "NB.+$"' takes everything from '"NB"' to the end of the level names, etc.
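
As a rough illustration of how such patterns partition levels, the sketch below mimics str_extract with base R only. The level names and the helper extract_first are hypothetical, and this is not the package's internal code: it only shows which grouping key each regex would produce.

```r
# Hypothetical level names of a supplementary variable (illustration only).
levels_sup <- c("1-Farmers", "2-Craftsmen", "3-Executives", "31-Liberal professions")

# Base R stand-in for stringr::str_extract(): first match, or NA when none.
extract_first <- function(x, pattern) {
  m   <- regexpr(pattern, x, perl = TRUE)
  hit <- !is.na(m) & m != -1
  out <- rep(NA_character_, length(x))
  out[hit] <- regmatches(x, m)
  out
}

# Default pattern: matches an empty string, so every level gets the same key
# and the whole variable forms a single color group.
groups_default <- extract_first(levels_sup, "^.{0}")

# First character: levels "3-..." and "31-..." end up in the same group.
groups_first <- extract_first(levels_sup, "^.")

# Levels sharing a key would share a color.
split(levels_sup, groups_first)
```

With these hypothetical levels, "^." yields three groups, keyed "1", "2" and "3".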

cah_color_groups

Color groups for the 'cah' variable (HCPC clusters).

keep_levels

A character vector of variable levels to keep: others will be discarded.

discard_levels

A character vector of variable levels to discard.

cleannames

Set to TRUE to clean levels names, by removing prefix numbers like "1-", and text in parentheses.
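
The kind of cleaning described here can be sketched with two regular expressions in base R. The helper clean_level, its patterns and the sample levels are assumptions made for illustration; the patterns actually used by the package may differ.

```r
# Hypothetical helper: drop a leading "<number>-" prefix and any text in parentheses.
clean_level <- function(x) {
  x <- sub("^[0-9]+-\\s*", "", x)        # remove prefix numbers like "1-"
  x <- gsub("\\s*\\([^)]*\\)", "", x)    # remove parenthesised text
  trimws(x)
}

cleaned <- clean_level(c("1-Tea shop (urban)", "12-Chain store", "Other"))
cleaned
```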

profiles

When set to TRUE, profiles of answers are drawn in the background of the graph with light-grey points. When hovering with the mouse in the interactive version (passed through ggi), the answers of individuals to active variables will appear. If cah is provided, hovering near one point will color all the points of the same HCPC class.

profiles_tooltip_discard

A regex pattern to remove useless levels among interactive tooltips for profiles of answers (ex. : levels expressing "no" answers).
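
To preview which levels a pattern would discard, it can be tried against a vector of level names with grepl. The level names below are hypothetical; the pattern is the default shown in Usage for ggmca.

```r
# Default pattern from the Usage section of ggmca.
discard_pattern <- "^Not |^No |^Pas |^Non "

# Hypothetical level names (illustration only).
lv <- c("Not sugar", "sugar", "No lemon", "Pas de lait", "Nonfat milk")

# TRUE marks the levels that would be dropped from the profile tooltips.
dropped <- grepl(discard_pattern, lv)
lv[!dropped]
```

Each alternative ends with a space, so a level such as "Nonfat milk" is kept: the pattern only targets negation words followed by another word.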

cah

An HCPC clusters variable made with HCPC on 'res.mca', used to link the answer-profile points that share the same HCPC class (they will be given the same color and linked on mouse hover).

max_profiles

The maximum number of profile points to print. Defaults to 5000.

alpha_profiles

The alpha (transparency, between 0 and 1) for profiles of answers.

color_profiles

By default, if cah is provided, profiles are colored based on cah levels (HCPC clusters). Set to FALSE to avoid this behaviour. You can also give a character vector with only some of the levels of the 'cah' variable.

base_profiles_color

The base color for answer profiles. Defaults to gray. Set to 'NULL' to discard profiles. With 'color_profiles', set to 'NULL' to discard the non-colored profiles.

text_repel

When TRUE, the graph is no longer interactive, but the resulting image is better suited to printing because points and labels don't overlap. It uses ggrepel::geom_text_repel.

title

The title of the graph.

actives_in_bold

Set to 'TRUE' to set active variables in bold font (and sup variables in plain).

sup_in_italic

Set to 'TRUE' to set sup variables in italics.

ellipses

Set to a number between 0 and 1 to draw a concentration ellipse for each level of the first sup_vars. 0.95 draws ellipses containing 95% of the individuals of each category. 0.5 draws median ellipses, containing half the individuals of each category. Note that, if 'max_profiles' is provided, ellipses won't be made with all individuals.

xlim, ylim

Horizontal and vertical axes limits, as double vectors of length 2.

out_lims_move

When TRUE, the points outside xlim or ylim are not removed, but moved to the edges of the graph.
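
The difference between removing and moving out-of-limits points can be sketched on a single axis with base R. The coordinates are hypothetical, and treating NA as "no bound on that side" is an assumption borrowed from the ggplot2 convention (the Examples use ylim = c(NA, 1.2)); this is not the package's internal code.

```r
# Hypothetical coordinates on one axis and limits in the c(min, max) form.
x    <- c(-2.1, -0.3, 0.8, 1.9)
xlim <- c(NA, 1.5)   # NA = no bound on that side (assumed convention)

# Replace NA bounds by infinite ones so comparisons stay well-defined.
lims <- c(if (is.na(xlim[1])) -Inf else xlim[1],
          if (is.na(xlim[2]))  Inf else xlim[2])

# out_lims_move = FALSE: points outside the limits are dropped.
x_removed <- x[x >= lims[1] & x <= lims[2]]

# out_lims_move = TRUE: points outside the limits are clamped to the nearest edge.
x_moved <- pmin(pmax(x, lims[1]), lims[2])
```

Here the point at 1.9 disappears in the first case and is drawn at 1.5 in the second.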

shift_colors

Change colors of the sup_vars points.

colornames_recode

A named character vector with fct_recode style to rename the levels of the color variable if needed (levels used for colors are printed in console message whenever the function is used).

scale_color_light

A color scale for sup vars points.

scale_color_dark

A color scale for sup vars texts.

text_size

Size of text.

size_scale_max

Size of points.

dist_labels

When 'type = "points"', the distance of labels from points.

right_margin

A margin at the right, in cm. Useful to read tooltips over points placed at the right of the graph without formatting problems.

use_theme

By default, a specific ggplot2 theme is used. Set to FALSE to customize your own theme.

get_data

Set to TRUE to return the data frame used to create the plot instead of the plot itself.

data

A list of data frames made with ggmca_data.
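
A minimal sketch of the step-by-step route, assuming the ggfacto and FactoMineR packages are installed. It only uses arguments listed in Usage; the names and structure of the data frames inside the returned list are not documented here, so the sketch inspects them with str rather than assuming them.

```r
# Run only when both packages are available.
if (requireNamespace("ggfacto", quietly = TRUE) &&
    requireNamespace("FactoMineR", quietly = TRUE)) {
  library(ggfacto)

  data(tea, package = "FactoMineR")
  res.mca <- MCA2(tea, active_vars = 1:18)

  # Step 1: compute the data frames behind the graph.
  plot_data <- ggmca_data(res.mca, tea, sup_vars = "SPC")

  # Step 2: look at what was computed before modifying anything.
  str(plot_data, max.level = 1)

  # Step 3: draw the graph from the (possibly modified) list.
  ggmca_plot(plot_data, ylim = c(NA, 1.2), text_repel = TRUE)
}
```

Splitting the call this way is useful when labels or coordinates need manual adjustment between computing and drawing.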

Value

ggmca: a ggplot object to be printed in the 'RStudio' Plots pane. Other gg objects can be added with +. Sending the result through ggi will draw the interactive graph in the Viewer pane using ggiraph.

ggmca_data: a list containing the data frames to pass to ggmca_plot.

ggmca_plot: a ggplot object.

Examples


data(tea, package = "FactoMineR")
res.mca <- MCA2(tea, active_vars = 1:18)

# Interactive graph for multiple correspondence analysis :
res.mca |>
  ggmca(tea, sup_vars = c("SPC"), ylim = c(NA, 1.2), text_repel = TRUE) |>
  ggi() # to make the graph interactive

# Interactive graph with access to all crosstables between active variables (Burt table).
#  Spreads from the mean are colored: usually, points near the middle will have few
#  colors, and points at the edges will have plenty. It may take time to print, but
#  helps to interpret the MCA in close proximity with the underlying data.
res.mca |>
  ggmca(tea, ylim = c(NA, 1.2), active_tables = "active", text_repel = TRUE) |>
  ggi()

# Graph with colored HCPC clusters
cah <- FactoMineR::HCPC(res.mca, nb.clust = 6, graph = FALSE)
tea$clust <- cah$data.clust$clust
ggmca(res.mca, tea, cah = "clust", profiles = TRUE, text_repel = TRUE)

# Concentration ellipses for each level of a supplementary variable :
ggmca(res.mca, tea, sup_vars = "SPC", ylim = c(NA, 1.2),
  ellipses = 0.5, text_repel = TRUE, profiles = TRUE)

# Graph of profiles of answers for each level of a supplementary variable :
ggmca(res.mca, tea, sup_vars = "SPC", ylim = c(NA, 1.2),
  type = "facets", ellipses = 0.5, profiles = TRUE)


[Package ggfacto version 0.3.0 Index]