R: Plot Comparison of Multiple Data Frames on a Bivariate Level

plot_biv_compare {sampcompR}

R Documentation

Plot Comparison of Multiple Data Frames on a Bivariate Level

Description

Plots an object generated by the biv_compare function.

Usage

plot_biv_compare(
  biv_data_object,
  plot_title = NULL,
  plots_label = NULL,
  p_value = NULL,
  varlabels = NULL,
  mar = c(0, 0, 0, 0),
  note = FALSE,
  grid = "white",
  diff_perc = TRUE,
  diff_perc_size = 4.5,
  perc_diff_transparance = 0,
  gradient = FALSE,
  sum_weights = NULL,
  missings_x = TRUE,
  order = NULL,
  breaks = NULL,
  colors = NULL,
  ncol_facet = 3
)

Arguments

`biv_data_object`	An object generated by the biv_compare function.
`plot_title`	A character string containing the title of the plot.
`plots_label`	A character string or vector of character strings containing the new labels of the data frames that are used in the plot.
`p_value`	A number between 0 and 1 that sets the maximum significance level.
`varlabels`	A character string or vector of character strings containing the new labels of variables that are used in the plot.
`mar`	A vector that determines the margins of the plot.
`note`	If `note = TRUE`, a note will be displayed to describe the plot.
`grid`	A character string that determines the color of the lines between the tiles of the heatmap.
`diff_perc`	If `TRUE`, a percentage measure of the difference between `dfs` and benchmarks is displayed in the plot.
`diff_perc_size`	A number that determines the size of the displayed percentage difference between surveys in the plot.
`perc_diff_transparance`	A number that determines the transparency of the displayed percentage difference between surveys in the plot.
`gradient`	If `gradient = TRUE`, colors in the heatmap will be more or less transparent, depending on the difference in Pearson's r between the data frames being compared.
`sum_weights`	A vector containing a weight for every variable, used when calculating the displayed percentage difference. It can be used if some variables are over- or underrepresented in the analysis.
`missings_x`	If `TRUE`, missing pairs in the plot will be marked with an X.
`order`	A character vector to determine in which order the variables should be displayed in the plot.
`breaks`	A vector to label the color scheme in the legend.
`colors`	A vector to determine the colors in the plot.
`ncol_facet`	Number of columns used in facet_wrap() for the plots.

Details

The plot shows a heatmap of a correlation matrix, where the color of each tile reflects how similar the Pearson's r values are in the two sets of respondents. With the default breaks and colors:

Same (green) indicates that the Pearson's r correlation is not significantly > 0 in the related data frame or benchmark, or that the Pearson's r correlations are not significantly different between data frame and benchmark.
Small Diff (yellow) indicates that the Pearson's r correlation is significantly > 0 in the related data frame or benchmark, and that the Pearson's r correlations are significantly different between data frame and benchmark.
Large Diff (red) indicates that the same conditions as for yellow are fulfilled, and that the correlations are either in opposite directions or one is double the size of the other.
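
The three categories above can be read as a small decision rule. The following base R function is a minimal sketch of that rule, not the package's implementation. The function name `classify_pair`, its arguments, the 0.05 default threshold, and the reading of "double the size" as "at least double in absolute value" are all assumptions made for illustration. The p-values are taken as given, so the sketch makes no claim about which significance tests sampcompR uses.

```r
# Hypothetical helper (not part of sampcompR): classify one variable pair.
# r_df, r_bench : Pearson's r in the data frame and in the benchmark
# p_df, p_bench : p-values of the two correlations
# p_diff        : p-value of the test that the two correlations differ
# p_value       : significance threshold (0.05 is an assumed default)
classify_pair <- function(r_df, r_bench, p_df, p_bench, p_diff,
                          p_value = 0.05) {
  any_significant <- p_df < p_value || p_bench < p_value
  diff_significant <- p_diff < p_value

  # Green: no significant correlation at all, or no significant difference
  if (!(any_significant && diff_significant)) {
    return("Same")
  }

  # Red: yellow conditions hold AND opposite signs or (assumed) at least
  # double the absolute size
  opposite <- sign(r_df) * sign(r_bench) < 0
  doubled <- max(abs(r_df), abs(r_bench)) >= 2 * min(abs(r_df), abs(r_bench))

  if (opposite || doubled) "Large Diff" else "Small Diff"
}

# Made-up inputs, one per category
classify_pair(0.30, 0.35, 0.001, 0.001, p_diff = 0.40)
classify_pair(0.30, 0.45, 0.001, 0.001, p_diff = 0.01)
classify_pair(0.30, -0.20, 0.001, 0.010, p_diff = 0.01)
```

With these made-up inputs, the first call falls in the green category because the difference is not significant, the second in yellow, and the third in red because the signs differ.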

Value

An object generated with the help of ggplot2::ggplot(), used to visualize the differences between the data frames and benchmarks.

Examples


## Get data for comparison (requires the wooldridge package)
require(wooldridge)
card <- wooldridge::card

## Split the data into the subgroups to compare
south <- card[card$south == 1, ]
north <- card[card$south == 0, ]
black <- card[card$black == 1, ]
white <- card[card$black == 0, ]

## Compute the bivariate comparison, then plot it
bivar_data <- sampcompR::biv_compare(
  dfs = c("north", "white"),
  benchmarks = c("south", "black"),
  variables = c("age", "educ", "fatheduc", "motheduc", "wage", "IQ"),
  data = TRUE
)

sampcompR::plot_biv_compare(bivar_data)
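
As a further illustration, the sketch below passes some of the optional arguments listed under Arguments. The argument names come from the Usage section. The label strings, the 0.05 threshold, and the two-column layout are arbitrary choices made here. The `requireNamespace()` guard is also an addition, so that the code does nothing when sampcompR or wooldridge is not installed.

```r
p <- NULL  # stays NULL if the required packages are unavailable

if (requireNamespace("sampcompR", quietly = TRUE) &&
    requireNamespace("wooldridge", quietly = TRUE)) {
  card <- wooldridge::card

  # Same subgroups as in the example above
  south <- card[card$south == 1, ]
  north <- card[card$south == 0, ]
  black <- card[card$black == 1, ]
  white <- card[card$black == 0, ]

  bivar_data <- sampcompR::biv_compare(
    dfs = c("north", "white"),
    benchmarks = c("south", "black"),
    variables = c("age", "educ", "fatheduc", "motheduc", "wage", "IQ"),
    data = TRUE
  )

  # Illustrative customization; all label texts are made up for this sketch
  p <- sampcompR::plot_biv_compare(
    bivar_data,
    plot_title = "Bivariate comparison of subgroups",
    plots_label = c("North vs. South", "White vs. Black"),
    p_value = 0.05,
    varlabels = c("Age", "Education", "Father's education",
                  "Mother's education", "Wage", "IQ"),
    note = TRUE,
    ncol_facet = 2
  )
}
```

Because the Value section describes the result as a ggplot2 object, `p` can be printed or stored like any other plot object.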

[Package sampcompR version 0.2.1 Index]