ggfreqScatter {Hmisc}  R Documentation 
Frequency Scatterplot
Description
Uses ggplot2 to plot a scatterplot or dot-like chart for the case
where there is a very large number of overlapping values. This works
for continuous and categorical x and y. For continuous
variables it serves the same purpose as hexagonal binning. Counts for
overlapping points are grouped into quantile groups, and level of
transparency and rainbow colors are used to provide count information.
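The binning and quantile grouping described above can be sketched in base R. This is only an illustration of the idea, not the code ggfreqScatter runs: the helper bin_mid, the use of equal-width bins, and the data frame cells are assumptions made for this sketch.

```r
# Illustrative sketch only; not the Hmisc implementation.
set.seed(1)
x <- rnorm(5000)
y <- rnorm(5000)

bins <- 50  # plays the role of the 'bins' argument (equal-width bins assumed)

# Hypothetical helper: snap each value to the midpoint of its interval
bin_mid <- function(v, bins) {
  r   <- range(v)
  w   <- diff(r) / bins
  idx <- pmin(floor((v - r[1]) / w), bins - 1)  # interval index 0..bins-1
  r[1] + (idx + 0.5) * w
}
xb <- bin_mid(x, bins)
yb <- bin_mid(y, bins)

# Count the points falling in each occupied cell
cells <- as.data.frame(table(xb = xb, yb = yb), stringsAsFactors = FALSE)
cells <- cells[cells$Freq > 0, ]

# Group the cell frequencies into at most g quantile groups
g   <- 10
brk <- unique(quantile(cells$Freq, probs = seq(0, 1, length.out = g + 1)))
cells$fgroup <- cut(cells$Freq, breaks = brk, include.lowest = TRUE)

head(cells)
```

Each row of cells is one occupied bin; fgroup is the kind of grouped count that color and transparency would then encode. Tied quantiles are collapsed with unique(), so fewer than g groups can result.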
Instead, you can specify stick=TRUE to not use color but to encode
cell frequencies with the height of a black line y-centered at the
middle of the bins.
Relative frequencies are not transformed, and the maximum cell
frequency is shown in a caption. Every point with a frequency of at
least one is depicted with a full-height light gray vertical
line, scaled to the above overall maximum frequency. In this way the
relative frequency is the proportion of each light gray line that is
black, and one can see points whose frequencies are too low for the
black line to be visible.
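The arithmetic behind the sticks can be sketched as follows. The frequencies, the stick center y0, and the full height h are hypothetical values chosen for illustration; the actual plotting geometry used by ggfreqScatter may differ.

```r
# Hypothetical cell frequencies, for illustration only
freq <- c(1, 3, 40, 250, 1000)

# Relative frequency: untransformed count divided by the overall maximum
rel <- freq / max(freq)

# Assumed geometry: each stick is centered at y0 with full height h
y0 <- 0
h  <- 1
sticks <- data.frame(
  freq       = freq,
  rel        = rel,
  gray_ymin  = y0 - h / 2,        # full-height light gray line
  gray_ymax  = y0 + h / 2,
  black_ymin = y0 - rel * h / 2,  # black line covers the fraction 'rel'
  black_ymax = y0 + rel * h / 2
)
sticks
```

A cell with frequency 1 gets a black segment only 1/1000 of the gray height, which is why the gray line is what keeps such cells visible.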
The result can also be passed to ggplotly. Actual cell
frequencies are added to the hover text in that case using the
label ggplot2 aesthetic.
Usage
ggfreqScatter(x, y, by=NULL, bins=50, g=10, cuts=NULL,
xtrans = function(x) x,
ytrans = function(y) y,
xbreaks = pretty(x, 10),
ybreaks = pretty(y, 10),
xminor = NULL, yminor = NULL,
xlab = as.character(substitute(x)),
ylab = as.character(substitute(y)),
fcolors = viridis::viridis(10), nsize=FALSE,
stick=FALSE, html=FALSE, prfreq=FALSE, ...)
Arguments
x 
x-variable 
y 
y-variable 
by 
an optional vector used to make separate plots for each
distinct value using 
bins 
for continuous 
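g 
number of quantile groups to make for frequency counts. Consider
g=0 when the result will be passed to plotly.
cuts 
instead of using g, a vector of cut points for grouping the frequency counts 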
xtrans , ytrans 
functions specifying transformations to be made before binning and plotting 
xbreaks , ybreaks 
vectors of values to label on axis, on original scale 
xminor , yminor 
values at which to put minor tick marks, on original scale 
xlab , ylab 
axis labels. If not specified and variable has a

fcolors 

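nsize 
set to TRUE to vary symbol size with cell frequency instead of varying color 
stick 
set to TRUE to encode cell frequencies with the height of black vertical lines instead of color 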
html 
set to 
prfreq 
set to 
... 
arguments to pass to 
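As a hedged illustration of how several of these arguments combine, the sketch below uses only names from the Usage section. The data, the grouping vector grp, and the chosen break values are hypothetical, and the call is guarded so that nothing runs unless Hmisc, ggplot2, and viridis (needed by the default fcolors) are installed.

```r
# Sketch under stated assumptions; the data and break values are made up.
pkgs <- c("Hmisc", "ggplot2", "viridis")
have <- all(vapply(pkgs, requireNamespace, logical(1), quietly = TRUE))

if (have) {
  set.seed(2)
  n   <- 20000
  x   <- exp(rnorm(n, mean = 3))          # right-skewed, strictly positive
  y   <- rnorm(n)
  grp <- sample(c("a", "b"), n, TRUE)     # hypothetical grouping vector

  p <- Hmisc::ggfreqScatter(
    x, y,
    by      = grp,                  # separate plot per distinct value
    bins    = 30,
    xtrans  = log10,                # applied before binning and plotting
    xbreaks = c(1, 10, 100, 1000)   # axis labels, given on the original scale
  )
  print(p)
}
```

Supplying xbreaks explicitly avoids relying on the default pretty(x, 10), which for skewed positive data may propose a break at zero that log10 cannot place.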
Value
a ggplot
object
Author(s)
Frank Harrell
See Also
Examples
require(ggplot2)
set.seed(1)
x <- rnorm(1000)
y <- rnorm(1000)
count <- sample(1:100, 1000, TRUE)
x <- rep(x, count)
y <- rep(y, count)
# color=alpha=NULL below makes loess smooth over all points
g <- ggfreqScatter(x, y) + # might add g=0 if using plotly
geom_smooth(aes(color=NULL, alpha=NULL), se=FALSE) +
ggtitle("Using Deciles of Frequency Counts, 2500 Bins")
g
# plotly::ggplotly(g, tooltip='label') # use plotly, hover text = freq. only
# Plotly makes it somewhat interactive, with hover text tooltips
# Instead use varying-height sticks to depict frequencies
ggfreqScatter(x, y, stick=TRUE) +
labs(subtitle='Relative height of black lines to gray lines
is proportional to cell frequency.
Note that points with even tiny frequency are visible
(gray line with no visible black line).')
# Try with x categorical
x1 <- sample(c('cat', 'dog', 'giraffe'), length(x), TRUE)
ggfreqScatter(x1, y)
# Try with y categorical
y1 <- sample(LETTERS[1:10], length(x), TRUE)
ggfreqScatter(x, y1)
# Both categorical, larger point symbols, box instead of circle
ggfreqScatter(x1, y1, shape=15, size=7)
# Vary box size instead
ggfreqScatter(x1, y1, nsize=TRUE, shape=15)