ggseqrfplot {ggseqplot} | R Documentation |
Relative Frequency Sequence Plot
Description
Function for rendering relative frequency sequence plots with ggplot2 instead of base R's plot function, which is used by TraMineR::seqrfplot. Note that ggseqrfplot uses patchwork to combine the different components of the plot. The function and the documentation draw heavily from TraMineR::seqrf.
Usage
ggseqrfplot(
  seqdata = NULL,
  diss = NULL,
  k = NULL,
  sortv = "mds",
  weighted = TRUE,
  grp.meth = "prop",
  squared = FALSE,
  pow = NULL,
  seqrfobject = NULL,
  border = FALSE,
  ylab = NULL,
  yaxis = TRUE,
  which.plot = "both",
  quality = TRUE,
  box.color = NULL,
  box.fill = NULL,
  box.alpha = NULL,
  outlier.jitter.height = 0,
  outlier.color = NULL,
  outlier.fill = NULL,
  outlier.shape = 19,
  outlier.size = 1.5,
  outlier.stroke = 0.5,
  outlier.alpha = NULL
)
Arguments
seqdata |
State sequence object (class |
diss |
pairwise dissimilarities between sequences in |
k |
integer specifying the number of frequency groups. When |
sortv |
optional sorting vector of length |
weighted |
Controls if weights (specified in
|
grp.meth |
Character string. One of |
squared |
Logical. Should medoids (and computation of |
pow |
Dissimilarity power exponent (typically 1 or 2) for computation of
pseudo R2 and F. When |
seqrfobject |
object of class |
border |
if |
ylab |
character string specifying title of y-axis. If |
yaxis |
Controls if a y-axis is plotted. When set as |
which.plot |
character string specifying which components of relative
frequency sequence plot should be displayed. Default is |
quality |
specifies if representation quality is shown as figure caption;
default is |
box.color |
specifies color of boxplot borders; default is "black" |
box.fill |
specifies fill color of boxplots; default is "white" |
box.alpha |
specifies alpha value of boxplot fill color; default is 1 |
outlier.jitter.height |
if greater than 0, outliers are jittered vertically. If greater than .375, the height is automatically adjusted to be aligned with the box width. |
outlier.color , outlier.fill , outlier.shape , outlier.size , outlier.stroke , outlier.alpha |
parameters to change the appearance of the outliers. Uses defaults of
|
Details
This function renders relative frequency sequence plots either by calling TraMineR::seqrf internally or by using an object of class "seqrf" generated with TraMineR::seqrf.
For further details on the technicalities we refer to the excellent documentation of TraMineR::seqrf. A detailed account of relative frequency sequence plots can be found in the original contribution by Fasang and Liao (2014).
ggseqrfplot renders the medoid sequences extracted by TraMineR::seqrf with an internal call of ggseqiplot. For the box plot depicting the distances to the medoids, ggseqrfplot uses geom_boxplot and geom_jitter. The latter is used for plotting the outliers.
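The division of labour between the two geoms can be sketched as follows. This is an illustrative sketch, not the package's internal code: the data frame d, the helper is_outlier, and the 1.5 * IQR outlier rule are assumptions made for the example.

```r
library(ggplot2)

set.seed(1)
# Hypothetical data: distances to the medoid within three frequency groups
d <- data.frame(
  group = factor(rep(1:3, each = 30)),
  dist  = c(rexp(30, 1), rexp(30, 0.5), rexp(30, 2))
)

# Hypothetical helper: flag values beyond 1.5 * IQR from the quartiles
is_outlier <- function(x) {
  q <- quantile(x, c(0.25, 0.75), names = FALSE)
  x < q[1] - 1.5 * diff(q) | x > q[2] + 1.5 * diff(q)
}
# ave() returns 0/1 here because the logical result is stored in a numeric vector
d$outlier <- ave(d$dist, d$group, FUN = is_outlier) == 1

p <- ggplot(d, aes(x = dist, y = group)) +
  # Draw the boxes without the built-in outlier points ...
  geom_boxplot(outlier.shape = NA, colour = "black", fill = "white") +
  # ... and add the outliers as a separate, vertically jittered layer
  geom_jitter(data = d[d$outlier, ], width = 0, height = 0.1,
              shape = 19, size = 1.5)
p
```

Drawing the outliers in their own layer is what makes it possible to jitter them vertically and style them independently of the boxes.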
Note that ggseqrfplot renders the box plots analogously to those produced by TraMineR::seqrfplot. However, the box plots produced with TraMineR::seqrfplot and ggplot2::geom_boxplot might differ slightly due to differences in the underlying computations of grDevices::boxplot.stats and ggplot2::stat_boxplot.
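One source of such differences can be illustrated in base R. grDevices::boxplot.stats is based on Tukey's hinges (stats::fivenum). The sketch below assumes that ggplot2::stat_boxplot relies on stats::quantile with its default settings for unweighted data; that assumption should be checked against the installed ggplot2 version.

```r
x <- c(1, 2, 3, 4)

# Box statistics as used by base R box plots (Tukey's hinges)
tukey <- grDevices::boxplot.stats(x)$stats

# Quantiles with the default algorithm (type 7) of stats::quantile
q7 <- stats::quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1), names = FALSE)

rbind(boxplot.stats = tukey, quantile = q7)
# Lower and upper box edges: 1.5 and 3.5 (hinges) vs. 1.75 and 3.25 (type 7)
```

With larger samples the two definitions usually give very similar values, so the resulting box plots tend to differ only marginally.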
Note that ggseqrfplot uses patchwork to combine the different components of the plot. If you want to adjust the appearance of the composed plot, for instance by changing the plot theme, you should consult the documentation of patchwork.
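As a minimal sketch of how theme changes behave for composed plots, the following example uses two generic ggplot2 plots (p1 and p2 are placeholders, not output of ggseqrfplot). In patchwork, + adds an element to the last plot only, whereas & applies it to every plot in the composition.

```r
library(ggplot2)
library(patchwork)

# Two placeholder plots standing in for the components of a composed plot
p1 <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
p2 <- ggplot(mtcars, aes(factor(cyl), mpg)) + geom_boxplot()

combined <- p1 + p2

# `&` applies the theme to all components of the patchwork
themed <- combined & theme_minimal()

# plot_annotation() adds a title to the composed plot as a whole
themed + plot_annotation(title = "Composed plot with a shared theme")
```

The same operators should apply to the object returned by ggseqrfplot, since it is assembled with patchwork.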
At this point ggseqrfplot does not support a grouping option. For plotting multiple groups, I recommend producing group-specific seqrf objects or plots and arranging them in a common plot using patchwork.
See Example 6 in the vignette for further details:
vignette("ggseqplot", package = "ggseqplot")
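The recommended workaround might look like the following sketch. It is not taken from the vignette: splitting by the sex column of biofam, the choice of k = 5, and the object names biofam.sub, idx, and plots are assumptions made for illustration.

```r
library(TraMineR)
library(ggseqplot)
library(patchwork)

data(biofam)
biofam.sub <- biofam[501:600, ]
biofam.seq <- seqdef(biofam.sub[, 10:25])
diss <- seqdist(biofam.seq, method = "LCS")

# Row indices of the cases belonging to each group
idx <- split(seq_len(nrow(biofam.sub)), biofam.sub$sex)

# One relative frequency sequence plot per group, each based on the
# group's sequences and the matching block of the dissimilarity matrix
plots <- lapply(idx, function(i) {
  ggseqrfplot(biofam.seq[i, ], diss = diss[i, i], k = 5)
})

# Stack the group-specific plots in a common plot
combined <- wrap_plots(unname(plots), ncol = 1)
combined
```

With so few cases per group the result is only meant to show the mechanics; more cases and a larger k would be needed for a meaningful plot.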
Value
A relative frequency sequence plot using ggplot2.
Author(s)
Marcel Raab
References
Fasang AE, Liao TF (2014). “Visualizing Sequences in the Social Sciences: Relative Frequency Sequence Plots.” Sociological Methods & Research, 43(4), 643–676. doi:10.1177/0049124113506563.
Examples
# Load the packages used in the examples
library(TraMineR)
library(ggseqplot)
library(ggplot2)
# Load additional library for fine-tuning the plots
library(patchwork)

# From TraMineR::seqrf
# Defining a sequence object with the data in columns 10 to 25
# (family status from age 15 to 30) in the biofam data set
data(biofam)
biofam.lab <- c("Parent", "Left", "Married", "Left+Marr",
                "Child", "Left+Child", "Left+Marr+Child", "Divorced")

# Here, we use only 100 cases selected such that all elements
# of the alphabet are present.
# (More cases and a larger k would be necessary to get a meaningful example.)
biofam.seq <- seqdef(biofam[501:600, 10:25], labels = biofam.lab,
                     weights = biofam[501:600, "wp00tbgs"])
diss <- seqdist(biofam.seq, method = "LCS")

# Using 12 groups and default MDS sorting
# and original method by Fasang and Liao (2014)

# ... with TraMineR::seqrfplot (weights have to be turned off)
seqrfplot(biofam.seq, weighted = FALSE, diss = diss, k = 12,
          grp.meth = "first", which.plot = "both")

# ... with ggseqrfplot
ggseqrfplot(biofam.seq, weighted = FALSE, diss = diss, k = 12, grp.meth = "first")

# Arrange sequences by a user specified sorting variable:
# time spent in parental home; has ties
parentTime <- seqistatd(biofam.seq)[, 1]
b.srf <- seqrf(biofam.seq, diss = diss, k = 12, sortv = parentTime)

# ... with ggseqrfplot (and some extra annotation using patchwork)
ggseqrfplot(seqrfobject = b.srf) +
  plot_annotation(title = "Sorted by time spent in parental home",
                  theme = theme(plot.title = element_text(hjust = 0.5, size = 18)))