freqs_list {lares} | R Documentation |
Frequencies on Lists and UpSet Plot
Description
Visualize the frequency of elements in a list, list vector, or vector with comma-separated values. Detect which combinations and elements are the most frequent and what share of your total observations they represent. This is similar to UpSet plots, which can be used as an alternative to Venn diagrams.
Usage
freqs_list(
  df,
  var = NULL,
  wt = NULL,
  fx = "mean",
  rm.na = FALSE,
  min_elements = 1,
  limit = 10,
  limit_x = NA,
  limit_y = NA,
  tail = TRUE,
  size = 10,
  unique = TRUE,
  abc = FALSE,
  title = "",
  plot = TRUE
)
Arguments
df | Data.frame |
var | Variable. Variables you wish to process. |
wt | Variable, numeric. Numeric column used for the colour scale; its values are aggregated (sum, mean, ...) for each combination. |
fx | Character. Operation to apply: mean, sum |
rm.na | Boolean. Remove NA value from |
min_elements | Integer. Exclude combinations with fewer than n elements |
limit, limit_x, limit_y | Integer. Show top n combinations (x) and/or elements (y). The rest will be grouped into a single element. Set argument to 0 to ignore. |
tail | Boolean. Show tail grouped into "..." on the plots? |
size | Numeric. Text base size |
unique | Boolean. Treat combinations as order-independent, so that a,b = b,a? |
abc | Boolean. Do you wish to sort by alphabetical order? |
title | Character. Text that overwrites the plot's title. |
plot | Boolean. Print the plot? It is generated in the output object either way. |
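The top-n grouping described for limit and tail can be pictured with a small base R sketch. It is only an illustration under assumed data: the counts vector and the variable names are hypothetical, and it does not reproduce the package's internal code.

```r
# Hypothetical frequencies; names and values are illustrative only
counts <- c(a = 40, b = 25, c = 15, d = 10, e = 6, f = 4)
limit <- 3

# Rank from most to least frequent and keep the top `limit` entries
ranked <- sort(counts, decreasing = TRUE)
is_top <- seq_along(ranked) <= limit
top <- ranked[is_top]
rest <- ranked[!is_top]

# Collapse everything beyond the top n into a single "..." entry
grouped <- top
if (length(rest) > 0) {
  grouped <- c(top, sum(rest))
  names(grouped)[length(grouped)] <- "..."
}
print(grouped)
```

Grouping the remainder keeps the total unchanged, so shares computed on the grouped vector still add up to 100%.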
Value
List. data.frame with the data results, elements and combinations.
See Also
Other Frequency: freqs_df(), freqs_plot(), freqs()
Other Exploratory: corr_cross(), corr_var(), crosstab(), df_str(), distr(), freqs_df(), freqs_plot(), freqs(), lasso_vars(), missingness(), plot_cats(), plot_df(), plot_nums(), tree_var()
Other Visualization: distr(), freqs_df(), freqs_plot(), freqs(), noPlot(), plot_chord(), plot_survey(), plot_timeline(), tree_var()
Examples
## Not run:
library(lares)
df <- dplyr::starwars
head(df[, c(1, 4, 5, 12)], 10)
# Characters per combination of movies, stored in a list column
head(df$films, 2)
freqs_list(df, films)
# Skin colours in a comma-separated column
head(df$skin_color)
x <- freqs_list(df, skin_color, min_elements = 2, limit = 5, plot = FALSE)
# Inside "x" we'll have:
names(x)
# Using the 'wt' argument to add a continuous metric
# to a dataset whose columns are already one-hot encoded (and hide the tail)
csv <- "https://raw.githubusercontent.com/hms-dbmi/UpSetR/master/inst/extdata/movies.csv"
movies <- read.csv(csv, sep = ";")
head(movies)
freqs_list(movies,
  wt = AvgRating, min_elements = 2, tail = FALSE,
  title = "Movies\nMixed Genres\nRanking"
)
# So, please: no more Comedy+SciFi and more Drama+Horror films (based on ~50 movies)!
## End(Not run)
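To make the counting idea from the Description concrete, here is a minimal base R sketch that tallies elements and combinations from a comma-separated column. The obs data frame, its tags and score columns, and all variable names are hypothetical; this is not how freqs_list() is implemented, only an illustration of the quantities it summarizes.

```r
# Hypothetical data: one comma-separated string per observation
obs <- data.frame(
  tags = c("a, b", "b, a", "a", "b, c", "a, b"),
  score = c(1, 3, 5, 7, 2),
  stringsAsFactors = FALSE
)

# Split each string into its trimmed elements
elements <- lapply(strsplit(obs$tags, ","), trimws)

# Sorting the elements makes "a, b" and "b, a" the same combination
obs$combo <- vapply(
  elements,
  function(e) paste(sort(unique(e)), collapse = ", "),
  character(1)
)

# Frequency of each combination and its share of all observations
combo_n <- sort(table(obs$combo), decreasing = TRUE)
combo_share <- round(100 * prop.table(combo_n), 1)

# Frequency of each individual element across all observations
element_n <- sort(table(unlist(elements)), decreasing = TRUE)

# Mean of a numeric column per combination (comparable in spirit to wt with fx = "mean")
combo_mean <- tapply(obs$score, obs$combo, mean)

print(combo_n)
print(combo_share)
print(element_n)
print(combo_mean)
```

Sorting elements before pasting them together is one simple way to treat combinations as order-independent; skipping the sort would count "a, b" and "b, a" separately.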