draft_report {saros}    R Documentation

Automatically Draft a Quarto Report

Description

The draft_report() function is the main function, and the only user interface you need, for creating semi-automated (draft) reports. It need not be the first step, however: you may want to store the function's arguments in a file and read them in with read_yaml_params() first. After the report files have been drafted with draft_report(), you can edit, render, and ultimately publish them as usual with Quarto features in RStudio. The index.qmd file is the main output file and contains "includes" to the other chapters.

Usage

draft_report(
  data,
  chapter_overview = NULL,
  ...,
  path,
  title = "Report",
  authors = NULL,
  mesos_report = FALSE,
  mesos_var = NULL,
  label_separator = " - ",
  name_separator = NULL,
  index_yaml_file = NULL,
  report_yaml_file = NULL,
  chapter_yaml_file = NULL,
  qmd_start_section_filepath = NULL,
  qmd_end_section_filepath = NULL,
  index_filename = "index.qmd",
  element_names = c("uni_cat_prop_plot", "uni_cat_freq_plot", "uni_cat_table",
    "uni_chr_table", "hline", "bi_catcat_prop_plot", "bi_catcat_freq_plot",
    "bi_catcat_prop_plot2", "bi_catcat_freq_plot2", "bi_catcat_table", "bi_sigtest"),
  sort_by = ".upper",
  data_label = saros::get_data_label_opts(),
  always_show_bi_for_indep = NULL,
  categories_treated_as_na = NULL,
  variables_always_at_top = NULL,
  variables_always_at_bottom = NULL,
  return_raw = TRUE,
  showNA = c("never", "always", "ifany"),
  totals = FALSE,
  hide_bi_entry_if_sig_above = 1,
  hide_test_if_n_below = 10,
  hide_result_if_n_below = 10,
  hide_chr_for_others = TRUE,
  hide_variable_if_all_na = TRUE,
  single_y_bivariates_if_indep_cats_above = 3,
  single_y_bivariates_if_deps_above = 20,
  digits = 1,
  data_label_decimal_symbol = ".",
  hide_label_if_prop_below = 0.01,
  hide_axis_text_if_single_variable = FALSE,
  main_font_size = 10,
  label_font_size = 3,
  strip_font_size = 7,
  legend_font_size = 7,
  strip_width = 15,
  strip_angle = 0,
  x_axis_label_width = 20,
  plot_height_multiplier_per_horizontal_line = NA,
  plot_height_multiplier_per_vertical_letter = 0.2,
  plot_height_multiplier_per_facet = 0.95,
  plot_height_multiplier_per_legend_line = 1.1,
  plot_height_fixed_constant = 0,
  plot_height_max = 8,
  plot_height_min = 1.5,
  vertical_height = 12,
  vertical = FALSE,
  png_scale = 1.2,
  png_width = 14,
  png_height = 16,
  font_family = "sans",
  colour_palette_nominal = NULL,
  colour_palette_ordinal = NULL,
  colour_na = "gray90",
  colour_2nd_binary_cat = NULL,
  table_main_question_as_header = FALSE,
  max_width_obj = 128,
  max_width_file = 64,
  max_clean_folder_name = 12,
  open_after_drafting = FALSE,
  organize_by = c("chapter", ".variable_label_prefix_dep", ".variable_name_indep",
    ".element_name"),
  arrange_output_by = c("chapter", ".variable_name_dep", ".variable_name_indep"),
  ignore_heading_for_group = c(".element_name", ".variable_type_dep", "chapter"),
  replace_heading_for_group = c(.variable_label_suffix_dep = ".variable_name_dep"),
  mesos_first = TRUE,
  descend = TRUE,
  require_common_categories = TRUE,
  panel_tabset_mesos = TRUE,
  pdf = TRUE,
  attach_chapter_dataset = TRUE,
  auxiliary_variables = NULL,
  flexi = FALSE,
  micro = FALSE,
  reps = 1000,
  information = c(".variable_label_dep", ".category", ".count", ".count_se",
    ".proportion", ".proportion_se", ".mean", ".mean_se", ".data_label",
    ".comb_categories", ".sum_value"),
  contents = c("intro", "not_used_category", "mode_max", "value_max", "value_min",
    "value_diff", "mean_max", "mean_min", "mean_diff", "median_max", "median_min",
    "median_diff", "variance_max", "variance_min"),
  include_numbers = TRUE,
  n_top_bottom = 1,
  log_file = NULL,
  serialized_format = c("rds", "qs"),
  tabular_format = c("delim", "xlsx", "csv", "csv2", "tsv", "sav", "dta"),
  translations = list(last_sep = " and ", download_report = "Download report (PDF)",
    intro_prefix = "We will now look at the questions asked regarding ", intro_suffix =
    "", mode_max_onfix = " on ", mode_max_prefix = "The most common responses were ",
    mode_max_suffix = "", not_used_prefix =
    "The following response categories were not used: ", not_used_suffix = "",
    value_max_prefix = "", value_max_infix =
    " {?is/are} the {dots$n_top_bottom} item{?s} where the most responded ",
    value_max_suffix = "", value_min_prefix = "", value_min_infix =
    " {?is/are} the {dots$n_top_bottom} item{?s} where the fewest responded ",
    value_min_suffix = "", mean_onfix = "M = ", mean_max_prefix =
    "They have highest mean on ", mean_max_suffix = "", mean_min_prefix =
    "They have lowest mean on ", mean_min_suffix = "", median_onfix = "Median = ",
    median_max_prefix = "They have highest median on ", median_max_suffix = "",
    median_min_prefix = "They have lowest median on ", median_min_suffix = "",
    intro_by_prefix = "We will now look at the questions asked regarding ",
    intro_by_infix = " broken down by ", intro_by_suffix = "", by_breakdown = " by ",
    n_equal_prefix = " (N = ", n_equal_suffix = ")", table_heading_N =
    "Total (N)", by_total = "Everyone", sigtest_prefix = "Significance testing of ",
    sigtest_suffix = "", mesos_group_prefix = " Group: ", mesos_group_suffix = "",
    mesos_label_all_others = "Others", empty_chunk_text = "\nText\n",
    flexi_input_chapter = "Chapter(s):", flexi_input_dep = "Dependent variable(s):",
    flexi_input_indep = "Independent variable:", flexi_input_mesos_group =
    "Filter:", flexi_figure_type = "Figure type:", flexi_data_label =
    "Summary to display", flexi_showNA = "Show NA (Missing)", flexi_sort_by = "Sort by",
    flexi_totals = "Totals", flexi_digits = "Digits after decimal", flexi_table =
    "Table", flexi_figure = "Figure", flexi_cols_variable_name = "Variable name",
    flexi_cols_variable_label = "Variable label", flexi_cols_category =
    "Response category", flexi_cols_count = "N", flexi_cols_count_se = "SE(N)",
    flexi_cols_proportion = "Proportion", flexi_cols_proportion_se =
    "SE(Proportion)", flexi_cols_mean = "Mean", flexi_cols_mean_se = "SE(Mean)",
    flexi_cols_data_label = "Data label", flexi_cols_comb_categories =
    "Combined categories", flexi_cols_sum_value =
    "Sum of data label across combined categories", flexi_validate =
    "Error: Columns must have some categories in common.", flexi_settings = "Settings",
    flexi_basic_settings = "Basic", flexi_advanced_settings = "Advanced",
    flexi_input_indep_none = "<none>", flexi_figure_type_proportion = "Proportion",
    flexi_figure_type_frequency = "Frequency", flexi_hide_label_if_prop_below =
    "Hide label if proportion below:")
)

Arguments

data

Survey data

⁠obj:<data.frame>|obj:<tbl_df>⁠ // Required

A data frame (or a srvyr-object) containing the columns referred to in the 'dep', 'indep', etc. columns of chapter_overview.

chapter_overview

What goes in each chapter

⁠obj:<data.frame>|obj:<tbl_df>⁠ // Required

Data frame (or tibble, possibly grouped). One row per chapter. Should contain the columns 'chapter' and 'dep', and optionally 'indep' (independent variables) and other informative columns as needed.
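As an illustration of the layout described above, a minimal chapter_overview could be built as follows. This is only a sketch: the chapter titles and the variable names (a_1, a_2, b_1, b_2, x1_sex) are hypothetical placeholders and must be replaced with columns that exist in your own data.

```r
# Hypothetical chapter_overview: one row per chapter.
# All chapter titles and variable names below are placeholders.
chapter_overview <- data.frame(
  chapter = c("Ambivalence", "Satisfaction"),
  dep     = c("a_1, a_2", "b_1, b_2"),  # comma-separated dependent variables
  indep   = c("x1_sex", "x1_sex"),      # optional independent variable
  stringsAsFactors = FALSE
)

# The two columns that the documentation names as expected
required_cols <- c("chapter", "dep")
has_required <- all(required_cols %in% names(chapter_overview))
print(chapter_overview)
```

Checking has_required before calling draft_report() is a cheap way to catch a misspelled column name early.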

...

Dynamic dots

<dynamic-dots>

Arguments forwarded to the corresponding functions that create the elements.

path

Output path

scalar<character> // Required

Path to save all output.

title

Title of report

scalar<character> // default: "Report" (optional)

Added automatically to YAML-header of index.qmd-file.

authors

Authors of entire report

⁠vector<character>⁠ // default: NULL (optional)

If NULL, infers from chapter_overview$authors, and collates for entire report.

mesos_report

Whether to produce reports per mesos group

scalar<logical> // default: FALSE

If FALSE, returns a regular single report.

mesos_var

Variable in data indicating groups to tailor reports for

⁠scalar<character>⁠ // default: NULL (optional)

Column name in data indicating the groups for which mesos reports will be produced.

label_separator

Variable label separator

scalar<character> // default: " - " (optional)

String on which to split variable labels into main question and sub-item.

name_separator

Variable name separator

⁠scalar<character>⁠ // default: NULL (optional)

String on which to split column names in data into main question and sub-item.
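The intent of the two separators can be illustrated with base R. This sketch only shows the idea of splitting a label into a main question and a sub-item; the label text is made up, and the code is not taken from saros, whose internal splitting may differ (for example when a label contains the separator more than once).

```r
# Hypothetical variable label following the "main question - sub-item" pattern
label <- "How satisfied are you with the following? - The teaching"
label_separator <- " - "

# Split on the literal separator (fixed = TRUE avoids regex interpretation)
parts <- strsplit(label, split = label_separator, fixed = TRUE)[[1]]
main_question <- parts[1]
sub_item <- parts[2]

print(main_question)
print(sub_item)
```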

index_yaml_file, report_yaml_file

Path to YAML-file to insert into index.qmd and report.qmd respectively

⁠scalar<character>⁠ // default: NULL (optional)

Path to file used to insert header YAML, in index and report files.

chapter_yaml_file

Path to YAML-file to insert into each chapter qmd-file

⁠scalar<character>⁠ // default: NULL (optional)

Path to file used to insert header YAML, in each chapter.

qmd_start_section_filepath, qmd_end_section_filepath

Path to qmd-bit for start/end of each qmd

⁠scalar<character>⁠ // default: NULL (optional)

Path to qmd-snippet placed before/after body of all chapter qmds.

index_filename

Index filename

⁠scalar<character>⁠ // default: "index.qmd" (optional)

The name of the main index Quarto file (and its subfolder) used as landing page for each report. Will link to a PDF (report.qmd) which collects all chapters.

element_names

Elements to be reported

vector<character> // default: see Usage (optional)

Elements to be reported for all sets (batteries) of y-variables.

sort_by

What to sort output by

vector<character> // default: ".upper" (optional)

Sort output (and collapse if requested).

".top"

The proportion for the highest category available in the variable.

".upper"

The sum of the proportions for the categories above the middle category.

".mid_upper"

The sum of the proportions for the categories including and above the middle category.

".mid_lower"

The sum of the proportions for the categories including and below the middle category.

".lower"

The sum of the proportions for the categories below the middle category.

".bottom"

The proportions for the lowest category available in the variable.

".variable_label"

Sort by the variable labels.

".id"

Sort by the variable names.

".by_group"

The groups of the by argument.

character()

Character vector of category labels to sum together.
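To make the proportion-based sort keys listed above concrete, the following base R sketch computes them for a single five-category variable. It is an illustration of the definitions only, not saros code. It assumes an odd number of categories, so that the middle category is unambiguous; how saros handles an even number of categories is not covered here. The response data are invented.

```r
# Invented responses on a five-point scale (levels ordered low to high)
x <- factor(
  c("Agree", "Strongly agree", "Neutral", "Disagree",
    "Agree", "Strongly disagree", "Agree", "Neutral"),
  levels = c("Strongly disagree", "Disagree", "Neutral",
             "Agree", "Strongly agree")
)

# Proportion of responses in each category, as a plain numeric vector
props <- as.numeric(prop.table(table(x)))
k <- length(props)    # number of categories
mid <- (k + 1) / 2    # index of the middle category (odd k assumed)

sort_keys <- c(
  .top       = props[k],                 # highest category only
  .upper     = sum(props[(mid + 1):k]),  # categories above the middle
  .mid_upper = sum(props[mid:k]),        # middle category and above
  .mid_lower = sum(props[1:mid]),        # middle category and below
  .lower     = sum(props[1:(mid - 1)]),  # categories below the middle
  .bottom    = props[1]                  # lowest category only
)
print(sort_keys)
```

With sort_by = ".upper", variables in a battery would be ordered by the value computed for .upper here.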

data_label

Data label

⁠scalar<character>⁠ // default: "proportion" (optional)

One of "proportion", "percentage", "percentage_bare", "count", "mean", or "median".

always_show_bi_for_indep

Always show bivariate for indep-variable

⁠vector<character>⁠ // default: NULL (optional)

Specific combinations with a by-variable where bivariates should always be shown.

categories_treated_as_na

NA categories

⁠vector<character>⁠ // default: NULL (optional)

Categories that should be treated as NA.

variables_always_at_top, variables_always_at_bottom

Top/bottom variables

⁠vector<character>⁠ // default: NULL (optional)

Column names in data that should always be placed at the top or bottom of figures/tables.

return_raw

NOT IN USE

scalar<logical> // default: TRUE

Whether to return the raw static element.

showNA

Show/hide NA in categorical variables

scalar<character> // default: "never" (optional)

Whether to show NA in categorical variables (one of c("ifany", "always", "never")).

totals

Include totals

⁠scalar<logical>⁠ // default: FALSE (optional)

Whether to include totals in the output.

hide_bi_entry_if_sig_above

p-value threshold for hiding bivariate entry

⁠scalar<double>⁠ // default: 1 (optional)

Whether to hide bivariate entry if significance is above this value. Defaults to showing all.

hide_test_if_n_below

Threshold n for hiding significance test

scalar<integer> // default: 10 (optional)

If N is below this value, p-value will not be shown.

hide_result_if_n_below

Hide result if N below

⁠scalar<integer>⁠ // default: 10 (optional)

Whether to hide the result if N for a given dataset (or mesos group) is below this value. NOTE: Exceptions will be made for chr_table and chr_plot, as these are typically exempted in the first place. This might change in the future with a separate argument.

hide_chr_for_others

Hide open response displays for others

⁠scalar<logical>⁠ // default: TRUE (optional)

For mesos reports using the element "chr_table": whether open responses are displayed also for the entire sample (FALSE) or only for the mesos group, to ensure data privacy (TRUE).

hide_variable_if_all_na

Hide variable from outputs if containing all NA

scalar<logical> // default: TRUE (optional)

Whether to remove variables whose values are all NA (particularly useful for mesos reports).

single_y_bivariates_if_indep_cats_above

Single y bivariates if indep-cats above ...

⁠scalar<integer>⁠ // default: 3 (optional)

Figures and tables for bivariates can become very long if the independent variable has many categories. This argument specifies the number of indep categories above which only single y bivariates should be shown.

single_y_bivariates_if_deps_above

Single y bivariates if dep-vars above ...

⁠scalar<integer>⁠ // default: 20 (optional)

Figures and tables for bivariates can become very long if there are many dependent variables in a battery/question matrix. This argument specifies the number of dep variables above which only single y bivariates should be shown. Set to 0 to always show single y bivariates.

digits

Decimal places

scalar<integer> // default: 1 (optional)

Number of decimal places.

data_label_decimal_symbol

Decimal symbol

⁠scalar<character>⁠ // default: "." (optional)

Decimal marker; some might prefer a comma ',' or something else entirely. NOTE: Future versions will likely postpone formatting this until gt(), kable(), etc.

hide_label_if_prop_below

Hide label threshold

scalar<numeric> // default: 0.01 (optional)

Whether to hide a data label if its proportion is below this value. NOTE: Future versions will likely distinguish between element_types.

hide_axis_text_if_single_variable

Hide y-axis text if just a single variable

scalar<logical> // default: FALSE (optional)

Whether to hide the text on the y-axis label if there is just a single variable.

main_font_size, label_font_size, strip_font_size, legend_font_size

Font sizes

scalar<integer> // defaults: 10, 3, 7 and 7, respectively (optional)

Font sizes for general text (10), data label text (3), strip text (7) and legend text (7).

strip_angle

Angle on the facet strip in plots

⁠scalar<double>⁠ // default: 0

x_axis_label_width, strip_width

Label width of x-axis and strip texts in plots

scalar<integer> // defaults: 20 (x_axis_label_width) and 15 (strip_width) (optional)

Width of the labels used for the categorical column names in x-axis texts and strip texts.

plot_height_multiplier_per_vertical_letter, plot_height_multiplier_per_horizontal_line

Height multiplier

scalar<double> // defaults: 0.2 (per vertical letter) and NA (per horizontal line)

Height in cm per chart entry, for all static plots.

plot_height_multiplier_per_facet

Plot height multiplier per facet

⁠scalar<double>⁠ // default: 0.95 (optional)

Multiplier for plot height per facet. Defaults to optimal at .95, i.e. slightly less than no change (1).

plot_height_multiplier_per_legend_line

Plot height multiplier per legend line

⁠scalar<double>⁠ // default: 1.1 (optional)

Multiplier for plot height per horizontal line of legend. Defaults to optimal at 1.1, i.e. slightly more than no change (1).

plot_height_fixed_constant

Height constant addition

⁠scalar<double>⁠ // default: 0

Fixed height in cm to add to all static plots.

plot_height_max

Maximum plot height

scalar<double> // default: 8 (optional)

Maximum height for the plot.

plot_height_min

Minimum plot height

scalar<double> // default: 1.5 (optional)

Minimum height for the plot.

vertical_height

Vertical height

scalar<double> // default: 12 (optional)

Height for vertical layout of plot? NEEDS CHECKING

vertical

Orientation of plots

⁠scalar<logical>⁠ // default: FALSE (optional)

If FALSE (default), then horizontal plots.

png_scale

PNG scale

scalar<double> // default: 1.2 (optional)

Scale factor for PNG output.

png_width, png_height

PNG width and height

scalar<double> // defaults: 14 (png_width) and 16 (png_height) (optional)

Width and height for PNG output.

font_family

Font family

⁠scalar<character>⁠ // default: "sans" (optional)

Word font family. See officer::fp_text.

colour_palette_nominal, colour_palette_ordinal

Colour palettes (nominal and ordinal)

⁠vector<character>⁠ // default: NULL (optional)

Must contain at least the number of unique values (including missing) in the data set.

colour_na

Colour for NA category

scalar<character> // default: "gray90" (optional)

Colour as a single string for NA values.

colour_2nd_binary_cat

Colour for second binary category

scalar<character> // default: NULL (optional)

Colour for second category in binary variables. Often useful to hide this.

table_main_question_as_header

Table main question as header

⁠scalar<logical>⁠ // default: FALSE (optional)

Whether to include the main question as a header in the table.

max_width_obj

Maximum object width

scalar<integer> // default: 128 (optional)

Maximum width for object names in the Quarto script. In particular useful when having label as part of the structure.

max_width_file

Maximum filename width

scalar<integer> // default: 64 (optional)

Maximum width for any filename. Because OneDrive has a maximum path length of about 400 characters, this limit can quickly be exceeded with a long base path, long file names (if using labels as part of the structure), and hashing with Quarto's cache: true feature. This argument truncates the filenames.

max_clean_folder_name

Maximum clean folder name length

scalar<integer> // default: 12 (optional)

Whereas max_width_file truncates the file name, this argument truncates the folder name. It will not impact the report or chapter names in website, only the folders.

open_after_drafting

Whether to open index.qmd

⁠scalar<logical>⁠ // default: FALSE (optional)

Whether to open the main output file (index.qmd) after completion.

organize_by

Grouping columns

vector<character> // default: see Usage (optional)

Column names used for identifying chapters and sections.

arrange_output_by

Sorting columns

vector<character> // default: see Usage (optional)

Column names used for sorting output within each organize_by group.

ignore_heading_for_group

Ignore heading for group

vector<character> // default: see Usage (optional)

Type of refined chapter_overview data for which to suppress the heading in the report output. Typically variable_name_dep, variable_name_indep, etc.

replace_heading_for_group

Replacing heading for group

⁠named vector<character>⁠ // default: c(".variable_label_suffix_dep" = ".variable_name_dep")

Occasionally, one needs to replace the heading with another piece of information in the refined chapter_overview. For instance, one may want to organize output by variable_name_indep, but to display the variable_label_indep instead. Use the name for the replacement and the value for the original.

mesos_first

mesos first

scalar<logical> // default: TRUE (optional)

Whether to place the mesos group element before or after the entire sample.

descend

Sorting order

scalar<logical> // default: TRUE (optional)

Reverse sorting of sort_by.

require_common_categories

Check common categories

scalar<logical> // default: TRUE (optional)

Whether to check if all items share common categories.

panel_tabset_mesos

mesos panel tabset

⁠scalar<logical>⁠ // default: TRUE (optional)

Whether in mesos reports the comparison group should be displayed as a Quarto panel tabset (TRUE), or above each other (FALSE).

pdf

Create PDF of full report?

scalar<logical> // default: TRUE (optional)

Whether to create a PDF of the entire report (all chapters included in a single file).

attach_chapter_dataset

Toggle inclusion of chapter-specific datasets in qmd-files

scalar<logical> // default: TRUE

Whether to save in each chapter folder an 'Rds'-file with the chapter-specific dataset, and load it at the top of each QMD-file.

auxiliary_variables

Auxiliary variables to be included in datasets

⁠vector<character>⁠ // default: NULL (optional)

Column names in data that should always be included in datasets for chapter qmd-files, if attach_chapter_dataset=TRUE. Not publicly available.

flexi

Create page with user-editable categorical plots and tables

⁠scalar<logical>⁠ // default: FALSE (optional)

Whether to create a folder with a Shiny flexi app containing all the variables in the chapter_overview and auxiliary_variables.

micro

Create page with raw data (micro data) and codebook

⁠scalar<logical>⁠ // default: FALSE (optional)

Whether to create a page with local links to a raw dataset (in various formats) and codebook (in various formats).

reps

Number of permutations

scalar<integer> // default: 1000 (optional)

Number of permutations to be performed in bootstrap significance tests.

information

Pre-computed information

vector<character> // default: see Usage (optional)

Which pre-computed information for each variable-category to display.

contents

Text interpretations

⁠vector<character>⁠ // default: all available (optional)

The type of text interpretations to return.

include_numbers

Include numbers

scalar<logical> // default: TRUE (optional)

Whether or not to include the actual numbers in parentheses.

n_top_bottom

Top and bottom entries to report

scalar<integer> // default: 1 (optional)

The number of top and bottom entries to report.

log_file

Path to log file

scalar<string> // default: NULL (optional)

Path to log file. If NULL (the default), logging is disabled.

serialized_format

Serialized format

⁠scalar<string>⁠ // default: "rds"

Format for serialized data. One of "rds" (default), "qs" or "fst". The latter two require the respective packages to be installed. qs is usually the fastest and most space-efficient, but adds a package dependency to the report.

tabular_format

Tabular format

scalar<string> // default: "delim"

Format for pretty tabular data (the graph data, etc.), meant for the end-user to peruse and linked to in reports. One of "delim" (tab-separated delim-files), "xlsx" (requires the writexl-package), "csv" or "csv2" (requires the readr-package), or "dta" or "sav" (requires the haven-package). Currently must be specified; in the future this will become an optional argument.

translations

Translations

list // default: saros:::.saros.env$defaults$translations (optional)

Named list of strings for translations.

Details

This function requires, at a minimum, a dataset (data frames and tibbles are supported so far). Note that saros treats data as they are stored: numeric, integer, factor, ordinal, character, and datetime. Currently, only factor/ordinal and character are implemented. In addition, the chapter_overview must be specified, also as a (small) data frame, with at least the character columns 'chapter' and 'dep'. The 'chapter' column names the output chapters, and the 'dep' column contains the comma-separated names (alternatively, tidyselect syntax) of the columns in data that are to be treated as dependent variables. See chapter_overview for more options.
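Since only factor/ordinal and character columns are currently implemented, it can help to check the storage types of a dataset before drafting a report. The following base R sketch is one way to do this. It is not part of saros, and the data frame and its column names are invented for illustration.

```r
# Invented survey data with one column of each kind
survey <- data.frame(
  a_1     = factor(c("Yes", "No", "Yes")),
  b_1     = factor(c("Low", "High", "Low"),
                   levels = c("Low", "High"), ordered = TRUE),
  age     = c(31, 45, 28),
  comment = c("Fine", NA, "Good"),
  stringsAsFactors = FALSE
)

# TRUE for columns stored as factor (including ordered factors) or character
is_supported <- vapply(
  survey,
  function(col) is.factor(col) || is.character(col),
  logical(1)
)

# Columns stored as another type (here: the numeric column)
unsupported_cols <- names(survey)[!is_supported]
print(unsupported_cols)
```

Columns listed in unsupported_cols would need to be converted, or left out of the 'dep' and 'indep' columns of chapter_overview.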

Value

Path to index qmd-file. If not specified in the yaml_path file, will default to index.qmd.

Examples


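# Draft a single regular report in a temporary directory
index_filepath <-
  draft_report(
    chapter_overview = ex_survey_ch_overview,
    data = ex_survey,
    path = tempdir())

# Draft one report per mesos group, with groups given by the column "f_uni"
index_filepaths <-
  draft_report(
    chapter_overview = ex_survey_ch_overview,
    data = ex_survey,
    mesos_report = TRUE,
    mesos_var = "f_uni",
    path = tempdir())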


[Package saros version 1.0.4 Index]