frequency_table {dataframeexplorer}    R Documentation

Generate the frequency of each entry in each column of a data.frame

Description

Real-life data is seldom perfect, and fields in a data.frame often contain entries the data scientist did not anticipate. This function helps you get to know the entries in your data before manipulating it. It writes an Excel workbook of frequency tables, with each column of the input data.frame on a separate sheet. Warning: an Excel sheet supports only 2^20 rows of data (1,048,576, roughly 1 million). If the number of unique entries in a column exceeds that, the low-frequency entries are dropped.
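The row limit in the warning above can be checked before calling the function. The following is a minimal base R sketch, not part of the package: the names unique_counts, excel_row_limit and too_many are illustrative, and it assumes that NA counts as one entry.

```r
# Count distinct entries per column (NA, if present, counts as one entry).
unique_counts <- vapply(iris, function(x) length(unique(x)), integer(1))

# Row limit of a single Excel sheet, as stated in the warning above.
excel_row_limit <- 2^20

# Names of columns whose frequency table would exceed the limit.
too_many <- names(unique_counts)[unique_counts > excel_row_limit]

print(unique_counts)
print(too_many)  # empty for iris, which has only 150 rows
```

Columns listed in too_many are the ones where low-frequency entries would be lost in the output.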

Usage

frequency_table(
  dataset,
  output_filename = "",
  maximum_entries = 2^20,
  format_width = TRUE,
  sl_no_required = TRUE,
  frequency_required = TRUE,
  percentage_required = TRUE,
  cumulative_percentage_required = FALSE,
  string_length_required = TRUE
)

Arguments

dataset

A data.frame

output_filename

Name of the output Excel file (should end in ".xlsx"). Passing this parameter is strongly advised; otherwise the function defaults to "frequency_table_<system_time>.xlsx"

maximum_entries

Maximum number of unique entries in the output. For example, setting this parameter to 10000 returns only the 10000 most frequent entries in each column
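The effect of maximum_entries can be pictured with a small base R sketch. This is an illustration under assumptions, not the package's own implementation: top_entries is a hypothetical helper, and how the package orders entries with equal frequency is not documented here.

```r
# Hypothetical helper: keep only the most frequent entries of a vector.
top_entries <- function(x, maximum_entries = 2^20) {
  # Count each distinct entry (including NA, if present), most frequent first.
  counts <- sort(table(x, useNA = "ifany"), decreasing = TRUE)
  # Keep at most maximum_entries of them.
  counts[seq_len(min(length(counts), maximum_entries))]
}

# mtcars$cyl has three distinct values; keep the two most frequent.
res <- top_entries(mtcars$cyl, maximum_entries = 2)
print(res)
```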

format_width

Boolean input indicating whether the column widths in the output Excel file should be set to "auto"

sl_no_required

Boolean input indicating whether the Sl_No column should be present in the output Excel file

frequency_required

Boolean input indicating whether the Frequency column should be present in the output Excel file

percentage_required

Boolean input indicating whether the Percentage column should be present in the output Excel file

cumulative_percentage_required

Boolean input indicating whether the Cumulative_Percentage column should be present in the output Excel file

string_length_required

Boolean input indicating whether the String_Length column should be present in the output Excel file
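To make the optional columns concrete, here is a base R sketch of how such a table could be built for one column. It is an assumption-laden illustration, not the package's code: column_frequency and the Entry column name are hypothetical, and the package's actual rounding and formatting are not documented here.

```r
# Hypothetical helper: build a frequency table for a single vector.
column_frequency <- function(x) {
  # Count each distinct entry, most frequent first.
  counts <- sort(table(x, useNA = "ifany"), decreasing = TRUE)
  entry <- names(counts)
  frequency <- as.integer(counts)
  percentage <- 100 * frequency / sum(frequency)
  data.frame(
    Sl_No = seq_along(frequency),                # running serial number
    Entry = entry,                               # the distinct value, as text
    Frequency = frequency,                       # number of occurrences
    Percentage = percentage,                     # share of all rows
    Cumulative_Percentage = cumsum(percentage),  # running total of the share
    String_Length = nchar(entry),                # characters in the entry
    stringsAsFactors = FALSE
  )
}

freq <- column_frequency(iris$Species)
print(freq)
```

Each *_required argument would then correspond to keeping or dropping one of these columns.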

Value

Returns nothing to the calling function; the output is written to the file system instead
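Because the result lives on disk, one way to confirm that a call succeeded is to check for the file afterwards. The sketch below writes to a temporary directory and is guarded so that it is skipped when the package is not installed; the variable name out is illustrative.

```r
# Build an output path in the session's temporary directory.
out <- file.path(tempdir(), "frequency_table_iris.xlsx")

if (requireNamespace("dataframeexplorer", quietly = TRUE)) {
  # The function is called for its side effect of writing the file.
  dataframeexplorer::frequency_table(dataset = iris, output_filename = out)
  print(file.exists(out))
} else {
  message("dataframeexplorer is not installed; skipping the call.")
}
```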

Examples

## Not run: 
frequency_table(dataset = iris, output_filename = "frequency_table_iris.xlsx")
frequency_table(dataset = mtcars, output_filename = "C:/Users/Desktop/frequency_table_mtcars.xlsx")

## End(Not run)

[Package dataframeexplorer version 1.0.2 Index]