parse_chat {WhatsR}    R Documentation

Parsing exported 'WhatsApp' chat logs as a dataframe

Description

Creates a data frame from an exported 'WhatsApp' chat log containing one row per message. Some columns are saved as lists using the I() function so that multiple elements can be stored per message while still maintaining the general structure of one row per message. Treat these columns as lists, or unlist them before further processing.
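The list-column layout described above can be illustrated with base R alone. This is a minimal sketch on toy data: the column names Sender and Emoji and the emoji labels are hypothetical and are not taken from the actual output of parse_chat().

```r
# Toy data frame with one row per message (hypothetical column names).
# I() keeps the list as a single column with one element per row.
chat <- data.frame(
  Sender = c("A", "B"),
  Emoji = I(list(c(":smile:", ":heart:"), character(0))),
  stringsAsFactors = FALSE
)

# Option 1: treat the column as a list, one element per message.
emoji_counts <- lengths(chat$Emoji)

# Option 2: collapse each element into a single string per message.
emoji_text <- vapply(chat$Emoji, paste, character(1), collapse = " ")

# Option 3: unlist into one flat vector (the per-message grouping is lost).
all_emoji <- unlist(chat$Emoji)
```

Options 1 and 2 preserve the one-row-per-message structure. Unlisting is convenient for frequency tables but drops the link between an element and its message.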

Usage

parse_chat(
  path,
  os = "auto",
  language = "auto",
  anonymize = "add",
  consent = NA,
  emoji_dictionary = "internal",
  smilie_dictionary = "wikipedia",
  rpnl = " start_newline ",
  verbose = FALSE
)

Arguments

path

Character string containing the file path to the exported 'WhatsApp' chat log as a .txt file.

os

Operating system of the phone the chat was exported from. Default "auto" tries to automatically detect the OS. Also supports "android" or "iOS".

language

Language setting of the phone from which the messages were exported. Default "auto" tries to match either 'English' or 'German'. More languages might be supported in the future.

anonymize

TRUE anonymizes the vector of sender names and deletes or restricts columns containing personally identifiable information. FALSE displays the actual names and all content. "add" adds anonymized columns alongside the full-information columns. Do not blindly trust the anonymization; always double-check the result.

consent

String containing a consent message. All messages from chatters who have not posted this *exact* message into the chat will be deleted. Default is NA, in which case nothing is deleted.
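The effect of the consent argument can be sketched in base R on toy data. This is an illustration of the behavior described above, not the package's implementation: it assumes that "exact" means plain string equality, and the column names Sender and Message are hypothetical.

```r
# Toy chat with hypothetical column names.
chat <- data.frame(
  Sender  = c("Alice", "Bob", "Alice", "Carol", "Bob"),
  Message = c("hi", "I consent", "I CONSENT", "I consent ", "bye"),
  stringsAsFactors = FALSE
)

consent <- "I consent"

# Senders who posted the consent string exactly (assumed: case and
# whitespace sensitive, so "I CONSENT" and "I consent " do not count).
consenting <- unique(chat$Sender[chat$Message == consent])

# Keep all messages from consenting senders, drop everyone else's.
kept <- chat[chat$Sender %in% consenting, , drop = FALSE]
```

In this toy chat only Bob posted the exact string, so only his two messages remain. Under the stated assumption, a different capitalization or a trailing space would remove a participant's messages entirely, which is worth checking before relying on the filter.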

emoji_dictionary

Dictionary for emoji matching. Can use a version included in this package when set to "internal" or an updated data frame created by download_emoji passed as a character string containing the path to the file.

smilie_dictionary

Value "emoticons" uses ex_emoticon to extract smilies, "wikipedia" uses a more inclusive custom list of smilies containing all mentions from https://de.wiktionary.org/w/index.php?title=Verzeichnis:International/Smileys and manually added ones.

rpnl

Replace newline. A character string that replaces line breaks within messages in the parsed output, for better readability. Default is " start_newline ".
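As a rough illustration of what such a replacement does to a multi-line message, the following base R sketch substitutes the default marker for a line break and then splits on it again. It is not the package's internal code, and the message text is made up.

```r
# A made-up message that spans two lines.
raw_message <- "first line\nsecond line"

# Default marker from the Usage section.
rpnl <- " start_newline "

# Replace the line break so the message fits on a single row.
flat <- gsub("\n", rpnl, raw_message, fixed = TRUE)

# The marker can later be used to recover the original lines.
parts <- strsplit(flat, rpnl, fixed = TRUE)[[1]]
```

A distinctive marker such as the default makes it unlikely that the split matches ordinary message text.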

verbose

If TRUE, prints progress messages for parse_chat() to the console. Default is FALSE.

Value

A data frame containing one row per message and 11, 15, or 19 columns, depending on the setting of the anonymize parameter.

Examples

data <- parse_chat(system.file("englishandroid24h.txt", package = "WhatsR"))
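A slightly longer variant of the example may help when exploring the result. It is a sketch that uses only arguments listed under Usage, runs only if the WhatsR package is installed, and makes no assumption about which columns are returned.

```r
# Sketch only: requires the WhatsR package; skipped if it is not installed.
if (requireNamespace("WhatsR", quietly = TRUE)) {
  path <- system.file("englishandroid24h.txt", package = "WhatsR")

  # Request anonymized output and print progress messages.
  chat <- WhatsR::parse_chat(path, anonymize = TRUE, verbose = TRUE)

  # One row per message; the column count depends on 'anonymize'.
  print(dim(chat))

  # Identify the list columns, which need unlisting or list-wise handling.
  list_cols <- names(chat)[vapply(chat, is.list, logical(1))]
  print(list_cols)
}
```

Comparing the output of dim() across the three anonymize settings is a quick way to see how the number of columns changes.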

[Package WhatsR version 1.0.4 Index]