R: preprocess

preprocess_tweets {CooRTweet}

R Documentation

preprocess_tweets

Description

Reformat nested Twitter data (retrieved from the Twitter V2 API). Spreads out columns and reformats a nested data.table into a named list of unnested data.tables. All output is in long format.

Usage

preprocess_tweets(
  tweets,
  tweets_cols = c("possibly_sensitive", "lang", "text", "public_metrics_retweet_count",
    "public_metrics_reply_count", "public_metrics_like_count",
    "public_metrics_quote_count")
)

Arguments

`tweets`	a data.table to unnest. Twitter data loaded with load_tweets_json.
`tweets_cols`	a character vector specifying the columns to keep (optional).

Details

Restructures nested Twitter data loaded with load_tweets_json. The function unnests the following columns: public_metrics (likes, retweets, quotes), referenced_tweets (IDs of "replied to" and "retweet"), entities (hashtags, URLs, other accounts). It returns a named list of data.tables, each representing one aspect of the nested data. The function also expects the following additional columns to be present in the data.table: created_at, tweet_id, author_id, conversation_id, text, in_reply_to_user_id. Implicitly dropped columns: edit_history_tweet_ids.
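The column requirements listed above can be checked before calling the function. The following is a minimal sketch, assuming the data.table package is available; missing_cols is a hypothetical helper (not part of CooRTweet), and the toy table only stands in for real output of load_tweets_json.

```r
library(data.table)

# Columns that the Details section says preprocess_tweets() expects.
required_cols <- c("created_at", "tweet_id", "author_id",
                   "conversation_id", "text", "in_reply_to_user_id")

# Hypothetical helper (not part of CooRTweet): return the required
# column names that are absent from the given data.table.
missing_cols <- function(dt, required = required_cols) {
  setdiff(required, names(dt))
}

# Toy table standing in for data loaded with load_tweets_json().
toy <- data.table(tweet_id = "1", author_id = "10", text = "hello")

# Lists the expected columns that the toy table lacks.
missing_cols(toy)
```

An empty result would suggest the table carries every column named in this section; it does not verify the nested columns (public_metrics, referenced_tweets, entities).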

Value

A named list with 5 data.tables: tweets (all tweets and their metadata), referenced (information on referenced tweets), urls (all URLs mentioned in tweets), mentions (other accounts mentioned in tweets), hashtags (hashtags mentioned in tweets).
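A typical call might look like the sketch below. It assumes the CooRTweet package is installed and that data_dir (a placeholder path, not a real location) holds Twitter V2 API JSON files. Passing the directory as the first argument of load_tweets_json is an assumption about that function's interface and is not stated on this page. The guard makes the snippet a no-op when either assumption does not hold.

```r
# Placeholder path: replace with a directory of Twitter V2 API JSON files.
data_dir <- "path/to/twitter_json"

if (requireNamespace("CooRTweet", quietly = TRUE) && dir.exists(data_dir)) {
  # Assumption: load_tweets_json() takes the data directory as its
  # first argument.
  raw <- CooRTweet::load_tweets_json(data_dir)

  # Unnest with the default tweets_cols shown in the Usage section.
  out <- CooRTweet::preprocess_tweets(raw)

  # Per the Value section, the list elements are named
  # tweets, referenced, urls, mentions, hashtags.
  print(names(out))

  # Each element is a data.table and can be inspected on its own.
  print(head(out$tweets))
  print(head(out$hashtags))
}
```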

[Package CooRTweet version 2.0.2 Index]