| preprocess_tweets {CooRTweet} | R Documentation |
preprocess_tweets
Description
Reformat nested Twitter data (retrieved from Twitter V2 API).
Spreads out columns and reformats a nested data.table into
a named list of unnested data.tables.
All output is in long format.
Usage
preprocess_tweets(
tweets,
tweets_cols = c("possibly_sensitive", "lang", "text", "public_metrics_retweet_count",
"public_metrics_reply_count", "public_metrics_like_count",
"public_metrics_quote_count")
)
Arguments
| tweets | a data.table to unnest: Twitter data loaded with load_tweets_json. |
| tweets_cols | a character vector specifying the columns to keep (optional). |
Details
Restructure your nested Twitter data that you loaded with
load_tweets_json. The function unnests the following columns:
public_metrics (likes, retweets, quotes),
referenced_tweets (IDs of "replied to" and "retweet"),
entities (hashtags, URLs, other accounts).
Returns a named list of data.tables,
each representing one aspect of the nested data.
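To make "unnesting to long format" concrete, here is a toy illustration in base R. It is not the package's implementation, only a sketch of the idea: the data frame `nested`, its invented values, and the name `hashtags_long` are all assumptions made for this example.

```r
# Toy nested data: one row per tweet, hashtags stored in a list column.
# The values are invented for illustration only.
nested <- data.frame(tweet_id = c("1", "2", "3"), stringsAsFactors = FALSE)
nested$hashtags <- list(c("rstats", "dataviz"), character(0), "rstats")

# Unnest to long format: one row per (tweet_id, hashtag) pair.
# Tweets without hashtags contribute no rows.
n_tags <- lengths(nested$hashtags)
hashtags_long <- data.frame(
  tweet_id = rep(nested$tweet_id, times = n_tags),
  hashtag = unlist(nested$hashtags),
  stringsAsFactors = FALSE
)

print(hashtags_long)
```

The long table repeats tweet_id once per hashtag, which is why each aspect of the nested data ends up in its own table rather than in extra columns of the tweets table.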
The function also expects that the following additional
columns are present in the data.table:
created_at, tweet_id, author_id,
conversation_id, text,
in_reply_to_user_id.
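Because the function expects these columns, it can help to check for them before calling it. The sketch below uses a hypothetical helper, `missing_tweet_cols`, which is not part of CooRTweet; the toy input is likewise invented.

```r
# Columns the documentation lists as expected in addition to the nested ones.
expected_cols <- c("created_at", "tweet_id", "author_id",
                   "conversation_id", "text", "in_reply_to_user_id")

# Hypothetical helper (not part of CooRTweet): returns the expected
# columns that are absent from `tweets`.
missing_tweet_cols <- function(tweets, expected = expected_cols) {
  setdiff(expected, names(tweets))
}

# Toy input that lacks two of the expected columns.
toy <- data.frame(tweet_id = "1", author_id = "42", text = "hello",
                  created_at = "2023-01-01T00:00:00.000Z",
                  stringsAsFactors = FALSE)
missing_tweet_cols(toy)
```

An empty result means all expected columns are present; the helper works on data.tables as well, since it only inspects column names.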
Implicitly dropped columns: edit_history_tweet_ids
Value
a named list with 5 data.tables:
tweets (contains all tweets and their meta-data),
referenced (information on referenced tweets),
urls (all URLs mentioned in tweets),
mentions (other accounts mentioned in tweets),
hashtags (hashtags mentioned in tweets)
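A minimal usage sketch follows. It assumes the CooRTweet package is installed and that `load_tweets_json` accepts a directory of Twitter V2 JSON files as its first argument; `data_dir` is a hypothetical placeholder path. The guard skips the calls when either assumption does not hold.

```r
# Sketch only: needs the CooRTweet package and a folder of Twitter V2 JSON files.
# `data_dir` is a hypothetical path; replace it with your own.
data_dir <- "path/to/twitter_json"

if (requireNamespace("CooRTweet", quietly = TRUE) && dir.exists(data_dir)) {
  # Load the nested data, then unnest it with the default tweets_cols.
  raw <- CooRTweet::load_tweets_json(data_dir)
  tweets <- CooRTweet::preprocess_tweets(raw)

  # The result is a named list; each element is a data.table.
  print(names(tweets))
  print(head(tweets$hashtags))
}
```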