preprocess_tweets {CooRTweet} | R Documentation |
preprocess_tweets
Description
Reformats nested Twitter data (retrieved from the Twitter V2 API).
Spreads out columns and reformats a nested data.table into a named list of unnested data.tables.
All output is in long format.
Usage
preprocess_tweets(
tweets,
tweets_cols = c("possibly_sensitive", "lang", "text", "public_metrics_retweet_count",
"public_metrics_reply_count", "public_metrics_like_count",
"public_metrics_quote_count")
)
Arguments
tweets |
a data.table to unnest: Twitter data loaded with load_tweets_json. |
tweets_cols |
a character vector specifying the columns to keep (optional). |
Details
Restructure your nested Twitter data that you loaded with load_tweets_json. The function unnests the following columns:
public_metrics (likes, retweets, quotes),
referenced_tweets (IDs of "replied to" and "retweet"),
entities (hashtags, URLs, other accounts).
Returns a named list of several data.tables, each representing one aspect of the nested data.
The function also expects that the following additional columns are present in the data.table: created_at, tweet_id, author_id, conversation_id, text, in_reply_to_user_id.
Implicitly dropped columns: edit_history_tweet_ids
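To illustrate what unnesting into long format means, the following is a minimal sketch on toy data using only data.table. It is not the package's implementation; the table `nested` and its list column `hashtags` are hypothetical stand-ins for a nested column such as entities.

```r
library(data.table)

# Toy nested input (hypothetical): one row per tweet,
# with the hashtags of each tweet stored in a list column.
nested <- data.table(
  tweet_id = c("1", "2", "3"),
  hashtags = list(c("rstats", "datatable"), character(0), "rstats")
)

# Long format: one row per (tweet_id, hashtag) pair.
# Each tweet_id is repeated once per hashtag it carries, so a tweet
# without hashtags contributes no rows.
hashtags_long <- data.table(
  tweet_id = rep(nested$tweet_id, lengths(nested$hashtags)),
  hashtag  = unlist(nested$hashtags)
)

print(hashtags_long)
```

Keeping tweet_id in every unnested table is what would allow the pieces to be joined back to the main tweets table.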
Value
a named list with 5 data.tables:
tweets (contains all tweets and their meta-data),
referenced (information on referenced tweets),
urls (all urls mentioned in tweets),
mentions (other accounts mentioned in tweets),
hashtags (hashtags mentioned in tweets)
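Examples
A hedged usage sketch of the workflow described above. The directory `data_dir` is a placeholder, and passing it as the first argument of load_tweets_json is an assumption not stated on this page; the guard skips the calls when the package or the directory is unavailable.

```r
# Names of the list elements documented in the Value section.
expected_tables <- c("tweets", "referenced", "urls", "mentions", "hashtags")

# Hypothetical directory holding JSON files retrieved from the Twitter V2 API.
data_dir <- "path/to/twitter_json"

# Run only if the package is installed and the placeholder path exists.
if (requireNamespace("CooRTweet", quietly = TRUE) && dir.exists(data_dir)) {
  # Assumption: load_tweets_json() accepts the directory as its first argument.
  raw <- CooRTweet::load_tweets_json(data_dir)

  # Unnest with the default tweets_cols.
  tweets <- CooRTweet::preprocess_tweets(raw)

  # Show any documented element that is missing from the result.
  print(setdiff(expected_tables, names(tweets)))

  # Keep only a subset of the optional columns.
  slim <- CooRTweet::preprocess_tweets(raw, tweets_cols = c("lang", "text"))
  print(names(slim$tweets))
}
```

Individual tables can then be accessed by name, for example `tweets$hashtags` or `tweets$urls`.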