parseTweets {streamR} | R Documentation |
Converts tweets in JSON format to a data frame.
Description
This function parses tweets downloaded using filterStream, sampleStream or userStream
and returns a data frame. If a tweet contains 280-character text, the complete text
is returned rather than only the first 140 characters.
Usage
parseTweets(tweets, simplify = FALSE, verbose = TRUE, legacy = FALSE)
Arguments
tweets |
A character string naming the file where tweets are stored or the name of the object in memory where the tweets were saved as strings. |
simplify |
If |
verbose |
logical, default is TRUE |
legacy |
logical, default is FALSE |
Details
parseTweets parses tweets downloaded using the filterStream, sampleStream or userStream
functions and returns a data frame where each row corresponds to one tweet and each column
represents a different field for each tweet (id, text, created_at, etc.).
The total number of tweets that are parsed might be lower than the number of lines in the file or object that contains the tweets because blank lines, deletion notices, and incomplete tweets are ignored.
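The row-count difference described above can be illustrated with a small base-R sketch. It is only an approximation for intuition, not streamR's internal code: the sample lines, the variable names, and the rules used to spot deletion notices (a leading "delete" key) and incomplete tweets (no closing brace) are all assumptions made for this example.

```r
# Illustrative only: a base-R approximation of the line filtering
# described above. This is NOT how streamR is implemented internally.
raw_lines <- c(
  '{"id_str":"1","text":"first tweet","retweet_count":0}',
  '',                                                        # blank line
  '{"delete":{"status":{"id_str":"2","user_id_str":"9"}}}',  # deletion notice
  '{"id_str":"3","text":"truncated twe',                     # incomplete tweet
  '{"id_str":"4","text":"second tweet","retweet_count":2}'
)

# Flag each kind of line that would not become a row in the data frame.
is_blank      <- !nzchar(trimws(raw_lines))
is_deletion   <- grepl('^[[:space:]]*\\{"delete":', raw_lines)
is_incomplete <- !is_blank & !grepl('\\}[[:space:]]*$', raw_lines)

kept <- raw_lines[!(is_blank | is_deletion | is_incomplete)]

length(raw_lines)  # 5 lines in the input
length(kept)       # 2 lines would be parsed into rows
```

Counting lines this way before calling parseTweets may help explain why the resulting data frame has fewer rows than the source file has lines.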
To parse JSON into a list rather than a data frame, see readTweets. That function can be
significantly faster for large files when only a few fields are required.
Note also that the retweet_count field contains the number of times a given tweet
had been retweeted at the time it was captured from the API or, for automatic retweets,
the number of times the original tweet had been retweeted.
Author(s)
Pablo Barbera pablo.barbera@nyu.edu
See Also
filterStream, sampleStream, userStream
Examples
## The dataset example_tweets contains 10 public statuses published
## by @twitterapi in plain text format. The code below converts the object
## into a data frame that can be manipulated by other functions.
data(example_tweets)
tweets.df <- parseTweets(example_tweets, simplify=TRUE, legacy=TRUE)
## Not run:
## A more complete example, that shows how to capture a user's home timeline
## for one hour using authentication via OAuth, and then parsing the tweets
## into a data frame.
library(ROAuth)
reqURL <- "https://api.twitter.com/oauth/request_token"
accessURL <- "https://api.twitter.com/oauth/access_token"
authURL <- "https://api.twitter.com/oauth/authorize"
consumerKey <- "xxxxxyyyyyzzzzzz"
consumerSecret <- "xxxxxxyyyyyzzzzzzz111111222222"
my_oauth <- OAuthFactory$new(consumerKey=consumerKey,
                             consumerSecret=consumerSecret,
                             requestURL=reqURL,
                             accessURL=accessURL,
                             authURL=authURL)
my_oauth$handshake()
userStream(file="my_timeline.json", with="followings",
           timeout=3600, oauth=my_oauth)
tweets.df <- parseTweets("my_timeline.json")
## End(Not run)