geotag_tweets {epitweetr}    R Documentation

Launches the geo-tagging loop

Description

This function geolocates all tweets collected before the current hour that have not already been geolocated.

Usage

geotag_tweets(tasks = get_tasks())

Arguments

tasks

Tasks object for reporting progress and error messages, default: get_tasks()

Details

It geolocates tweets by collection date and stores the result in the tweets/geolocated folder. Processing runs from the last geolocated date up to the date of the last collected tweet. When run on a day that has been partially geolocated, it skips tweets that have already been processed.
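
The incremental behaviour described above can be illustrated with a minimal base R sketch. This is not the epitweetr implementation: the helpers dates_to_geotag and tweets_to_process, and the sample dates and ids, are hypothetical and only show the idea of resuming from the last geolocated date and skipping tweets that were already processed.

```r
# Hypothetical helper (not part of epitweetr): select the collection dates
# that still need geolocation.
dates_to_geotag <- function(collected, geolocated) {
  collected <- sort(unique(as.Date(collected)))
  # Nothing geolocated yet: every collected date is pending
  if (length(geolocated) == 0) return(collected)
  last_done <- max(as.Date(geolocated))
  # Keep the last geolocated date too, since it may be only partially done
  collected[collected >= last_done]
}

# Hypothetical helper (not part of epitweetr): within a day, drop tweets
# whose ids have already been geolocated.
tweets_to_process <- function(ids, done_ids) {
  ids[!ids %in% done_ids]
}

# Assumed sample data, for illustration only
collected  <- c("2021-03-01", "2021-03-02", "2021-03-03", "2021-03-04")
geolocated <- c("2021-03-01", "2021-03-02")

pending_dates <- dates_to_geotag(collected, geolocated)
pending_ids   <- tweets_to_process(c("t1", "t2", "t3"), done_ids = c("t1"))

print(pending_dates)
print(pending_ids)
```

Including the last geolocated date in the pending set is what allows a partially processed day to be completed on the next run, while the id filter avoids geolocating the same tweet twice.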

The geolocation is applied to several fields of tweets: text, original text (if retweet or quote), user description, user declared location, user biography, API location. For each field it will perform the following steps:

This algorithm has mainly been developed in Spark.

This function requires that the search_loop has already stored collected tweets in the search folder and that the tasks download_dependencies, update_geonames and update_languages have run successfully. Normally, this function is not called directly by the user but by the detect_loop function.

Value

The list of tasks, updated with the messages produced.

See Also

download_dependencies

update_geonames

update_languages

detect_loop

aggregate_tweets

get_tasks

Examples

if(FALSE){
   library(epitweetr)
   # setting up the data folder
   message('Please choose the epitweetr data directory')
   setup_config(file.choose())

   # geolocating last tweets
   tasks <- geotag_tweets()
}

[Package epitweetr version 0.1.28 Index]