R: Classify News and Non-News Based on keywords in the URL

not_news {rdomains}

R Documentation

Classify News and Non-News Based on Keywords in the URL

Description

Classifies URLs as news or non-news by matching keywords in the URL. The matching uses a slightly amended version of the regular expression used to classify news and non-news in “Exposure to ideologically diverse news and opinion on Facebook” by Bakshy, Messing, and Adamic (Science, 2015).

Usage

not_news(url_list = NULL)

Arguments

url_list

character vector of URLs to classify

Details

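Amendment: the pattern uses the keyword sport rather than sports.

Note that the regular expression is based on URL patterns found in a small set of domains, so it may not carry over well to other sites. See the paper for details.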
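
The keyword matching described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the two patterns are hypothetical and heavily shortened, not the regular expressions shipped with rdomains, and classify_urls_sketch is a made-up helper name. The 0/1 coding of the flag columns is also an assumption; only the column names are taken from the Value section.

```r
# Hypothetical, heavily shortened keyword patterns for illustration only;
# these are NOT the regular expressions used by rdomains or the paper.
not_news_pattern <- "sport|entertainment|fashion|travel"
news_pattern <- "politi|world|econ|elect"

# Hypothetical helper that mimics the documented return shape:
# a data.frame with columns url, not_news, news.
# Coding the flags as 0/1 integers is an assumption of this sketch.
classify_urls_sketch <- function(url_list) {
  data.frame(
    url = url_list,
    not_news = as.integer(grepl(not_news_pattern, url_list, ignore.case = TRUE)),
    news = as.integer(grepl(news_pattern, url_list, ignore.case = TRUE)),
    stringsAsFactors = FALSE
  )
}

res <- classify_urls_sketch(c(
  "http://www.bbc.com/sport",
  "http://www.washingtontimes.com/news/politics/"
))
print(res)

# Why "sport" rather than "sports" matters for substring matching:
# the shorter keyword also catches URLs such as bbc.com/sport.
sports_hit <- grepl("sports", "http://www.bbc.com/sport")
sport_hit <- grepl("sport", "http://www.bbc.com/sport")
```

In this sketch a URL can be flagged by both patterns or by neither, because the two columns are computed independently.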

Value

data.frame with 3 columns: url, not_news, news

References

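Bakshy, E., Messing, S., and Adamic, L. A. (2015). Exposure to ideologically diverse news and opinion on Facebook. Science, 348(6239), 1130-1132. https://www.science.org/doi/10.1126/science.aaa1160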

Examples

## Not run: 
not_news("http://www.bbc.com/sport")
not_news(c("http://www.bbc.com/sport", "http://www.washingtontimes.com/news/politics/"))

## End(Not run)

[Package rdomains version 0.2.1 Index]