cocktails {cocktailApp} | R Documentation |
Cocktails Data
Description
Ingredients of over 26 thousand cocktails, scraped from the web.
Usage
data(cocktails)
Format
A data.frame object with around 117,000 rows and 12 columns. The data were scraped from four websites: Difford's Guide, Webtender, and Kindred Cocktails, all scraped in late 2017; and Drinks Mixer, scraped in mid 2018.
The columns are defined as follows:
amt
The numeric amount of the ingredient.
unit
The unit corresponding to the amount. The most common entry is fl oz, which is the unit for ‘main’ ingredients. The second most common entry is garnish. These two units account for over 95 percent of the rows of the data.
ingredient
The name of the ingredient. These may have odd qualifiers or brand specifications. Some of these qualifications are stripped out in the short_ingredient field.
cocktail
The name of the cocktail.
rating
The rating assigned to the cocktail in the upstream database. For some sources, the ratings have been rescaled. Ratings are on a scale of 0 to 5.
upstream_id
An ID code from the upstream source.
url
The upstream URL.
votes
The number of votes in the rating, from the upstream database. Not always available.
added
The date the cocktail was added to the upstream database. Not always available.
src
The source of the cocktail, as listed in the upstream database. Usually not available.
short_ingredient
A shortened form of the ingredient, stripping away some of the qualifiers. This is subject to change in future releases of this package, when a better term extraction solution is found.
proportion
For ingredients where the unit is fl oz, this is the proportion of the given cocktail that consists of the given ingredient. For a given cocktail, the proportions should sum to one.
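The unit and proportion columns described above can be sanity-checked with a few lines of base R. The sketch below runs on a small made-up data frame (toy, with invented cocktails "A" and "B") so that it works without the package installed. It assumes that proportion is the ingredient's amt divided by the total fl oz amt of its cocktail, which is consistent with the description but is not confirmed by it.

```r
# Hypothetical stand-in for the cocktails data; all values are invented.
toy <- data.frame(
  cocktail   = c("A", "A", "A", "B", "B"),
  ingredient = c("Bourbon", "Averna", "Orange twist", "Gin", "Dry vermouth"),
  amt        = c(2, 1, 1, 2.5, 0.5),
  unit       = c("fl oz", "fl oz", "garnish", "fl oz", "fl oz"),
  stringsAsFactors = FALSE
)

# Share of rows accounted for by each unit.
unit_share <- prop.table(table(toy$unit))

# Assumed definition: amt over the cocktail's total fl oz amt.
# Rows with other units are left as NA.
is_floz <- toy$unit == "fl oz"
toy$proportion <- NA_real_
toy$proportion[is_floz] <- toy$amt[is_floz] /
  ave(toy$amt[is_floz], toy$cocktail[is_floz], FUN = sum)

# Per-cocktail sums of the proportions; each should be (numerically) one.
prop_sums <- tapply(toy$proportion[is_floz], toy$cocktail[is_floz], sum)

print(unit_share)
print(prop_sums)
```

With the package installed, the same tapply() check could be applied to the proportion column of cocktails after data(cocktails), restricted to rows where unit is fl oz.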
Note
The data were scraped from several websites, a practice that falls in a legal gray area. While, in general, raw factual data cannot be copyrighted, there is a difference between the law and a lawsuit. The package author in no way claims any copyright on this data.
Source
Difford's Guide, https://www.diffordsguide.com/; Webtender, https://www.webtender.com; Kindred Cocktails, https://kindredcocktails.com; Drinks Mixer, http://www.drinksmixer.com.
Examples
data(cocktails)
str(cocktails)
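library(dplyr)
# Cocktails containing both Averna and Bourbon, best rated first.
cocktails %>%
  filter(short_ingredient %in% c('Averna','Bourbon')) %>%
  group_by(cocktail,url) %>%
  # keep only cocktails in which more than one of the two ingredients matched
  mutate(isok=n() > 1) %>%
  ungroup() %>%
  filter(isok) %>%
  arrange(desc(rating),cocktail) %>%
  select(cocktail,ingredient,amt,unit,rating) %>%
  head(n=8)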