we8there {textir}    R Documentation

On-Line Restaurant Reviews


Description

Counts for 2804 bigrams in 6175 restaurant reviews from the site www.we8there.com.


Details

The short user-submitted reviews are accompanied by five-star ratings on four specific aspects of restaurant quality (food, service, value, and atmosphere) as well as on the overall experience. The reviews originally appeared in Maua and Cozman (2009), and the parsing details behind these specific counts are in the MNIR paper, Taddy (2013, JASA).



Value

we8thereCounts: A dgCMatrix of phrase counts, with one row per review and one column per bigram.

we8thereRatings: A matrix containing the associated review ratings.
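
The row-by-review, column-by-bigram layout of a sparse count matrix can be sketched on made-up data. The snippet below is only an illustration: the three toy reviews and the names `reviews`, `vocab`, and `toyCounts` are hypothetical and are not part of textir, and it uses `sparseMatrix()` from the Matrix package rather than the parsing pipeline that produced the actual counts.

```r
library(Matrix)  # recommended package shipped with R

## three toy "reviews", already tokenized into stemmed bigrams (made-up data)
reviews <- list(
  c("good food", "good food", "ate here"),
  c("terribl servic"),
  c("ate here", "food fabul", "good food")
)

vocab <- sort(unique(unlist(reviews)))           # bigram vocabulary
i <- rep(seq_along(reviews), lengths(reviews))   # review (row) index per token
j <- match(unlist(reviews), vocab)               # bigram (column) index per token

## repeated (i, j) pairs have their x values added, which yields counts
toyCounts <- sparseMatrix(
  i = i, j = j, x = rep(1, length(i)),
  dims = c(length(reviews), length(vocab)),
  dimnames = list(paste0("review", seq_along(reviews)), vocab)
)

class(toyCounts)                     # storage class of the result
toyCounts["review1", "good food"]    # count of one bigram in one review
rowSums(toyCounts)                   # total bigrams per review
```

Sparse storage matters here because most bigrams never occur in a given short review, so almost all entries of the full counts matrix are zero.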


Author(s)

Matt Taddy, mataddy@gmail.com


References

Maua, D.D. and Cozman, F.G. (2009), Representing and classifying user reviews. In ENIA '09: VIII Encontro Nacional de Inteligencia Artificial, Brazil.

Taddy (2013, JASA), Multinomial Inverse Regression for Text Analysis.

Taddy (2013, AoAS), Distributed Multinomial Regression.

See Also

dmr, srproj


Examples

## some multinomial inverse regression
## we'll regress counts onto 5-star overall rating
library(textir)
data(we8there)

## cl=NULL implies a serial run.
## To run in parallel on a fork cluster from the 'parallel' package,
## uncomment the relevant lines below.
## Forking is Unix-only; use type="PSOCK" on Windows.
cl <- NULL
# library(parallel)
# cl <- makeCluster(detectCores(), type="FORK")
## small nlambda for a fast example
fits <- dmr(cl, we8thereRatings[,'Overall',drop=FALSE], 
			we8thereCounts, bins=5, gamma=1, nlambda=10)
# stopCluster(cl)

## plot fits for a few individual terms
terms <- c("first date","chicken wing",
			"ate here", "good food",
			"food fabul","terribl servic")
for(j in terms){
	plot(fits[[j]])
	mtext(j, font=2, line=2)
}
## extract coefficients
B <- coef(fits)
mean(B[2,]==0) # sparsity in loadings
## some big loadings in IR (ten most negative, then ten most positive)
B[2,order(B[2,])[1:10]]
B[2,order(-B[2,])[1:10]]

## do MNIR projection onto factors
z <- srproj(B,we8thereCounts) 

## fit a fwd model to the factors
summary(fwd <- lm(we8thereRatings$Overall ~ z)) 

## truncate the fwd predictions to our known range
yhat <- pmin(pmax(fitted(fwd), 1), 5)
## plot the fitted rating by true rating
plot(yhat ~ factor(we8thereRatings$Overall),
	varwidth=TRUE, col="lightslategrey")
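
The projection, forward regression, and truncation steps can be imitated on simulated data with base R alone. This is a rough sketch, not the package's implementation: it assumes the projection is the loading-weighted sum of counts divided by document length, as the sufficient reduction is described in the MNIR paper. The toy loadings `phi`, the simulated counts `X`, and all other names here are made up for illustration.

```r
set.seed(1)
n <- 200                                   # number of toy documents
y <- sample(1:5, n, replace = TRUE)        # toy ratings on the 1-5 scale
phi <- c(-1, -0.5, 0, 0, 0.5, 1)           # hypothetical loadings, one per term
m <- rpois(n, 20) + 1                      # document lengths (at least 1)

## simulate multinomial counts whose log-odds are linear in the rating
X <- t(sapply(seq_len(n), function(i) {
  q <- exp(phi * y[i])
  rmultinom(1, size = m[i], prob = q / sum(q))[, 1]
}))

## assumed projection: loading-weighted counts, normalized by document length
z <- drop(X %*% phi) / rowSums(X)

## forward regression of the rating on the one-dimensional projection
fwd <- lm(y ~ z)

## truncate predictions to the known 1-5 range without modifying the lm object
yhat <- pmin(pmax(fitted(fwd), 1), 5)
range(yhat)
```

Working on a copy of the fitted values, rather than assigning into the fitted model, leaves `fwd` intact for later calls such as `summary()` or `predict()`.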

[Package textir version 2.0-5 Index]