SardiniaHotels {VGAMdata}    R Documentation

Data from hotels in Sardinia, Italy

Description

This data set contains information and satisfaction scores appearing on the TripAdvisor website between the years 2008 and 2016 regarding hotels in Sardinia, Italy.

The satisfaction data refer to the reputation of hotels located along the Sardinian coast, as expressed by clients with respect to the different services (e.g., breakfast, restaurant, swimming pool) offered by each hotel.

Usage

data(SardiniaHotels)

Format

A data frame with 518 rows and 43 columns (variables). Each row refers to a single hotel.

The following variables are included in the dataset:

municipality

a factor, the municipality where the hotel is located.

stars

an ordered factor with levels:

1OR2stars, for 1 star or 2 star hotels,

3stars, for 3 star hotels,

residence,

4stars, for 4 star hotels,

5starsORresort, for 5 star hotels or resorts.

area

a factor with levels related to the area of the Sardinian coast where each single hotel is located:

AlgheroSassari, CagliariVillasimius, CostaSmeralda, DorgaliOrosei, Gallura, NurraAnglona, Ogliastra, Olbia, OristanoBosa, PulaChia, Sarrabus, Sulcis.

seaLocation

a factor with levels yes (if the hotel is located close to the sea) and no (otherwise).

excellent

a numeric vector, the number of people that expressed the highest level of satisfaction.

good

a numeric vector, the number of people that expressed a good level of satisfaction.

average

a numeric vector, the number of people that expressed an average level of satisfaction.

bad

a numeric vector, the number of people that expressed a bad level of satisfaction.

poor

a numeric vector, the number of people that expressed the lowest level of satisfaction.

family

a numeric vector, the number of people travelling with family.

couple

a numeric vector, the number of people travelling with their partner.

single

a numeric vector, the number of people travelling alone.

business

a numeric vector, the number of people travelling for work.

MarMay

a numeric vector, the number of people travelling during the period March to May.

JunAug

a numeric vector, the number of people travelling during the period June to August.

SepNov

a numeric vector, the number of people travelling during the period September to November.

DecFeb

a numeric vector, the number of people travelling during the period December to February.

location

a numeric vector, the satisfaction score expressed by tourists towards the location.

sleepQuality

a numeric vector, the satisfaction score expressed by tourists towards the sleep quality.

room

a numeric vector, the satisfaction score expressed by tourists towards the comfort and quality of the room.

services

a numeric vector, the satisfaction score expressed by tourists towards the quality of the services.

priceQualityRate

a numeric vector, the satisfaction score expressed by tourists towards the ratio between price and quality.

cleaning

a numeric vector, the satisfaction score expressed by tourists towards the level of room and hotel cleaning.

bt1

a factor with levels breakfast, cleaning, location, overall, price, restaurant, room, services, staff, structure and Wi-Fi.

It gives the most frequently used word in the reviews of a hotel.

ratebt1

a factor with levels -1 (if the satisfaction score expressed in bt1 is prevalently negative) and 1 (if the satisfaction score expressed in bt1 is prevalently positive).

bt2

a factor with levels breakfast, cleaning, location, overall, price, restaurant, room, services, staff, structure and Wi-Fi.

It gives the second most frequently used word in the reviews of a hotel.

ratebt2

a factor with levels -1 (if the satisfaction score expressed in bt2 is prevalently negative) and 1 (if the satisfaction score expressed in bt2 is prevalently positive).

bt3

similar to bt1 and bt2, but for the 3rd most used word.

bt4

similar to bt1 and bt2, but for the 4th most used word.

bt5

similar to bt1 and bt2, but for the 5th most used word.

bt6

similar to bt1 and bt2, but for the 6th most used word.

bt7

similar to bt1 and bt2, but for the 7th most used word.

bt8

similar to bt1 and bt2, but for the 8th most used word.

bt9

similar to bt1 and bt2, but for the 9th most used word.

bt10

similar to bt1 and bt2, but for the 10th most used word.

ratebt3

similar to ratebt1 and ratebt2, but referring to bt3.

ratebt4

similar to ratebt1 and ratebt2, but referring to bt4.

ratebt5

similar to ratebt1 and ratebt2, but referring to bt5.

ratebt6

similar to ratebt1 and ratebt2, but referring to bt6.

ratebt7

similar to ratebt1 and ratebt2, but referring to bt7.

ratebt8

similar to ratebt1 and ratebt2, but referring to bt8.

ratebt9

similar to ratebt1 and ratebt2, but referring to bt9.

ratebt10

similar to ratebt1 and ratebt2, but referring to bt10.
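
The five rating counts (excellent, good, average, bad, poor) are easier to compare across hotels once they are turned into proportions. The following is a minimal sketch of one way to do this, not part of the package: the helper name rating_shares is hypothetical, and it assumes the five columns are numeric counts as described above.

```r
# Hypothetical helper (not in VGAMdata): convert the five rating counts
# into row-wise proportions, one row per hotel.
rating_shares <- function(hotels) {
  cols <- c("excellent", "good", "average", "bad", "poor")
  counts <- as.matrix(hotels[, cols])
  totals <- rowSums(counts)
  # Hotels with zero ratings get NA shares instead of NaN from 0/0.
  totals[!is.na(totals) & totals == 0] <- NA
  # A vector of length nrow(counts) recycles down the columns,
  # so each row is divided by its own total.
  counts / totals
}

# Usage, only if the package is installed.
if (requireNamespace("VGAMdata", quietly = TRUE)) {
  data(SardiniaHotels, package = "VGAMdata")
  print(head(rating_shares(SardiniaHotels)))
}
```

Returning NA for hotels without ratings is a design choice of this sketch; such rows could equally be dropped before the division.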

Details

These data were manually collected during March–June 2016 by students of the class of "Statistics for Tourism" at the University of Cagliari, Italy (Bachelor's degree in Tourism Economics and Management), under the supervision of Prof. Claudio Conversano and Dr. Giulia Contu.

Many of the variables fall into several natural groups, e.g., [municipality, stars, area, seaLocation]; [excellent, good, average, bad, poor]; [MarMay, JunAug, SepNov, DecFeb]; [family, couple, single, business]; [location, ..., cleaning]; [bt1, ..., bt10]; [ratebt1, ..., ratebt10].
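
The groups above can be written down as a named list of column names, which makes it convenient to select one group at a time. This is only a sketch: the function name hotel_variable_groups and the group labels (hotel, rating, season, traveller, score, word, wordRate) are hypothetical, and the column names are taken from the Format section rather than checked against the data.

```r
# Hypothetical helper (not in VGAMdata): the natural variable groups,
# as a named list of column names taken from the Format section.
hotel_variable_groups <- function() {
  list(
    hotel     = c("municipality", "stars", "area", "seaLocation"),
    rating    = c("excellent", "good", "average", "bad", "poor"),
    season    = c("MarMay", "JunAug", "SepNov", "DecFeb"),
    traveller = c("family", "couple", "single", "business"),
    score     = c("location", "sleepQuality", "room", "services",
                  "priceQualityRate", "cleaning"),
    word      = paste0("bt", 1:10),
    wordRate  = paste0("ratebt", 1:10)
  )
}

# Usage, only if the package is installed.
if (requireNamespace("VGAMdata", quietly = TRUE)) {
  data(SardiniaHotels, package = "VGAMdata")
  groups <- hotel_variable_groups()
  # Report any documented name that is not a column of the data frame.
  print(setdiff(unlist(groups), names(SardiniaHotels)))
  # Summarise one group, keeping only the columns that are present.
  season_cols <- intersect(groups$season, names(SardiniaHotels))
  print(summary(SardiniaHotels[, season_cols, drop = FALSE]))
}
```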

Source

TripAdvisor, https://www.tripadvisor.it/.

Examples

data(SardiniaHotels)
summary(SardiniaHotels)
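
As a further illustration, the six satisfaction scores could be averaged within each level of stars. The sketch below is an assumption about how one might explore the data, not an analysis from the package authors; the helper name mean_scores_by_stars is hypothetical, and missing scores are simply ignored when averaging.

```r
# Hypothetical helper (not in VGAMdata): mean of each satisfaction score
# within each level of 'stars', ignoring missing values.
mean_scores_by_stars <- function(hotels) {
  scores <- c("location", "sleepQuality", "room", "services",
              "priceQualityRate", "cleaning")
  aggregate(hotels[, scores], by = list(stars = hotels$stars),
            FUN = mean, na.rm = TRUE)
}

# Usage, only if the package is installed.
if (requireNamespace("VGAMdata", quietly = TRUE)) {
  data(SardiniaHotels, package = "VGAMdata")
  print(mean_scores_by_stars(SardiniaHotels))
}
```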

[Package VGAMdata version 1.1-9 Index]