flights {graphicalExtremes}    R Documentation

Flights delay data

Description

A dataset containing daily total delays of major U.S. airlines. The raw data was obtained from the U.S. Bureau of Transportation Statistics, and pre-processed as described in Hentschel et al. (2022). Note: The CRAN version of this package contains only data from 2010-2013. The full dataset is available in the GitHub version of this package.

Usage

flights

Format

A named list with three entries:

airports

A data.frame, containing information about US airports

delays

A numeric array, containing daily aggregated delays at the airports in the dataset

flightCounts

A numeric array, containing yearly flight numbers between airports in the dataset

Details

flightCounts is a three-dimensional array, containing the number of flights in the dataset between each pair of airports, aggregated on a yearly basis. Each entry is the total number of flights between the departure airport (row) and destination airport (column) in a given year (dimension 3). This array does not contain any NAs: an airport that did not operate at all in a given year is simply indicated by zeros.
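As a rough illustration of this layout, the sketch below builds a small toy array with the same dimension order (departure airport, destination airport, year) and aggregates it. The airport codes, years, and counts are made up; with the real data, flights$flightCounts would take the place of the toy object counts.

```r
# Toy stand-in for flights$flightCounts: departure x destination x year.
# Airport codes, years, and counts are invented for illustration only.
airportCodes <- c("AAA", "BBB", "CCC")
years <- c("2010", "2011")
counts <- array(
  0,
  dim = c(3, 3, 2),
  dimnames = list(airportCodes, airportCodes, years)
)
counts["AAA", "BBB", "2010"] <- 120
counts["BBB", "AAA", "2010"] <- 115
counts["AAA", "BBB", "2011"] <- 130
counts["BBB", "CCC", "2011"] <- 40

# Departures per airport and year: sum over destinations (dimension 2)
departures <- apply(counts, c(1, 3), sum)

# Arrivals per airport and year: sum over departure airports (dimension 1)
arrivals <- apply(counts, c(2, 3), sum)

# An airport without any flights in a given year shows up as zeros, not NAs
inactive <- (departures + arrivals) == 0
```

Here inactive is a logical matrix (airport by year) that flags airport-year combinations without any recorded flights.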

delays is a three-dimensional array containing daily total positive delays, in minutes, of incoming and outgoing flights. Each row corresponds to a day and each column corresponds to an airport in the dataset. The third dimension has length two: 'arrivals' contains delays of incoming flights and 'departures' contains delays of outgoing flights. Zeros indicate that there were flights arriving/departing at that airport on a given day, but none of them had delays. NAs indicate that there were no flights arriving/departing at that airport on that day at all.
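The difference between zeros and NAs matters when aggregating. The following sketch uses a small toy array with the same dimension order (day, airport, direction) to show one way of telling the two cases apart. The day labels, airport codes, and values are made up, and the format of the row names in the real flights$delays is an assumption here.

```r
# Toy stand-in for flights$delays: day x airport x direction.
# All labels and values are invented for illustration only.
days <- c("2010-01-01", "2010-01-02", "2010-01-03")
airportCodes <- c("AAA", "BBB")
delays <- array(
  c(30, 0, NA,   # arrivals at AAA on days 1 to 3
    10, NA, 5,   # arrivals at BBB
    20, 0, NA,   # departures from AAA
    0, NA, 15),  # departures from BBB
  dim = c(3, 2, 2),
  dimnames = list(days, airportCodes, c("arrivals", "departures"))
)

arr <- delays[, , "arrivals"]

# NA: no arriving flights at that airport on that day
noFlights <- is.na(arr)

# Zero: there were arriving flights, but none of them was delayed
noDelays <- !is.na(arr) & arr == 0

# Total arrival delay per airport, skipping days without flights
totalArr <- colSums(arr, na.rm = TRUE)

# Days on which every airport had at least one arriving flight
completeDays <- rowSums(is.na(arr)) == 0
```

Restricting an analysis to the rows flagged in completeDays is one simple way to obtain a matrix without missing values; whether that is appropriate depends on the analysis.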

airports is a data frame containing the following information about a number of US airports. Some entries are missing, which is indicated by NAs.

IATA

3-letter IATA code

Name

name of the airport

City

main city served by the airport

Country

country or territory where the airport is located (mostly "United States")

ICAO

4-letter ICAO code

Latitude

latitude of the airport, in decimal degrees

Longitude

longitude of the airport, in decimal degrees

Altitude

altitude of the airport, in feet

Timezone

timezone of the airport, in hours offset from UTC

DST

Daylight saving time used at the airport. 'A'=US/Canada, 'N'=None.

Timezone2

name of the timezone of the airport
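Because some entries are NAs, it can help to filter the data frame before using the coordinates. The sketch below works on a toy data frame that has only a few of the columns listed above. All values are made up; with the real data, flights$airports would take the place of the toy object airports, and the column AltitudeMetres is a hypothetical addition.

```r
# Toy stand-in for flights$airports with a subset of its columns.
# All values are invented for illustration only.
airports <- data.frame(
  IATA = c("AAA", "BBB", "CCC"),
  Country = c("United States", "United States", "United States"),
  Latitude = c(40.0, NA, 18.4),
  Longitude = c(-75.0, NA, -66.0),
  Altitude = c(100, 2000, 10),
  stringsAsFactors = FALSE
)

# Keep only airports whose coordinates are known
hasCoords <- !is.na(airports$Latitude) & !is.na(airports$Longitude)
located <- airports[hasCoords, ]

# Altitude is given in feet; convert to metres (1 ft = 0.3048 m)
located$AltitudeMetres <- located$Altitude * 0.3048
```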

Source

Raw delays data:

Fields/Forms used in the raw data:

Airports (includes license information):

References

Hentschel M, Engelke S, Segers J (2022). “Statistical Inference for Hüsler-Reiss Graphical Models Through Matrix Completions.” doi:10.48550/ARXIV.2210.14292, https://arxiv.org/abs/2210.14292.

See Also

Other flight data related topics: flightCountMatrixToConnectionList(), getFlightDelayData(), getFlightGraph(), plotFlights()

Other datasets: danube

Examples

# Get total number of flights in the dataset:
totalFlightCounts <- apply(flights$flightCounts, c(1,2), sum)

# Get number of flights for specific years in the dataset:
flightCounts_10_11 <- apply(flights$flightCounts[,,c('2010', '2011')], c(1,2), sum)

# Get list of connections from 2010:
connections_10 <- flightCountMatrixToConnectionList(flights$flightCounts[,,'2010'])


[Package graphicalExtremes version 0.3.2 Index]