flights {graphicalExtremes} | R Documentation
Flights delay data
Description
A dataset containing daily total delays of major U.S. airlines. The raw data was obtained from the U.S. Bureau of Transportation Statistics, and pre-processed as described in Hentschel et al. (2022). Note: The CRAN version of this package contains only data from 2010-2013. The full dataset is available in the GitHub version of this package.
Usage
flights
Format
A named list with three entries:
airports
A data.frame, containing information about US airports
delays
A numeric array, containing daily aggregated delays at the airports in the dataset
flightCounts
A numeric array, containing yearly flight numbers between airports in the dataset
Details
flightCounts is a three-dimensional array, containing the number of flights in the dataset
between each pair of airports, aggregated on a yearly basis.
Each entry is the total number of flights from the departure airport (row)
to the destination airport (column) in a given year (dimension 3).
This array does not contain any NAs. If an airport did not operate
at all in a given year, this is indicated by zeros.
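The indexing convention described above can be sketched on a small stand-in array. The airport codes ("AAA", "BBB", "CCC") and all counts below are made up for illustration; only the layout (departure x destination x year) follows the description. With the package loaded, the same calls should apply to flights$flightCounts.

```r
# Toy stand-in for flights$flightCounts: departure x destination x year.
# Airport codes and counts are hypothetical.
codes <- c("AAA", "BBB", "CCC")
years <- c("2010", "2011")
counts <- array(
  0,
  dim = c(3, 3, 2),
  dimnames = list(codes, codes, years)
)
counts["AAA", "BBB", "2010"] <- 120
counts["BBB", "AAA", "2010"] <- 115
counts["AAA", "CCC", "2011"] <- 40

# Flights from AAA (row) to BBB (column) in 2010 (dimension 3):
counts["AAA", "BBB", "2010"]

# Departures per airport and year: sum over destinations (dimension 2).
departures <- apply(counts, c(1, 3), sum)

# Arrivals per airport and year: sum over departure airports (dimension 1).
arrivals <- apply(counts, c(2, 3), sum)

# Airports without any traffic in a year appear as zeros, never as NA.
inactive <- (departures + arrivals) == 0
```

Because inactivity is coded as zero, sums over any dimension need no na.rm argument.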
delays is a three-dimensional array containing daily total positive delays,
in minutes, of incoming and outgoing flights.
Each row corresponds to a day and each column corresponds to an airport
in the dataset. The third dimension has length two: 'arrivals' contains
delays of incoming flights and 'departures' contains delays of outgoing flights.
Zeros indicate that there were flights arriving/departing at that airport
on a given day, but none of them had delays. NAs indicate that there were
no flights arriving/departing at that airport on that day at all.
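The distinction between zeros and NAs can be illustrated on a small stand-in array. The values, airport codes, and date-style row names below are hypothetical; only the dimension order (day x airport x direction) and the names 'arrivals' and 'departures' come from the description above.

```r
# Toy stand-in for flights$delays: day x airport x direction.
# All values, codes, and row names are made up for illustration.
delays <- array(
  c(30, 0, NA, 12, 5, 0,     # arrivals: AAA on days 1-3, then BBB
    45, 0, NA, NA, 8, 20),   # departures: AAA on days 1-3, then BBB
  dim = c(3, 2, 2),
  dimnames = list(
    c("2010-01-01", "2010-01-02", "2010-01-03"),
    c("AAA", "BBB"),
    c("arrivals", "departures")
  )
)

arr <- delays[, , "arrivals"]

# NA: no flights arrived at that airport on that day.
noFlights <- is.na(arr)

# Zero: flights arrived, but none of them was delayed.
noDelays <- !is.na(arr) & arr == 0

# Days on which every airport has an observed arrival delay.
completeDays <- rowSums(is.na(arr)) == 0
```

Note that summing with na.rm = TRUE would turn "no flights" into a zero delay and erase this distinction, so it is usually safer to handle NAs explicitly.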
airports is a data frame containing the following information about a number of US airports.
Some entries are missing, which is indicated by NAs.
IATA
3-letter IATA code
Name
name of the airport
City
main city served by the airport
Country
country or territory where the airport is located (mostly "United States")
ICAO
4-letter ICAO code
Latitude
latitude of the airport, in decimal degrees
Longitude
longitude of the airport, in decimal degrees
Altitude
altitude of the airport, in feet
Timezone
timezone of the airport, in hours offset from UTC
DST
daylight saving time used at the airport: 'A' = US/Canada, 'N' = none
Timezone2
name of the timezone of the airport
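As a minimal sketch of working with these columns, the following uses a small stand-in data frame. The rows are invented and only a subset of the columns listed above is included; with the package loaded, flights$airports would take its place.

```r
# Toy stand-in for flights$airports with a subset of the documented columns.
# All rows are hypothetical.
airports <- data.frame(
  IATA = c("AAA", "BBB", "CCC"),
  Latitude = c(40.0, 34.5, NA),
  Longitude = c(-75.0, -112.0, NA),
  Altitude = c(100, 1135, 20),   # feet
  Timezone = c(-5, -7, -10),     # hours offset from UTC
  DST = c("A", "N", "N"),
  stringsAsFactors = FALSE
)

# Convert altitude from feet to metres (1 ft = 0.3048 m).
airports$AltitudeM <- airports$Altitude * 0.3048

# Keep only airports whose coordinates are known (missing entries are NA).
located <- airports[!is.na(airports$Latitude) & !is.na(airports$Longitude), ]

# IATA codes of airports that do not observe daylight saving time.
noDst <- airports$IATA[airports$DST == "N"]
```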
Source
Raw delays data:
Fields/Forms used in the raw data:
Airports (includes license information):
References
Hentschel M, Engelke S, Segers J (2022). “Statistical Inference for Hüsler-Reiss Graphical Models Through Matrix Completions.” doi:10.48550/ARXIV.2210.14292, https://arxiv.org/abs/2210.14292.
See Also
Other flight data related topics:
flightCountMatrixToConnectionList(), getFlightDelayData(), getFlightGraph(), plotFlights()
Other datasets:
danube
Examples
# Get total number of flights in the dataset:
totalFlightCounts <- apply(flights$flightCounts, c(1,2), sum)
# Get number of flights for specific years in the dataset:
flightCounts_10_11 <- apply(flights$flightCounts[,,c('2010', '2011')], c(1,2), sum)
# Get list of connections from 2010:
connections_10 <- flightCountMatrixToConnectionList(flights$flightCounts[,,'2010'])