read_functions {RITCH}

R Documentation

Reads certain messages of an ITCH-file into a data.table

Description

For faster file reads (at the cost of increased memory usage), you can increase buffer_size to 1 GB (1e9) or more.

If you access the same file multiple times, you can pass the message counts returned by count_messages() to the n_max argument. This skips one pass over the file per read instruction.

If you need to read in multiple message classes, you can specify multiple message classes to read_itch, which results in only a single file pass.

If the file is too large to be loaded into the workspace at once, you can specify different skip and n_max to load only a specific range of messages. Alternatively, you can filter certain messages to another file using filter_itch(), which is substantially faster than parsing a file and filtering it.
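The skip/n_max windowing described above can be sketched as a small loop. The helper `read_in_chunks()`, its parameters, and the chunk sizes are hypothetical and not part of RITCH; the reader is passed in as a function and is assumed to accept `skip`, `n_max`, and `quiet` the way `read_orders()` does.

```r
# Hypothetical helper (not part of RITCH): applies `fun` to successive
# windows of at most `chunk_size` messages, using skip and n_max.
read_in_chunks <- function(reader, file, n_total, chunk_size, fun) {
  if (n_total <= 0) return(list())
  starts <- seq(0, n_total - 1, by = chunk_size)
  lapply(starts, function(s) {
    n <- min(chunk_size, n_total - s)  # the last window may be shorter
    fun(reader(file, skip = s, n_max = n, quiet = TRUE))
  })
}

# Usage sketch: process the first 12 order messages in windows of 5.
if (requireNamespace("RITCH", quietly = TRUE)) {
  file <- system.file("extdata", "ex20101224.TEST_ITCH_50", package = "RITCH")
  sizes <- read_in_chunks(RITCH::read_orders, file,
                          n_total = 12, chunk_size = 5, fun = nrow)
}
```

Only the result of `fun` is kept for each window, so the full set of messages never has to be held in memory at once.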

Note that all read functions accept both plain ITCH files and gzipped files. If a gzipped file is given, RITCH looks for a plain ITCH file with the same name and uses that instead. If no such file is found, it is created by unzipping the archive into gz_dir (tempdir() by default, see Usage). With the default force_cleanup = TRUE, the unzipped file is deleted after reading. Set force_cleanup = FALSE to keep it, which increases disk usage but reduces future read times for that specific file. Cleanup only deletes files that were extracted by the call itself and never removes the archive.
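As a rough sketch of the gz-related arguments, the following keeps the extracted file in a dedicated directory so that a later read can reuse it. The directory name `itch_unzipped` is a hypothetical choice, and the reuse on the second call is assumed from the description above rather than verified here.

```r
# Hypothetical location for extracted files (any writable directory works)
keep_dir <- file.path(tempdir(), "itch_unzipped")
dir.create(keep_dir, showWarnings = FALSE)

if (requireNamespace("RITCH", quietly = TRUE)) {
  gz_file <- system.file("extdata", "ex20101224.TEST_ITCH_50.gz",
                         package = "RITCH")
  # First read: extract into keep_dir and keep the plain file afterwards
  od <- RITCH::read_orders(gz_file, gz_dir = keep_dir,
                           force_cleanup = FALSE, quiet = TRUE)
  # Second read: the plain file in keep_dir is assumed to be picked up,
  # so the archive does not need to be unzipped again
  od2 <- RITCH::read_orders(gz_file, gz_dir = keep_dir,
                            force_cleanup = FALSE, quiet = TRUE)
}
```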

Usage

read_itch(
  file,
  filter_msg_class = NA,
  skip = 0,
  n_max = -1,
  filter_msg_type = NA_character_,
  filter_stock_locate = NA_integer_,
  min_timestamp = bit64::as.integer64(NA),
  max_timestamp = bit64::as.integer64(NA),
  filter_stock = NA_character_,
  stock_directory = NA,
  buffer_size = -1,
  quiet = FALSE,
  add_meta = TRUE,
  force_gunzip = FALSE,
  gz_dir = tempdir(),
  force_cleanup = TRUE
)

read_system_events(file, ..., add_descriptions = FALSE)

read_stock_directory(file, ..., add_descriptions = FALSE)

read_trading_status(file, ..., add_descriptions = FALSE)

read_reg_sho(file, ..., add_descriptions = FALSE)

read_market_participant_states(file, ..., add_descriptions = FALSE)

read_mwcb(file, ...)

read_ipo(file, ..., add_descriptions = FALSE)

read_luld(file, ...)

read_orders(file, ...)

read_modifications(file, ...)

read_trades(file, ...)

read_noii(file, ..., add_descriptions = FALSE)

read_rpii(file, ..., add_descriptions = FALSE)

get_orders(file, ...)

get_trades(file, ...)

get_modifications(file, ...)

Arguments

`file`	the path to the input file, either a gz-archive or a plain ITCH file
`filter_msg_class`	a vector of classes to load, e.g. "orders", "trades", "modifications", ...; see also `get_msg_classes()`. By default, all message classes are read.
`skip`	Number of messages to skip before parsing starts. Note that skip applies to each message class separately, e.g., skip = 10 skips the first 10 messages of every class.
`n_max`	Maximum number of messages to parse, default is to read all messages. Can also be a data.frame of msg_types and counts, as returned by `count_messages()`. Note that n_max applies to each message class separately, not to the whole file.
`filter_msg_type`	a character vector, specifying a filter for message types. Note that this can be used to only return 'A' orders for instance.
`filter_stock_locate`	an integer vector, specifying a filter for locate codes. The locate codes can be looked up by calling `read_stock_directory()` or by downloading from NASDAQ by using `download_stock_directory()`. Note that some message types (e.g., system events, MWCB, and IPO) do not use a locate code.
`min_timestamp`	a 64-bit integer vector (see also `bit64::as.integer64()`) of minimum timestamps (inclusive). Note: min and max timestamp must be supplied with the same length or left empty.
`max_timestamp`	a 64-bit integer vector (see also `bit64::as.integer64()`) of maximum timestamps (inclusive). Note: min and max timestamp must be supplied with the same length or left empty.
`filter_stock`	a character vector, specifying a filter for stocks. Note that this is a shorthand for the `filter_stock_locate` argument: the stock_locate codes are looked up in the `stock_directory` argument. If that is not supplied, the stock directory is extracted from the file; if that fails as well, an error is thrown.
`stock_directory`	A data.frame containing the stock-locate code relationship, as returned by `read_stock_directory()`. Only used if `filter_stock` is set. To download the stock directory from NASDAQ's server, use `download_stock_directory()`.
`buffer_size`	the size of the buffer in bytes, defaults to 1e8 (100 MB). If you have a large amount of RAM, 1e9 (1 GB) might be faster.
`quiet`	if TRUE, the status messages are suppressed, defaults to FALSE
`add_meta`	if TRUE, the date and exchange information of the file are added, defaults to TRUE
`force_gunzip`	only applies if the input file is a gz-archive and a file with the same (gunzipped) name already exists. If set to TRUE, the existing file is overwritten. Default value is FALSE.
`gz_dir`	a directory where the gz archive is extracted to. Only applies if file is a gz archive. Default is `tempdir()`.
`force_cleanup`	only applies if the input file is a gz-archive. If force_cleanup=TRUE, the gunzipped raw file will be deleted afterwards. Only applies when the gunzipped raw file did not exist before.
`...`	Additional arguments passed to `read_itch`
`add_descriptions`	add longer descriptions to shortened variables. The added information is taken from the official ITCH documentation; see also `open_itch_specification()`.
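The filter arguments can be combined in a single call. The sketch below assumes that timestamps are expressed as nanoseconds since midnight, as in the ITCH 5.0 specification; the helper `clock_to_ns()` is hypothetical and not part of RITCH.

```r
# Hypothetical helper (not part of RITCH): converts a clock time to
# nanoseconds since midnight.
clock_to_ns <- function(hours, minutes = 0, seconds = 0) {
  (hours * 3600 + minutes * 60 + seconds) * 1e9
}

from_ns <- clock_to_ns(9, 30)  # 09:30:00
to_ns   <- clock_to_ns(16)     # 16:00:00

if (requireNamespace("RITCH", quietly = TRUE) &&
    requireNamespace("bit64", quietly = TRUE)) {
  file <- system.file("extdata", "ex20101224.TEST_ITCH_50", package = "RITCH")
  # Only 'A' order messages between 09:30 and 16:00 (both bounds inclusive)
  od <- RITCH::read_orders(
    file,
    filter_msg_type = "A",
    min_timestamp = bit64::as.integer64(from_ns),
    max_timestamp = bit64::as.integer64(to_ns),
    quiet = TRUE
  )
}
```

Both timestamp vectors have length one here, which satisfies the requirement that min and max timestamps have the same length.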

Details

The details of the different message types can be found in the official ITCH specification (see also open_itch_specification()).

read_itch: Reads the messages of one message class; can also read multiple classes in one file pass.

read_system_events: Reads system event messages. Message type S

read_stock_directory: Reads stock directory messages. Message type R

read_trading_status: Reads trading status messages. Message type H and h

read_reg_sho: Reads messages regarding reg SHO. Message type Y

read_market_participant_states: Reads messages regarding the status of market participants. Message type L

read_mwcb: Reads messages regarding Market-Wide-Circuit-Breakers (MWCB). Message type V and W

read_ipo: Reads messages regarding IPOs. Message type K

read_luld: Reads messages regarding LULD (limit up-limit down) auction collars. Message type J

read_orders: Reads order messages. Message type A and F

read_modifications: Reads order modification messages. Message type E, C, X, D, and U

read_trades: Reads trade messages. Message type P, Q and B

read_noii: Reads Net Order Imbalance Indicator (NOII) messages. Message type I

read_rpii: Reads Retail Price Improvement Indicator (RPII) messages. Message type N

For backwards compatibility, the following functions are provided as well:

get_orders: Redirects to read_orders

get_trades: Redirects to read_trades

get_modifications: Redirects to read_modifications
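The mapping between message types and reader functions listed above can be written down as a lookup table, which is handy when deciding which reader to call for a given `filter_msg_type`. The names `reader_msg_types` and `reader_for()` are hypothetical and not part of RITCH; the type letters are transcribed from the list above.

```r
# Message types handled by each reader, transcribed from the Details list
reader_msg_types <- list(
  read_system_events = "S",
  read_stock_directory = "R",
  read_trading_status = c("H", "h"),
  read_reg_sho = "Y",
  read_market_participant_states = "L",
  read_mwcb = c("V", "W"),
  read_ipo = "K",
  read_luld = "J",
  read_orders = c("A", "F"),
  read_modifications = c("E", "C", "X", "D", "U"),
  read_trades = c("P", "Q", "B"),
  read_noii = "I",
  read_rpii = "N"
)

# Hypothetical helper: name of the reader that covers a message type.
# Matching is case-sensitive, so "H" and "h" are distinct types.
reader_for <- function(msg_type) {
  hit <- vapply(reader_msg_types,
                function(types) msg_type %in% types, logical(1))
  names(reader_msg_types)[hit]
}

reader_for("U")  # "read_modifications"
reader_for("h")  # "read_trading_status"
```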

Value

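A data.table containing the messages. When several message classes are requested from read_itch, the result is a list with one data.table per class (compare the str(ll, max.level = 1) call in the Examples).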

References

https://www.nasdaqtrader.com/content/technicalsupport/specifications/dataproducts/NQTVITCHspecification.pdf

Examples


file <- system.file("extdata", "ex20101224.TEST_ITCH_50", package = "RITCH")
od <- read_orders(file, quiet = FALSE) # note quiet = FALSE is the default
tr <- read_trades(file, quiet = TRUE)

## Alternatively
od <- read_itch(file, "orders", quiet = TRUE)

ll <- read_itch(file, c("orders", "trades"), quiet = TRUE)

od
tr
str(ll, max.level = 1)

## additional options:

# take only subset of messages
od <- read_orders(file, skip = 3, n_max = 10)

# a message count can be provided for slightly faster reads
msg_count <- count_messages(file, quiet = TRUE)
od <- read_orders(file, n_max = msg_count)

## .gz archive functionality
# .gz archives will be automatically unzipped
gz_file <- system.file("extdata", "ex20101224.TEST_ITCH_50.gz", package = "RITCH")
od <- read_orders(gz_file)
# force a decompress and delete the decompressed file afterwards
od <- read_orders(gz_file, force_gunzip = TRUE, force_cleanup = TRUE)

## read_itch()
otm <- read_itch(file, c("orders", "trades"), quiet = TRUE)
str(otm, max.level = 1)

## read_system_events()
se <- read_system_events(file, add_descriptions = TRUE, quiet = TRUE)
se

## read_stock_directory()
sd <- read_stock_directory(file, add_descriptions = TRUE, quiet = TRUE)
sd

## read_trading_status()
ts <- read_trading_status(file, add_descriptions = TRUE, quiet = TRUE)
ts

## read_reg_sho()
## Not run: 
# note the example file has no reg SHO messages
rs <- read_reg_sho(file, add_descriptions = TRUE, quiet = TRUE)
rs

## End(Not run)

## read_market_participant_states()
## Not run: 
# note the example file has no market participant states
mps <- read_market_participant_states(file, add_descriptions = TRUE,
                                      quiet = TRUE)
mps

## End(Not run)

## read_mwcb()
## Not run: 
# note the example file has no circuit breakers messages
mwcb <- read_mwcb(file, quiet = TRUE)
mwcb

## End(Not run)

## read_ipo()
## Not run: 
# note the example file has no IPOs
ipo <- read_ipo(file, add_descriptions = TRUE, quiet = TRUE)
ipo

## End(Not run)

## read_luld()
## Not run: 
# note the example file has no LULD messages
luld <- read_luld(file, quiet = TRUE)
luld

## End(Not run)

## read_orders()
od <- read_orders(file, quiet = TRUE)
od

## read_modifications()
mod <- read_modifications(file, quiet = TRUE)
mod

## read_trades()
tr <- read_trades(file, quiet = TRUE)
tr

## read_noii()
## Not run: 
# note the example file has no NOII messages
noii <- read_noii(file, add_descriptions = TRUE, quiet = TRUE)
noii

## End(Not run)

## read_rpii()
## Not run: 
# note the example file has no RPII messages
rpii <- read_rpii(file, add_descriptions = TRUE, quiet = TRUE)
rpii

## End(Not run)

[Package RITCH version 0.1.26 Index]