tradesCleanupUsingQuotes {highfrequency} | R Documentation |
Perform a final cleaning procedure on trade data
Description
Applies the cleaning procedure rmTradeOutliersUsingQuotes to the trades of all stocks in "dataDestination".
The input should preferably be trade and quote data that have already been cleaned, e.g. by tradesCleanup and quotesCleanup, respectively.
Usage
tradesCleanupUsingQuotes(
tradeDataSource = NULL,
quoteDataSource = NULL,
dataDestination = NULL,
tData = NULL,
qData = NULL,
lagQuotes = 0,
nSpreads = 1,
BFM = FALSE,
backwardsWindow = 3600,
forwardsWindow = 0.5,
plot = FALSE
)
Arguments
tradeDataSource |
character indicating the folder in which the original trade data is stored. |
quoteDataSource |
character indicating the folder in which the original quote data is stored. |
dataDestination |
character indicating the folder in which the cleaned data is stored, folder of |
tData |
|
qData |
|
lagQuotes |
numeric, the number of seconds by which quotes are registered faster than trades (should be a non-negative whole number). Default is 0. For older datasets, i.e. before 2010, it may be a good idea to set this to, e.g., 2 (see Vergote, 2005). |
nSpreads |
numeric of length 1 denoting how far above the offer and below the bid outliers are allowed to be. Trades are filtered out if they are more than nSpreads * spread above the offer or below the bid. |
BFM |
a logical determining whether to conduct "Backwards-Forwards matching" of trades and quotes. For trades that fall outside the bid-ask range, the algorithm first tries to find a match in a small forwards window and, if that fails, tries to match backwards in a larger window. The forwards window is a tolerance for inaccuracies in the timestamps of bids and asks. The backwards window allows late-reported trades, i.e. block trades, to be matched. |
backwardsWindow |
a numeric denoting the length of the backwards window used when |
forwardsWindow |
a numeric denoting the length of the forwards window used when |
plot |
a logical denoting whether to visualize the forwards, backwards, and unmatched trades in a plot. Passed on to |
Details
In case you supply the arguments tData and qData, the on-disk functionality is ignored and the function returns the cleaned trades as a data.table or xts object (see examples).
When using the on-disk functionality with tradeDataSource and quoteDataSource pointing to the same folder, all files in that folder whose names contain 'quote' are treated as quote files, and the remaining files are treated as trade data.
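The file-splitting rule above can be sketched in base R. This is only an illustration of the stated rule, not the package's internal code: the file names are hypothetical, and matching on the literal substring "quote" via grepl is an assumption about how the rule is applied.

```r
# Hypothetical file names in a shared source folder (made up for illustration).
files <- c("AAPL_quotes.rds", "AAPL_trades.rds",
           "MSFT_quotes.rds", "MSFT_trades.rds")

# Assumption: a file counts as a quote file when its name contains "quote".
isQuoteFile <- grepl("quote", files, fixed = TRUE)

quoteFiles <- files[isQuoteFile]   # treated as quote data
tradeFiles <- files[!isQuoteFile]  # everything else is treated as trade data
```

Under this rule a trade file whose name happens to contain 'quote' would be read as quote data, so distinct naming schemes or separate folders are the safer choice.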
Value
For each day an xts object is saved into the folder of that date, containing the cleaned data.
Author(s)
Jonathan Cornelissen, Kris Boudt, Onno Kleen, and Emil Sjoerup.
References
Barndorff-Nielsen, O. E., Hansen, P. R., Lunde, A., and Shephard, N. (2009). Realized kernels in practice: Trades and quotes. Econometrics Journal, 12, C1-C32.
Brownlees, C.T., and Gallo, G.M. (2006). Financial econometric analysis at ultra-high frequency: Data handling concerns. Computational Statistics & Data Analysis, 51, 2232-2245.
Christensen, K., Oomen, R. C. A., and Podolskij, M. (2014). Fact or friction: Jumps at ultra high frequency. Journal of Financial Economics, 114, 576-599.
Examples
# Suppose you have raw trade data for 1 stock for 2 days
## Not run:
# First cleaning pass on the raw trades and quotes (exchange code "N" only)
tDataAfterFirstCleaning <- tradesCleanup(tDataRaw = sampleTDataRaw,
                                         exchanges = "N", report = FALSE)
qData <- quotesCleanup(qDataRaw = sampleQDataRaw,
                       exchanges = "N", report = FALSE)
dim(tDataAfterFirstCleaning)

# Final cleaning of the trades of a single day, using the quotes of that day
tDataAfterFinalCleaning <-
  tradesCleanupUsingQuotes(qData = qData[as.Date(DT) == "2018-01-02"],
                           tData = tDataAfterFirstCleaning[as.Date(DT) == "2018-01-02"])
dim(tDataAfterFinalCleaning)
## End(Not run)
# In case you have more data it is advised to use the on-disk functionality
# via the "tradeDataSource", "quoteDataSource", and "dataDestination" arguments
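The nSpreads filter described under Arguments can be illustrated with a small base R sketch. It is not the package's implementation: the data are made up, each trade is assumed to be already paired with a quote, and the spread is assumed to be the spread of that matched quote (the package may compute it differently).

```r
# Toy data, made up for illustration: each trade already has a matched quote.
trades <- data.frame(
  price = c(100.02, 100.08, 100.30, 99.70),
  bid   = c(100.00, 100.00, 100.00, 100.00),
  offer = c(100.05, 100.05, 100.05, 100.05)
)
nSpreads <- 1

# Assumption: the spread is taken from the quote matched to each trade.
spread <- trades$offer - trades$bid

# Tolerance band: nSpreads spreads above the offer and below the bid.
upper <- trades$offer + nSpreads * spread
lower <- trades$bid - nSpreads * spread

# Keep trades inside the band; drop those more than nSpreads * spread outside.
keep <- trades$price >= lower & trades$price <= upper
cleaned <- trades[keep, ]
```

The second trade lies above the offer but within one spread of it, so it is retained; the last two lie far outside the band and are removed. Raising nSpreads widens the band and makes the filter more permissive.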