adjpin {PINstimation}

R Documentation

Estimation of adjusted PIN model

Description

Estimates the Adjusted Probability of Informed Trading (AdjPIN) as well as the Probability of Symmetric Order-flow Shock (PSOS) from the AdjPIN model of Duarte and Young (2009).

Usage

adjpin(data, method = "ECM", initialsets = "GE", num_init = 20,
              restricted = list(), ..., verbose = TRUE)

Arguments

`data`	A dataframe with 2 variables: the first corresponds to buyer-initiated trades (buys), and the second corresponds to seller-initiated trades (sells).
`method`	A character string referring to the method used to estimate the model of Duarte and Young (2009). It takes one of two values: `"ML"` refers to the standard maximum likelihood estimation, and `"ECM"` refers to the expectation-conditional maximization algorithm. The default value is `"ECM"`. Details of the ECM method, and comparative results can be found in Ghachem and Ersan (2022a), and in Ghachem and Ersan (2022b).
`initialsets`	It can either be a character string referring to prebuilt algorithms generating initial parameter sets or a dataframe containing custom initial parameter sets. If `initialsets` is a character string, it refers to the method of generation of the initial parameter sets, and takes one of three values: `"GE"`, `"CL"`, or `"RANDOM"`. `"GE"` refers to initial parameter sets generated by the algorithm of Ersan and Ghachem (2022b), and implemented in `initials_adjpin()`, `"CL"` refers to initial parameter sets generated by the algorithm of Cheng and Lai (2021), and implemented in `initials_adjpin_cl()`, while `"RANDOM"` generates random initial parameter sets as implemented in `initials_adjpin_rnd()`. The default value is `"GE"`. If `initialsets` is a dataframe, the function `adjpin()` will estimate the AdjPIN model using the provided initial parameter sets.
`num_init`	An integer specifying the maximum number of initial parameter sets to be used in the estimation. If `initialsets="GE"`, the generation of initial parameter sets stops when their number reaches `num_init`. It can stop earlier if the number of all possible generated initial parameter sets is lower than `num_init`. If `initialsets="RANDOM"`, exactly `num_init` initial parameter sets are returned. If `initialsets="CL"`, then `num_init` is ignored, and all `256` initial parameter sets are used. The default value is `20`. Note: the argument `num_init` is ignored when the argument `initialsets` is a dataframe.
`restricted`	A binary list that allows estimating restricted AdjPIN models by specifying which model parameters are assumed to be equal. It contains one or more of the following four elements: `{theta, mu, eps, d}`. For instance, if `theta` is set to `TRUE`, then the probability of liquidity shock is assumed to be the same in no-information days and in information days (`\theta=\theta'`). If any of the remaining rate elements `{mu, eps, d}` is set to `TRUE` (say `mu=TRUE`), then the rate is assumed to be the same on the buy side and on the sell side (`\mu_b=\mu_s`). If more than one element is set to `TRUE`, then the restrictions are combined. For instance, if the argument `restricted` is set to `list(theta=TRUE, eps=TRUE, d=TRUE)`, then the restricted AdjPIN model is estimated, where `\theta=\theta'`, `\epsilon_b=\epsilon_s`, and `\Delta_b=\Delta_s`. If the value of the argument `restricted` is the empty list (`list()`), then all parameters of the model are assumed to be independent, and the unrestricted model is estimated. The default value is the empty list `list()`.
`...`	Additional arguments passed on to the function `adjpin()`. The recognized arguments are `hyperparams` and `fact`. The argument `hyperparams` is a list containing the hyperparameters of the `ECM` algorithm. When not empty, it contains one or both of the following elements: `maxeval` and `tolerance`. It is used only when the `method` argument is set to `"ECM"`. The argument `fact` is a binary value that determines which functional form of the likelihood is used: the factorization of the likelihood function by Ersan and Ghachem (2022b) when it is set to `TRUE`; otherwise, the original likelihood function of Duarte and Young (2009). The default value is `TRUE`. More about these arguments can be found in the Details section.
`verbose`	A binary variable that determines whether detailed information about the steps of the estimation of the AdjPIN model is displayed. No output is produced when `verbose` is set to `FALSE`. The default value is `TRUE`.

Details

The argument data should be a numeric dataframe containing at least two variables. Only the first two variables are considered: the first variable is assumed to correspond to the total number of buyer-initiated trades, while the second variable is assumed to correspond to the total number of seller-initiated trades. Each row (observation) corresponds to a trading day. NA values are ignored.
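A minimal sketch of the expected input shape, using base R only. The column names and values are hypothetical, and the explicit NA filter is an assumption made here to keep the number of trading days visible; adjpin() is documented to ignore NA values on its own.

```r
# Hypothetical raw table: daily trade counts plus an unrelated third column.
raw <- data.frame(
  buys  = c(120, 95, NA, 210),
  sells = c(80, 130, 60, 190),
  venue = c("A", "B", "A", "B")
)

# Only the first two columns are considered: buys first, sells second.
xdata <- raw[, 1:2]

# Drop trading days with a missing count (done by hand in this sketch).
xdata <- xdata[complete.cases(xdata), ]

# Both remaining columns should be numeric before estimation.
stopifnot(all(sapply(xdata, is.numeric)))
```

Reordering the columns beforehand matters more than naming them, since the function relies on position rather than on column names.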

If initialsets is neither a dataframe nor a character string from the set {"GE", "CL", "RANDOM"}, the estimation of the AdjPIN model is aborted. The default initial parameter sets ("GE") for the estimation method are generated using a modified hierarchical agglomerative clustering. For more information, see initials_adjpin().
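The two ways of supplying initial parameter sets might be used as in the sketch below. It assumes that initials_adjpin() accepts the data as its first argument and returns a dataframe that adjpin() accepts as initialsets; the seed and the value of num_init are illustrative choices only.

```r
library(PINstimation)

# Simulated data, as in the Examples section of this page.
set.seed(1)  # illustrative: makes the simulated data repeatable
sim_data <- generatedata_adjpin(days = 60)
xdata <- sim_data@data

# Prebuilt generators selected by name; num_init caps the number of sets.
est_ge  <- adjpin(xdata, initialsets = "GE", num_init = 5, verbose = FALSE)
est_rnd <- adjpin(xdata, initialsets = "RANDOM", num_init = 5,
                  verbose = FALSE)

# Custom initial sets passed as a dataframe (assumed call signature);
# num_init is ignored in this case.
init_sets <- initials_adjpin(xdata)
est_custom <- adjpin(xdata, initialsets = init_sets, verbose = FALSE)
```

Passing a dataframe is useful when the same initial sets should be reused across several estimations, for instance to compare "ML" and "ECM" from identical starting points.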

The argument hyperparams contains the hyperparameters of the ECM algorithm. It is either empty or contains one or two of the following elements:

maxeval: (integer) The maximum number of iterations of the ECM algorithm for each initial parameter set. When missing, maxeval takes the default value of 100.
tolerance: (numeric) The ECM algorithm stops when the (relative) change in log-likelihood is smaller than tolerance. When missing, tolerance takes the default value of 0.001.
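A short sketch of how these extra arguments could be passed through the `...` argument. The specific values of maxeval, tolerance, and num_init are arbitrary choices for illustration, not recommendations.

```r
library(PINstimation)

sim_data <- generatedata_adjpin(days = 60)
xdata <- sim_data@data

# ECM with more iterations and a tighter stopping rule than the
# defaults (maxeval = 100, tolerance = 0.001).
est_ecm <- adjpin(xdata, method = "ECM", num_init = 5,
                  hyperparams = list(maxeval = 200, tolerance = 1e-4),
                  verbose = FALSE)

# hyperparams may hold a single element; maxeval keeps its default here.
est_tol <- adjpin(xdata, num_init = 5,
                  hyperparams = list(tolerance = 1e-5),
                  verbose = FALSE)

# Standard maximum likelihood with the original likelihood function of
# Duarte and Young (2009) instead of the factorized form.
est_ml <- adjpin(xdata, method = "ML", num_init = 5, fact = FALSE,
                 verbose = FALSE)
```

Since hyperparams is used only when method is "ECM", it is left out of the "ML" call.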

Value

Returns an object of class estimate.adjpin.
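As a brief illustration, the returned S4 object can be inspected as sketched below. Only the slots adjpin and psos, which appear in the Examples section of this page, are accessed; no other slot names are assumed.

```r
library(PINstimation)

sim_data <- generatedata_adjpin(days = 60)
est <- adjpin(sim_data@data, verbose = FALSE)

# Check the class of the returned object.
methods::is(est, "estimate.adjpin")

# Extract the two headline estimates.
est@adjpin   # adjusted probability of informed trading
est@psos     # probability of symmetric order-flow shock

# Print the full estimation output.
show(est)
```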

References

Cheng T, Lai H (2021). “Improvements in estimating the probability of informed trading models.” Quantitative Finance, 21(5), 771–796.

Duarte J, Young L (2009). “Why is PIN priced?” Journal of Financial Economics, 91(2), 119–138.

Ersan O, Ghachem M (2022b). “A methodological approach to the computational problems in the estimation of adjusted PIN model.” Available at SSRN 4117954.

Ghachem M, Ersan O (2022a). “Estimation of the probability of informed trading models via an expectation-conditional maximization algorithm.” Available at SSRN 4117952.

Ghachem M, Ersan O (2022b). “PINstimation: An R package for estimating models of probability of informed trading.” Available at SSRN 4117946.

Examples

# We use 'generatedata_adjpin()' to generate an S4 object of type 'dataset'
# with 60 observations.

sim_data <- generatedata_adjpin(days = 60)

# The actual dataset of 60 observations is stored in the slot 'data' of the
# S4 object 'sim_data'. Each observation corresponds to a day and contains
# the total number of buyer-initiated transactions ('B') and seller-
# initiated transactions ('S') on that day.

xdata <- sim_data@data

# ------------------------------------------------------------------------ #
# Compare the unrestricted AdjPIN model with various restricted models     #
# ------------------------------------------------------------------------ #

# Estimate the unrestricted AdjPIN model using the ECM algorithm (default),
# and show the estimation output

estimate.adjpin.0 <- adjpin(xdata, verbose = FALSE)

show(estimate.adjpin.0)

# Estimate the restricted AdjPIN model where mu.b=mu.s

estimate.adjpin.1 <- adjpin(xdata, restricted = list(mu = TRUE),
                                  verbose = FALSE)

# Estimate the restricted AdjPIN model where eps.b=eps.s

estimate.adjpin.2 <- adjpin(xdata, restricted = list(eps = TRUE),
                                  verbose = FALSE)

# Estimate the restricted AdjPIN model where d.b=d.s

estimate.adjpin.3 <- adjpin(xdata, restricted = list(d = TRUE),
                                  verbose = FALSE)

# Compare the different values of adjusted PIN

estimates <- list(estimate.adjpin.0, estimate.adjpin.1,
                  estimate.adjpin.2, estimate.adjpin.3)

adjpins <- sapply(estimates, function(x) x@adjpin)

psos <- sapply(estimates, function(x) x@psos)

# Named 'comparison' to avoid masking the base function 'summary()'
comparison <- cbind(adjpins, psos)
rownames(comparison) <- c("unrestricted", "same.mu", "same.eps", "same.d")

show(round(comparison, 5))

[Package PINstimation version 0.1.2 Index]