ycevo {ycevo} R Documentation

Estimate yield function

Description

Nonparametric estimation of discount functions and yield curves at given dates, time-to-maturities, and one additional covariate, usually interest rate.
Usage

  span_x = 60,
  hx = NULL,
  tau = NULL,
  ht = NULL,
  tau_p = tau,
  htp = NULL,
  cols = NULL,

  tau_p = tau,
  htp = ht,
  rgrid = NULL,
  hr = NULL,
  interest = NULL,
  cfp_slist = NULL

Arguments

Data frame; bond data to estimate discount curve from. See ycevo_data() for an example bond data structure. Minimum required columns are qdate, id, price, tupq, and pdint. The columns can be named differently: see cols.


Time grids at which the discount curve is evaluated. Should be specified using the same class of object as the quotation date (qdate) column in data.


Half of the window size, that is, the distance from the centre x to the furthest qdate on either side that still receives non-zero weight from the kernel function, measured in the number of regular intervals between two consecutive qdate values. Ignored if hx is specified. See Details.


Numeric vector. Bandwidth parameters corresponding to each time point x.


Numeric vector. Time-to-maturities in years where discount function and yield curve will be estimated for each of time points x. See Details.


Numeric vector. Bandwidth parameters corresponding to each value of time-to-maturities tau. See Details.


Numeric vector. Auxiliary time-to-maturities in years. See Details.


Numeric vector. Bandwidth parameters corresponding to each of auxiliary time-to-maturities tau_p. See Details.


<tidy-select> A named list or vector of alternative names of required variables, following the new_name = old_name syntax of dplyr::rename(), where new_name is one of the five column names required in data. This lets the user provide data whose columns are named differently from the required names.


Specification of an additional covariate, taking the form of var = list(grid, bandwidth), where var is the name of the covariate in data, grid is the values at which the yield curve is estimated, similar to x, and bandwidth is the bandwidth parameter corresponding to each of the grid values, similar to hx.


Numeric vector. Values between 0 and 1. Time grids over the entire time horizon (percentile) of the data at which the discount function is evaluated.


(Optional) Numeric vector. Interest rate grids in percentage at which the discount function is evaluated, e.g. 4.03 means at interest rate of 4.03%.


(Optional) Numeric vector. Bandwidth parameter in percentage determining the size of the window in the kernel function that corresponds to each interest rate grid ('rgrid').


(Optional) Numeric vector. Daily short term interest rates. The length is the same as the number of quotation dates included in the data, i.e. one interest rate per day.


(Internal) Experienced users only. A list of matrices, generated by the internal function 'get_cfp_slist'.
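
The new_name = old_name syntax mentioned for cols can be illustrated with dplyr::rename() itself. The sketch below uses a small made-up data frame whose quotation date column is called date (a hypothetical name) and renames it to the required qdate; the values are invented for illustration only.

```r
library(dplyr)

# Hypothetical bond data whose date column is called `date` instead of `qdate`
bonds_raw <- data.frame(
  date  = as.Date(c("2023-03-01", "2023-03-01")),
  id    = c("A", "B"),
  price = c(101.2, 99.8),
  tupq  = c(180, 365),
  pdint = c(2.5, 103)
)

# new_name = old_name: the required name goes on the left-hand side
bonds_std <- rename(bonds_raw, qdate = date)
names(bonds_std)
```

Going by the argument description, passing the same qdate = date pair through cols would presumably make the manual rename unnecessary; that reading is inferred from the text above and is not verified here.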
Details

Suppose that a bond $i$ has a price $p_i$ at time $t$ with a set of cash payments, say $c_1, c_2, \ldots, c_m$, and a set of corresponding discount values $d_1, d_2, \ldots, d_m$. In the bond pricing literature, the market price of a bond should reflect the discounted value of its cash payments. Thus, we want to minimise

$$(p_i-\sum^m_{j=1}c_j\times d_j)^2.$$

For the estimation of $d_k$ $(k=1, \ldots, m)$, solving the first order condition yields

$$(p_i-\sum^m_{j=1}c_j \times d_j)c_k = 0,$$


$$\hat{d}_k = \frac{p_i c_k}{c_k^2} - \frac{\sum^m_{j=1,j\neq k}c_k c_j d_j}{c_k^2}.$$
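
As a small numeric illustration of the first order condition, the sketch below solves for one discount value of a single bond while holding the others fixed. The cash payments, discount values, and the helper name solve_dk are made up for this example and are not part of the package.

```r
# Hypothetical bond: three cash payments and assumed discount values
cash  <- c(5, 5, 105)          # c_1, c_2, c_3
disc  <- c(0.98, 0.95, 0.92)   # d_1, d_2, d_3 (treated as known here)
price <- sum(cash * disc)      # price consistent with these discount values

# Solve the first order condition for d_k, given the other discount values
solve_dk <- function(k, price, cash, disc) {
  others <- sum(cash[-k] * disc[-k])   # sum over j != k of c_j * d_j
  (price * cash[k] - cash[k] * others) / cash[k]^2
}

d3_hat <- solve_dk(3, price, cash, disc)
d3_hat
```

Because the price was built from the same discount values, the function returns the third entry of disc. Changing any other entry of disc without changing the price changes the result, since the solution for one discount value depends on all the others.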

There are challenges: $\hat{d}_k$ depends on all the other discount values for the cash payments of the bond. Our model contains random errors, and our interest lies in the expected value of $d(.)$, where the expected value of the errors is zero. $d(.)$ is an infinite-dimensional function, not a discrete finite-dimensional vector. Cash payments are generally made biannually, so they are not dense in time. Moreover, cash payment schedules vary across bonds.

Let $d(\tau, X_t)$ be the discount function at given covariates $X_t$ (dates x and interest rates rgrid) and given time-to-maturities $\tau$ (tau). Likewise, $y(\tau, X_t)$ is the yield curve at the same covariates and time-to-maturities.

We pursue the minimum of the following smoothed sample least squares objective function for any smooth function $d(.)$:

$$Q(d) = \sum^T_{t=1}\sum^n_{i=1}\int\{p_{it}-\sum^{m_{it}}_{j=1}c_{it}(\tau_{ij})d(s_{ij}, x)\}^2 \sum^{m_{it}}_{k=1}\{K_h(s_{ik}-\tau_{ik})ds_{ik}\}K_h(x-X_t)dx,$$

where a bond $i$ has a price $p_i$ at time $t$ with a set of cash payments $c_1, c_2, \ldots, c_m$ and a set of corresponding discount values $d_1, d_2, \ldots, d_m$, and $K_h(.) = K(./h)$ is the kernel function with a bandwidth parameter $h$. The first kernel function is the kernel in space, which weights bonds whose maturities $s_{ik}$ are close to the sequence $\tau_{ik}$. The second kernel function is the kernel in time and in interest rates, which weights values of $x$ that are close to the sequence $X_t$. This means that bonds with similar cash flows, traded on nearby days on which the short term interest rates in the market are similar, are combined for the estimation of the discount function at a point in space, in time, and in "interest rates".
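
To make the role of the two kernels more concrete, the sketch below computes weights of the form $K_h(u) = K(u/h)$ over time to maturity and over quotation days, then combines them by multiplication. The Epanechnikov kernel, the bandwidths, and all data values are assumptions chosen for illustration; the text above does not state which kernel is used.

```r
# Epanechnikov kernel (an assumed choice of K)
epan <- function(u) ifelse(abs(u) <= 1, 0.75 * (1 - u^2), 0)

# Scaled kernel K_h(u) = K(u / h), as defined in the text
k_h <- function(u, h) epan(u / h)

# Hypothetical cash-flow times (years) and quotation-day positions
s     <- c(0.5, 1.0, 1.5, 2.0, 5.0)
t_idx <- c(1, 10, 20, 40, 100)

tau <- 1.5; ht <- 1    # target time to maturity and its bandwidth (assumed)
x   <- 20;  hx <- 30   # target quotation day and its bandwidth (assumed)

w_maturity <- k_h(s - tau, ht)     # kernel in space
w_time     <- k_h(t_idx - x, hx)   # kernel in time

# Weight of each (quotation day, cash-flow time) pair
w <- outer(w_time, w_maturity)
round(w, 3)
```

Pairs that are far from the target in either dimension receive zero weight, so only bonds observed near day x with cash flows near tau contribute to the local fit.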

The estimator for the discount function over time to maturity and time is

$$\hat{d}=\arg\min_d Q(d).$$

This function provides a data frame of the estimated yield and discount rate at each combination of the provided grids. The estimated yield is transformed from the estimated discount rate.
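
The text does not spell out the transformation from discount to yield. A common convention, assumed here rather than confirmed by the source, is the continuously compounded yield $y(\tau) = -\log d(\tau) / \tau$. The helper name discount_to_yield is hypothetical.

```r
# Assumed convention: continuously compounded yield, y = -log(d) / tau
discount_to_yield <- function(discount, tau) -log(discount) / tau

tau      <- c(0.5, 1, 2, 5)     # time to maturity in years
discount <- exp(-0.03 * tau)    # discount values implied by a flat 3% yield

yield <- discount_to_yield(discount, tau)
yield
```

Under this convention a flat yield curve corresponds to a discount function that decays exponentially in tau, which is why every element of yield equals the rate used to build discount.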

An alternative specification of bandwidth hx is span_x, which provides kernel coverage invariant to the length of data. span_x takes an absolute measure of time depending on the unit of x. The default value is 60. If the data is daily on trading days, i.e., the interval between every two consecutive qdate is one trading day, then the window of the kernel function allows the estimation at each point x to contain information from 60 trading days prior to and after the time point x.
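
The window implied by span_x can be sketched by counting positions in the sequence of quotation dates rather than calendar days. The trading calendar below is hypothetical (weekdays only, holidays ignored), and the code only shows which dates fall inside the window; it does not reproduce the package's internal weighting.

```r
# Hypothetical trading calendar: weekdays in 2023, holidays ignored
all_days <- seq(as.Date("2023-01-02"), as.Date("2023-12-29"), by = "day")
qdate    <- all_days[!as.POSIXlt(all_days)$wday %in% c(0, 6)]

x      <- as.Date("2023-06-01")
span_x <- 60

# Distance from x in number of trading days, not calendar days
dist <- seq_along(qdate) - match(x, qdate)

in_window <- qdate[abs(dist) <= span_x]
range(in_window)
length(in_window)
```

Measuring distance in trading-day positions keeps the window at 60 observations on each side of x even when weekends make the calendar span longer.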

For more information on the estimation method, please refer to References.
Value

A tibble::tibble() object of class ycevo with the following columns.


The time points specified by the user as x. If the user provides a data frame in which the time index column is named differently from qdate (via the cols argument), this column takes the same name as the time index column in the data input.


A nested column of estimation results containing a tibble::tibble() for each qdate. Each tibble contains three columns: tau, the time-to-maturity specified by the user in the tau argument; .discount, the estimated discount function at this time and this time-to-maturity; and .yield, the estimated yield curve.

References

Koo, B., La Vecchia, D., & Linton, O. (2021). Estimation of a nonparametric model for bond prices from cross-section and time series information. Journal of Econometrics, 220(2), 562-588.

See Also

augment.ycevo(), autoplot.ycevo()
Examples

# Simulating bond data
bonds <- ycevo_data(n = 10)

# Estimation can take up to 30 seconds
ycevo(bonds, x = lubridate::ymd("2023-03-01"))

[Package ycevo version 0.2.1 Index]