BayesianSDF {BayesianFactorZoo} | R Documentation |
Bayesian estimation of Linear SDF (B-SDF)
Description
This function provides the Bayesian estimates of factors' risk prices. The estimates with the flat prior are given by Definitions 1 and 2 in Bryzgalova et al. (2023). The estimates with the normal prior are used in Table I (see the footnote of Table I).
Usage
BayesianSDF(
f,
R,
sim_length = 10000,
intercept = TRUE,
type = "OLS",
prior = "Flat",
psi0 = 5,
d = 0.5
)
Arguments
f |
A |
R |
A |
sim_length |
The length of MCMCs |
intercept |
If |
type |
If |
prior |
If |
psi0 |
The hyper-parameter of the prior distribution of risk prices |
d |
The hyper-parameter of the prior distribution of risk prices |
Details
Intercept
Consider the cross-sectional step. If one includes the intercept, the model is
\mu_R = \lambda_c 1_N + C_f \lambda_f = C \lambda,
where C = (1_N, C_f)
and \lambda^\top = (\lambda_c^\top, \lambda_f^\top)^\top
.
If one doesn't include the intercept, the model is
\mu_R = C_f \lambda_f = C \lambda,
where C = C_f
and \lambda = \lambda_f
.
Bayesian Estimation
Let Y_t = f_t \cup R_t
. Conditional on the data Y = \{Y_t\}_{t=1}^T
, we can draw \mu_{Y}
and \Sigma_{Y}
from the Normal-inverse-Wishart system
\mu_Y | \Sigma_Y, Y \sim N (\hat{\mu}_Y , \Sigma_Y / T) ,
\Sigma_Y | Y \sim W^{-1} (T-1, \Sigma_{t=1}^{T} (Y_t - \hat{\mu}_Y ) ( Y_t - \hat{\mu}_Y )^\top ) ,
where W^{-1}
is the inverse-Wishart distribution.
We do not standardize Y_t
in the time-series regression.
In the empirical implementation, after obtaining posterior draws for \mu_{Y}
and \Sigma_{Y}
,
we calculate \mu_R
and C_f
as the standardized expected returns of test assets and correlation
between test assets and factors. It follows that C
is a matrix containing a vector of ones and C_f
.
The prior distribution of risk prices is either the flat prior or the normal prior.
With prior = 'Flat'
and type = 'OLS'
, for each draw, the risk price estimate is
\hat{\lambda} = (C^{\top} C)^{-1}C^{T} \mu_{R} .
With prior = 'Flat'
and type = 'GLS'
, for each draw, the risk price estimate is
\hat{\lambda} = (C^{\top} \Sigma^{-1}_{R} C)^{-1} C^{\top} \Sigma^{-1}_{R} \mu_{R}
If one chooses prior = 'Normal'
, the prior of factor j
's risk price is
\lambda_j | \sigma^2 \sim N(0, \sigma^2 \psi \tilde{\rho}_j^\top \tilde{\rho}_j T^d ) ,
where \tilde{\rho}_j = \rho_j - (\frac{1}{N} \Sigma_{i=1}^{N} \rho_{j,i} ) \times 1_N
is the cross-sectionally
demeaned vector of factor j
's correlations with asset returns. Equivalently,
\lambda | \sigma^2 \sim N(0, \sigma^2 D^{-1}) ,
D = diag \{ (\psi \tilde{\rho}_1^\top \tilde{\rho}_1 T^d)^{-1}, ..., (\psi \tilde{\rho}_k^\top \tilde{\rho}_k T^d)^{-1} \} \ \ without \ intercept;
D = diag \{ c, (\psi \tilde{\rho}_1^\top \tilde{\rho}_1 T^d)^{-1}, ..., (\psi \tilde{\rho}_k^\top \tilde{\rho}_k T^d)^{-1} \} \ \ with \ intercept;
where c
is a small positive number corresponding to the common cross-sectional intercept (\lambda_c
).
Default values for \psi
(psi0
) and d
(d
) are 5 and 0.5, respectively.
With prior = 'Normal'
and type = 'OLS'
, for each draw, the risk price estimate is
\hat{\lambda} = ( C^{\top} C +D )^{-1} C^{\top} \mu_R .
With prior = 'Normal'
and type = 'GLS'
, for each draw, the risk price estimate is
\hat{\lambda} = ( C^{\top} \Sigma_R^{-1} C +D )^{-1} C^{\top} \Sigma_R^{-1} \mu_R .
Value
The return of BayesianSDF
is a list that contains the following elements:
-
lambda_path
: Asim_length
\times (k+1)
matrix if the intercept is included. NOTE: the first column\lambda_c
corresponds to the intercept. The nextk
columns (i.e., the 2th –(k+1)
-th columns) are the risk prices ofk
factors. If the intercept is excluded, the dimension oflambda_path
issim_length
\times k
. -
R2_path
: Asim_length
\times 1
matrix, which contains the posterior draws of the OLS or GLSR^2
.
References
Bryzgalova S, Huang J, Julliard C (2023). “Bayesian solutions for the factor zoo: We just ran two quadrillion models <https://doi.org/10.1111/jofi.13197>.” Journal of Finance, 78(1), 487–557.
Examples
## <-------------------------------------------------------------------------------->
## Example: Bayesian estimates of risk prices and R2
## This example is from the paper (see Section III. Simulation)
## <-------------------------------------------------------------------------------->
library(reshape2)
library(ggplot2)
# Load the example data
data("BFactor_zoo_example")
HML <- BFactor_zoo_example$HML
lambda_ols <- BFactor_zoo_example$lambda_ols
R2.ols.true <- BFactor_zoo_example$R2.ols.true
sim_f <- BFactor_zoo_example$sim_f
sim_R <- BFactor_zoo_example$sim_R
uf <- BFactor_zoo_example$uf
W_ols <- BFactor_zoo_example$W_ols
cat("Load the simulated example \n")
cat("Cross-section: Fama-French 25 size and value portfolios \n")
cat("True pricing factor in simulations: HML \n")
cat("Pseudo-true cross-sectional R-squared:", R2.ols.true, "\n")
cat("Pseudo-true (monthly) risk price:", lambda_ols[2], "\n")
cat("----------------------------- Bayesian SDF ----------------------------\n")
cat("------------------------ See definitions 1 and 2 ----------------------\n")
cat("--------------------- Bayesian SDF: Strong factor ---------------------\n")
sim_result <- SDF_gmm(sim_R, sim_f, W_ols) # GMM estimation
# sim_result$lambda_gmm
# sqrt(sim_result$Avar_hat[2,2])
# sim_result$R2_adj
## Now estimate the model using Bayesian method
two_step <- BayesianSDF(sim_f, sim_R, sim_length = 2000, psi0 = 5, d = 0.5)
# apply(X = two_step$lambda_path, FUN = quantile, MARGIN = 2, probs = c(0.05, 0.95))
# quantile(two_step$R2_path, probs = c(0.05, 0.5, 0.95))
# Note that the first element correspond to lambda of the constant term
# So we choose k=2 to get lambda of the strong factor
k <- 2
m1 <- sim_result$lambda_gmm[k]
sd1 <- sqrt(sim_result$Avar_hat[k,k])
bfm<-two_step$lambda_path[1001:2000, k]
fm<-rnorm(5000,mean = m1, sd=sd1)
data<-data.frame(cbind(fm, bfm))
colnames(data)<-c("GMM-OLS", "BSDF-OLS")
data.long<-melt(data)
#
### Figure 1(c)
#
p <- ggplot(aes(x=value, colour=variable, linetype=variable), data=data.long)
p+
stat_density(aes(x=value, colour=variable),
geom="line",position="identity", size = 2, adjust=1) +
geom_vline(xintercept = lambda_ols[2], linetype="dotted", color = "#8c8c8c", size=1.5)+
guides(colour = guide_legend(override.aes=list(size=2), title.position = "top",
title.hjust = 0.5, nrow=1,byrow=TRUE))+
theme_bw()+
labs(color=element_blank()) +
labs(linetype=element_blank()) +
theme(legend.key.width=unit(4,"line")) +
theme(legend.position="bottom")+
theme(text = element_text(size = 26))+
xlab(bquote("Risk price ("~lambda[strong]~")")) +
ylab("Density" )
cat("--------------------- Bayesian SDF: Useless factor --------------------\n")
sim_result <- SDF_gmm(sim_R, uf, W_ols)
# sim_result$lambda_gmm
# sqrt(sim_result$Avar_hat[2,2])
# sim_result$R2_adj
two_step <- BayesianSDF(uf, sim_R, sim_length = 2000, psi0 = 5, d = 0.5)
#apply(X = two_step$lambda_path, FUN = quantile, MARGIN = 2, probs = c(0.05, 0.95))
## Posterior (Asymptotic) Distribution of lambda
k <- 2
m1 <- sim_result$lambda[k]
sd1 <- sqrt(sim_result$Avar_hat[k,k])
bfm<-two_step$lambda_path[1001:2000, k]
fm<-rnorm(5000,mean = m1, sd=sd1)
data<-data.frame(cbind(fm, bfm))
colnames(data)<-c("GMM-OLS", "BSDF-OLS")
data.long<-melt(data)
#
### Figure 1(a)
#
p <- ggplot(aes(x=value, colour=variable, linetype=variable), data=data.long)
p+
stat_density(aes(x=value, colour=variable),
geom="line",position="identity", size = 2, adjust=2) +
geom_vline(xintercept = 0, linetype="dotted", color = "#8c8c8c", size=1.5)+
guides(colour = guide_legend(override.aes=list(size=2),
title.position = "top", title.hjust = 0.5, nrow=1,byrow=TRUE))+
theme_bw()+
labs(color=element_blank()) +
labs(linetype=element_blank()) +
theme(legend.key.width=unit(4,"line")) +
theme(legend.position="bottom")+
theme(text = element_text(size = 26))+
xlab(bquote("Risk price ("~lambda[spurious]~")")) +
ylab("Density" )