mpost.euc {SBmedian} | R Documentation |
Median Posterior for Subset Posterior Samples in Euclidean Space
Description
mpost.euc
is a general framework to merge multiple
empirical measures Q_1,Q_2,\ldots,Q_M \subset R^p
from independent subset of data by finding a median
\hat{Q} = \textrm{argmin}_Q \sum_{m=1}^M d(Q,Q_m)
where Q
is a weighted combination and d(P_1,P_2)
is distance in RKHS between two empirical measures P_1
and P_2
.
As in the references, we use RBF kernel with bandwidth parameter \sigma
.
Usage
mpost.euc(
splist,
sigma = 0.1,
maxiter = 121,
abstol = 1e-06,
show.progress = FALSE
)
Arguments
splist |
a list of length |
sigma |
bandwidth parameter for RBF kernel. |
maxiter |
maximum number of iterations for Weiszfeld algorithm. |
abstol |
stopping criterion for Weiszfeld algorithm. |
show.progress |
a logical; |
Value
a named list containing:
- med.atoms
a vector or matrix of all atoms aggregated.
- med.weights
a weight vector that sums to 1 corresponding to
med.atoms
.- weiszfeld.weights
a weight for
M
subset posteriors.- weiszfeld.history
updated parameter values. Each row is for iteration, while columns are weights corresponding to
weiszfeld.weights
.
References
Minsker S, Srivastava S, Lin L, Dunson DB (2014). “Scalable and Robust Bayesian Inference via the Median Posterior.” In Proceedings of the 31st International Conference on International Conference on Machine Learning - Volume 32, ICML’14, II–1656–II–1664. event-place: Beijing, China.
Minsker S, Srivastava S, Lin L, Dunson DB (2017). “Robust and Scalable Bayes via a Median of Subset Posterior Measures.” Journal of Machine Learning Research, 18(124), 1–40. https://jmlr.org/papers/v18/16-655.html.
Examples
## Median Posteior from 2-D Gaussian Samples
# Step 1. let's build a list of atoms whose numbers differ
set.seed(8128) # for reproducible results
mydata = list()
mydata[[1]] = cbind(rnorm(96, mean= 1), rnorm(96, mean= 1))
mydata[[2]] = cbind(rnorm(78, mean=-1), rnorm(78, mean= 0))
mydata[[3]] = cbind(rnorm(65, mean=-1), rnorm(65, mean= 1))
mydata[[4]] = cbind(rnorm(77, mean= 2), rnorm(77, mean=-1))
# Step 2. Let's run the algorithm
myrun = mpost.euc(mydata, show.progress=TRUE)
# Step 3. Visualize
# 3-1. show subset posterior samples
opar <- par(no.readonly=TRUE)
par(mfrow=c(2,3), no.readonly=TRUE)
for (i in 1:4){
plot(mydata[[i]], cex=0.5, col=(i+1), pch=19, xlab="", ylab="",
main=paste("subset",i), xlim=c(-4,4), ylim=c(-3,3))
}
# 3-2. 250 median posterior samples via importance sampling
id250 = base::sample(1:nrow(myrun$med.atoms), 250, prob=myrun$med.weights, replace=TRUE)
sp250 = myrun$med.atoms[id250,]
plot(sp250, cex=0.5, pch=19, xlab="", ylab="",
xlim=c(-4,4), ylim=c(-3,3), main="median samples")
# 3-3. convergence over iterations
matplot(myrun$weiszfeld.history, xlab="iteration", ylab="value",
type="b", main="convergence of weights")
par(opar)