Improving inference runtime for Dirichlet Process Bayesian Neural Networks
- Status: Finished
- Type: Master Thesis
- Announcement date: 18 Oct 2019
- Mentors
- Research Areas
Short Description:
Dirichlet processes are an elegant tool to enforce parameter sharing in Bayesian models. Their most prominent use case is the generalization of (Gaussian) mixture models to an a priori unknown number of mixture components, such that an appropriate number is obtained during inference [1]. Recently, they have been used to enforce weight sharing in Bayesian neural networks to reduce the memory overhead of storing a large ensemble [2]. However, Bayesian inference using a sampling-based approach is slow and does not scale to larger datasets.
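To illustrate how a Dirichlet process lets the number of mixture components grow with the data rather than being fixed in advance, the following minimal Python/NumPy sketch samples cluster assignments from the Chinese restaurant process view of a Dirichlet process. This is only an illustrative example; the function name `crp_assignments` and the choice of concentration parameter `alpha` are assumptions and not taken from the referenced works.

```python
# Minimal illustrative sketch (not from [1] or [2]): cluster assignments
# drawn from a Dirichlet process via the Chinese restaurant process.
import numpy as np

def crp_assignments(n, alpha, seed=None):
    """Sample n cluster assignments from a Chinese restaurant process."""
    rng = np.random.default_rng(seed)
    counts = []        # current cluster sizes ("customers per table")
    assignments = []
    for _ in range(n):
        # Join an existing cluster with probability proportional to its size;
        # open a new cluster with probability proportional to alpha.
        probs = np.array(counts + [alpha], dtype=float)
        probs /= probs.sum()
        k = rng.choice(len(probs), p=probs)
        if k == len(counts):
            counts.append(1)   # a new cluster is created
        else:
            counts[k] += 1
        assignments.append(k)
    return np.array(assignments)

if __name__ == "__main__":
    z = crp_assignments(n=100, alpha=1.0, seed=0)
    print("number of clusters:", z.max() + 1)
```

For alpha = 1 and 100 draws, the expected number of clusters grows only logarithmically with the number of data points (roughly alpha * log(n), i.e. about 5 here), which is what makes the prior useful when the appropriate number of components is unknown.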
In this thesis, different approximations to speed up inference should be implemented and evaluated on benchmark datasets.
Requirements:
- Good programming skills (Matlab or Python; preferably experience with numpy and Theano/Tensorflow or similar frameworks)
- Strong interest in Machine Learning, especially in Bayesian inference (at least having completed the CI/EW lectures or similar; ideally with experience in neural networks)
- Basic knowledge of probability theory
[1] S. J. Gershman and D. M. Blei, "A tutorial on Bayesian nonparametric models," Journal of Mathematical Psychology, vol. 56, no. 1, pp. 1-12, 2012.
[2] W. Roth and F. Pernkopf, "Bayesian neural networks with weight sharing using Dirichlet processes," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018.