Improving inference runtime for Dirichlet Process Bayesian Neural Networks
- Status: Finished
- Type: Master Thesis
- Announcement date: 18 Oct 2019
- Mentors
- Research Areas
Short Description:
Dirichlet processes are an elegant tool to enforce parameter sharing in Bayesian models. Their most prominent use case is the generalization of (Gaussian) mixture models to an a priori unknown number of mixture components, such that an appropriate number is obtained during inference [1]. Recently, they have been used to enforce weight sharing in Bayesian neural networks to reduce the memory overhead of storing a large ensemble [2]. However, Bayesian inference using a sampling-based approach is slow and does not scale to larger datasets.
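To illustrate how a Dirichlet process lets the number of mixture components grow with the data rather than being fixed in advance, the following minimal Python/NumPy sketch samples cluster assignments from the Chinese restaurant process view of a Dirichlet process. This is only an illustrative example; the function name `crp_assignments` and the choice of concentration parameter `alpha` are assumptions and not taken from the referenced works.

```python
# Minimal illustrative sketch (not from [1] or [2]): cluster assignments
# drawn from a Dirichlet process via the Chinese restaurant process.
import numpy as np

def crp_assignments(n, alpha, seed=None):
    """Sample n cluster assignments from a Chinese restaurant process."""
    rng = np.random.default_rng(seed)
    counts = []        # current cluster sizes ("customers per table")
    assignments = []
    for _ in range(n):
        # Join an existing cluster with probability proportional to its size;
        # open a new cluster with probability proportional to alpha.
        probs = np.array(counts + [alpha], dtype=float)
        probs /= probs.sum()
        k = rng.choice(len(probs), p=probs)
        if k == len(counts):
            counts.append(1)   # a new cluster is created
        else:
            counts[k] += 1
        assignments.append(k)
    return np.array(assignments)

if __name__ == "__main__":
    z = crp_assignments(n=100, alpha=1.0, seed=0)
    print("number of clusters:", z.max() + 1)
```

For alpha = 1 and 100 draws, the expected number of clusters grows only logarithmically with the number of data points (roughly alpha * log(n), i.e. about 5 here), which is what makes the prior useful when the appropriate number of components is unknown.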
In this thesis, different approximations to speed up inference should be implemented and evaluated on benchmark datasets.
Requirements:
- Good programming skills (Matlab or Python; preferably experience with numpy and Theano/Tensorflow or similar frameworks)
- Strong interest in Machine Learning, especially in Bayesian inference (at least having completed the CI/EW lectures or similar; ideally with experience in neural networks)
- Basic knowledge of probability theory
[1] S. J. Gershman and D. M. Blei, "A tutorial on Bayesian nonparametric models," Journal of Mathematical Psychology, vol. 56, no. 1, pp. 1-12, 2012.
[2] W. Roth and F. Pernkopf, "Bayesian neural networks with weight sharing using Dirichlet processes," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018.