Overparameterization in Sum-Product Networks

Project Type: Master/Diploma Thesis
Project Status: Open

One of the main reasons for the recent success of deep neural networks is their overparameterization. Recent work [1] has shown that overparameterization in deep linear neural networks and linear least squares models results in an implicit acceleration of gradient descent. This finding is of particular relevance, as gradient descent methods are the go-to approach for learning the parameters of neural networks.

One surprising finding of this work is that overparameterization of linear least squares acts as an implicit realization of gradient descent with an additional time-dependent momentum term.
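As a rough illustration, the following Julia sketch compares plain gradient descent on a least squares model with gradient descent on an elementwise depth-2 factorization w = w1 .* w2 of the same weights. The toy data, the particular factorization, step size, and initialization are illustrative assumptions and not the exact setting studied in [1].

    using Random

    # Plain gradient descent on the least-squares weights w.
    function run_plain(X, y; eta = 0.1, T = 500)
        w = zeros(size(X, 2))
        for _ in 1:T
            w -= eta * X' * (X * w - y) / length(y)
        end
        return w
    end

    # Gradient descent on the factors of w = w1 .* w2 (chain-rule updates).
    function run_overparam(X, y; eta = 0.1, T = 500)
        d = size(X, 2)
        w1, w2 = 0.5 .* randn(d), 0.5 .* randn(d)      # asymmetric random init
        for _ in 1:T
            g = X' * (X * (w1 .* w2) - y) / length(y)  # gradient w.r.t. end-to-end w
            w1, w2 = w1 .- eta .* (g .* w2), w2 .- eta .* (g .* w1)
        end
        return w1 .* w2
    end

    Random.seed!(0)
    X = randn(100, 5)
    y = X * randn(5)
    loss(w) = 0.5 * sum(abs2, X * w - y) / length(y)
    println("plain:              ", loss(run_plain(X, y)))
    println("overparameterized:  ", loss(run_overparam(X, y)))

Tracking the loss of both runs over the iterations is one simple way to probe the implicit acceleration effect empirically.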

Sum-product networks (SPNs) are deeply structured mixture models which can be interpreted as neural networks with weighted sum and product operations.
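As a toy illustration (a hypothetical two-variable network, not taken from a specific paper), the Julia snippet below evaluates a minimal SPN: product nodes multiply leaf distributions over disjoint variables, and the root sum node forms their weighted mixture. Deeper SPNs simply stack such sum and product layers.

    bern(p, x) = x == 1 ? p : 1 - p              # Bernoulli leaf density

    # Root sum node over two product nodes; weights w sum to one.
    function spn(x1, x2; w = (0.3, 0.7))
        p1 = bern(0.8, x1) * bern(0.4, x2)       # product node over {x1, x2}
        p2 = bern(0.1, x1) * bern(0.9, x2)       # product node over {x1, x2}
        return w[1] * p1 + w[2] * p2             # weighted sum (mixture) node
    end

    # Normalized weights and leaves yield a proper distribution over (x1, x2).
    println(sum(spn(a, b) for a in 0:1, b in 0:1))   # ≈ 1.0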

The aim of this project is to theoretically and/or empirically examine the effects of overparameterization in SPNs.

Reference: [1] Sanjeev Arora, Nadav Cohen, Elad Hazan: On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization. ICML 2018.

Requirements: The candidate should have experience in machine learning and should be willing to work out mathematical problems. Ideally, the candidate has prior experience with the programming language Julia; this is, however, not a requirement.

Questions: If you have questions regarding the topic, feel free to contact Martin Trapp (martin.trapp@tugraz.at).