Signal Processing and Speech Communication Laboratory

Improving Efficiency and Generalization in Deep Learning Models for Industrial Applications

Status: Finished
Student: Alex Fuchs
Mentor: Franz Pernkopf

Over the last decade, deep learning methods have gained increasing traction in industrial applications, ranging from image-based automated quality control and signal enhancement to condition-monitoring tasks. While deep learning has immensely increased the performance and capabilities of machine learning models, it has also increased their vulnerability. Moreover, these models require vast amounts of data in order to generalize well. This is problematic for industrial applications, where the amount of available data is often limited; most practical applications outside the field of big data have to deal with scarce data. This is especially true for supervised tasks, as creating labeled datasets often involves expensive expert labour. In contrast, big data methods can rely on increasingly large datasets, solving the problem of generalization at the data level and allowing for ever bigger and more flexible models that further push performance. For small-scale applications, however, these models often suffer from a large drop in performance due to distribution shifts, which affect the brittle representations learned by deep neural networks. This limits their applicability in production or safety-critical settings, so simpler machine-learning methods are often preferred, trading performance for robustness and data efficiency.

This thesis focuses on methods that carry over the benefits of big data models to small- and medium-scale industrial applications. We aim to increase the data efficiency and generalization abilities of deep learning models while offering computationally efficient solutions that can be deployed in resource-constrained environments. We focus on two aspects: (i) implementing specialized, task-specific architectures that include prior knowledge about the data, and (ii) increasing generalization and robustness by mitigating domain shifts during the operational phase of the model. For aspect (i), we design architectures according to data priors, which improves model generalization and reduces the amount of training data needed. In particular, we investigate deep learning architectures that explicitly model different time scales, feature hierarchies, or the complex-valued nature of the data, as sketched below. Our results on various industry-related tasks show that specialized architectures can substantially improve generalization and data efficiency while keeping the models computationally parsimonious.
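
As a concrete illustration of aspect (i), the sketch below shows how a layer can respect the complex-valued nature of, e.g., spectral data by implementing complex multiplication with two real weight matrices. It is a minimal sketch assuming PyTorch; the class name ComplexLinear and the dimensions are illustrative assumptions, not the architectures developed in this thesis.

```python
import torch
import torch.nn as nn

class ComplexLinear(nn.Module):
    """Linear layer acting on complex-valued inputs (illustrative sketch).

    Models (W_r + i*W_i)(x_r + i*x_i) with two real weight matrices, so the
    layer respects the algebra of the complex-valued data instead of
    treating real and imaginary parts as unrelated channels.
    """
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.w_real = nn.Linear(in_features, out_features, bias=False)
        self.w_imag = nn.Linear(in_features, out_features, bias=False)

    def forward(self, x_real: torch.Tensor, x_imag: torch.Tensor):
        # Complex multiplication: (a + ib)(c + id) = (ac - bd) + i(ad + bc)
        out_real = self.w_real(x_real) - self.w_imag(x_imag)
        out_imag = self.w_real(x_imag) + self.w_imag(x_real)
        return out_real, out_imag

# Example: map a spectrogram frame with 257 complex bins to 128 features
x_real, x_imag = torch.randn(8, 257), torch.randn(8, 257)
layer = ComplexLinear(257, 128)
y_real, y_imag = layer(x_real, x_imag)
```

Because both the real and imaginary paths share the same two weight matrices, such a layer needs only half the parameters of an unconstrained real-valued layer applied to stacked real and imaginary channels, which is one way an architectural data prior can buy data efficiency.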

Aspect (ii) tackles the problem of generalization from a different angle, explicitly mitigating distribution shifts with an unsupervised correction method. Such shifts can occur when the model is presented with data from unknown domains, caused for example by sensor drift, noise, or coarsely sampled training data. In our experiments, this kind of domain adaptation is highly effective across all investigated datasets.
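
The abstract does not spell out the correction method itself, so the following is only a minimal sketch of one widely used unsupervised recipe in this spirit: re-estimating batch-normalization statistics on unlabeled target-domain data (AdaBN-style), assuming PyTorch. The helper name adapt_batchnorm_stats and the toy model are hypothetical, not the method of the thesis.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def adapt_batchnorm_stats(model: nn.Module, target_loader) -> nn.Module:
    """Re-estimate BatchNorm running statistics on unlabeled target data.

    No labels and no gradient updates are needed, only forward passes,
    so the correction also fits resource-constrained deployments.
    """
    for m in model.modules():
        if isinstance(m, nn.modules.batchnorm._BatchNorm):
            m.reset_running_stats()   # discard source-domain statistics
            m.momentum = None         # accumulate a cumulative moving average

    model.train()  # BatchNorm only updates its running stats in train mode
    for batch in target_loader:
        model(batch)
    model.eval()
    return model

# Example: adapt a small model to a simulated shifted, unlabeled target stream
model = nn.Sequential(nn.Linear(16, 32), nn.BatchNorm1d(32),
                      nn.ReLU(), nn.Linear(32, 2))
target_loader = [torch.randn(64, 16) + 0.5 for _ in range(10)]  # mean shift
adapt_batchnorm_stats(model, target_loader)
```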