Speaker Recognition with deep neural networks

Status

Open

Type

Master Thesis

Announcement date

11 Mar 2015

Mentors

Matthias Zöhrer

Research Areas

Signal Processing and Speech Communication Laboratory

Diploma/Master’s Thesis: Speaker Recognition with deep neural networks

Short Description :

You will extend a powerful neural network to solve speaker recognition tasks. To this end, you will extend the model to learn the time dynamics of the underlying data. The model will be able to classify speech using a sequence-to-sequence learning approach [1]. You will learn how to implement fast and reliable neural network models on a GPU and will gain broad knowledge of machine learning. If possible, the outcome of your work will contribute to a machine learning paper. If you are interested in this fascinating field of science, simply drop me an email.
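To give a rough idea of what "learning the time dynamics" can mean, here is a minimal sketch of a recurrent classifier that reads an utterance frame by frame and outputs speaker probabilities. It is an illustration only: it is not the sequence-to-sequence model of [1], it uses plain Python instead of Theano, the weights are random and untrained, and all names (`init_params`, `classify_utterance`, the layer sizes, the toy feature frames) are assumptions made for this example.

```python
import math
import random


def init_params(n_in, n_hidden, n_speakers, seed=0):
    """Create small random weights for a toy recurrent classifier (untrained)."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        "W_in": mat(n_hidden, n_in),         # input frame -> hidden state
        "W_rec": mat(n_hidden, n_hidden),    # previous hidden state -> hidden state
        "W_out": mat(n_speakers, n_hidden),  # final hidden state -> speaker scores
    }


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def softmax(scores):
    """Turn raw scores into probabilities that sum to one."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_utterance(frames, params):
    """Run the recurrence over all frames and return speaker probabilities."""
    hidden = [0.0] * len(params["W_rec"])
    for frame in frames:
        from_input = matvec(params["W_in"], frame)
        from_past = matvec(params["W_rec"], hidden)
        # The hidden state carries information from earlier frames forward.
        hidden = [math.tanh(a + b) for a, b in zip(from_input, from_past)]
    return softmax(matvec(params["W_out"], hidden))


# Hypothetical toy input: three frames with three features each.
params = init_params(n_in=3, n_hidden=4, n_speakers=2)
utterance = [[0.1, 0.2, 0.3], [0.0, -0.1, 0.4], [0.5, 0.1, -0.2]]
probs = classify_utterance(utterance, params)
print(probs)
```

Because the same weights are applied at every time step, the function accepts utterances of any length. In the thesis itself, the recurrence and its training would be expressed in Theano [2] so that they run on the GPU.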

Your Tasks :

extend a neural network model in Python on the GPU using Theano [2]
analyze the implemented systems in terms of accuracy and performance
contribute to scientific work in the form of a paper

Your Outcome :

learn to implement and simulate very fast neural networks on a GPU
learn how to solve difficult object recognition tasks
get a broad education in applied machine learning

Your Profile :

motivation and reliability are a prerequisite
good knowledge of machine learning and neural networks (more than 2 machine learning courses)
knowledge of Python programming

Additional Information :

This thesis project is planned for a duration of 6 months, starting immediately. As it is a valid contribution to an ongoing research project at the SPSC, it is rewarded with €2640 (€440 per month) and offers a good chance of publication.

Contact :

Matthias Zoehrer (matthias.zoehrer@tugraz.at or +43 (316) 873 - 4385)

References

[1] I. Sutskever, O. Vinyals, Q. V. Le. Sequence to Sequence Learning with Neural Networks, 2014

[2] J. Bergstra, O. Breuleux, F. Bastien, P. Lamblin, R. Pascanu, G. Desjardins, J. Turian, D. Warde-Farley, and Y. Bengio, "Theano: a CPU and GPU math expression compiler," in Proceedings of the Python for Scientific Computing Conference (SciPy), Jun. 2010, oral presentation.

Signal Processing and Speech Communication Laboratory (SPSC),

Graz University of Technology, Inffeldgasse 16c, 8010 Graz, Austria, http://www.spsc.tugraz.at
created March 10, 2014