Jointly Modeling Source Separation and Speech Recognition
- Status: Open
- Type: Master Thesis
- Announcement date: 05 Oct 2015
- Mentors:
- Research Areas:
Short Description
Speech recognition under realistic conditions remains an unsolved problem after decades of research, yet the smartphone market, for example, demands working solutions with small resource footprints. Usually, phone recognition and source separation models (e.g., separating the speech and noise signals) are trained independently and applied in sequence. This work should bring both aspects together, either through phone-aware source separation or through joint training of a source separation and phone recognition model. Recently, the source separation problem has been formulated as a structured prediction problem (like classification, but on sequences) [1] using Linear-chain Conditional Random Fields (LC-CRFs) [2]. LC-CRFs have been extended at our lab; your task will be to use or extend these models.
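For orientation, the standard LC-CRF formulation from [2] (notation chosen here for illustration, not taken from the announcement) models the conditional probability of a label sequence $y = (y_1, \dots, y_T)$ given an observation sequence $x$ as

$$
p(y \mid x) \;=\; \frac{1}{Z(x)} \exp\!\Big( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y_{t-1}, y_t, x, t) \Big),
$$

where the $f_k$ are feature functions, the $\lambda_k$ are learned weights, and the partition function $Z(x)$ sums the same exponential over all possible label sequences. In the setting of [1], the labels can, for example, encode time-frequency mask values per frame, so that decoding the most probable label sequence yields a separation mask.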
Your Tasks
- Prepare a data set in Matlab
- Implement or extend these models in Java (there is an existing implementation; see the sketch after this list)
- Analyze the implemented systems in terms of accuracy and computational performance
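As a starting point, the following is a minimal Java sketch of the log-domain forward pass used for LC-CRF inference, i.e. computing the log partition function log Z(x) from the formula above. The class name, method signature, and score-array layout are illustrative assumptions and not the lab's existing implementation:

```java
// Illustrative sketch only: log-domain forward pass for a linear-chain CRF.
// The score-array layout below is a hypothetical convention, not the lab's code.
public final class LcCrfForward {

    /**
     * @param logScores logScores[t][i][j] = sum_k lambda_k * f_k(y_{t-1}=i, y_t=j, x, t),
     *                  i.e. the combined transition and emission log-score at position t.
     *                  For t = 0 only logScores[0][0][j] is read (no predecessor label).
     * @return log Z(x), the log partition function over all label sequences.
     */
    public static double logPartition(double[][][] logScores) {
        int T = logScores.length;          // sequence length
        int S = logScores[0][0].length;    // number of labels

        double[] alpha = new double[S];    // alpha[j] = log-sum of all paths ending in label j
        for (int j = 0; j < S; j++) {
            alpha[j] = logScores[0][0][j]; // initial scores at position 0
        }

        double[] next = new double[S];
        for (int t = 1; t < T; t++) {
            for (int j = 0; j < S; j++) {
                // next[j] = log sum_i exp(alpha[i] + score(i -> j, t)), via log-sum-exp
                double max = Double.NEGATIVE_INFINITY;
                for (int i = 0; i < S; i++) {
                    max = Math.max(max, alpha[i] + logScores[t][i][j]);
                }
                double sum = 0.0;
                for (int i = 0; i < S; i++) {
                    sum += Math.exp(alpha[i] + logScores[t][i][j] - max);
                }
                next[j] = max + Math.log(sum);
            }
            System.arraycopy(next, 0, alpha, 0, S);
        }

        // Final log-sum-exp over all end labels gives log Z(x).
        double max = Double.NEGATIVE_INFINITY;
        for (int j = 0; j < S; j++) max = Math.max(max, alpha[j]);
        double sum = 0.0;
        for (int j = 0; j < S; j++) sum += Math.exp(alpha[j] - max);
        return max + Math.log(sum);
    }
}
```

Working in the log domain with the log-sum-exp trick, as above, is the usual way to keep the forward recursion numerically stable for long sequences; the same recursion underlies gradient computation during training.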
Your Profile
- Very good theoretical and mathematical background (mandatory)
- Good knowledge in machine learning
- Very good knowledge and experience in Java programming (mandatory)
Additional Information
As this work combines theoretical and experimental aspects of non-standard methods, a very good mathematical and programming background is mandatory. The thesis project is planned for a duration of six months, starting immediately, and offers a good chance of publication.
Contact
Martin Ratajczak (martin.ratajczak@tugraz.at or +43 (316) 873 - 4379)
References
[1] Y. Wang and D. Wang, “Cocktail party processing via structured prediction,” in Advances in Neural Information Processing Systems (NIPS), 2012, pp. 224–232.
[2] J. Lafferty, A. McCallum, and F. Pereira, “Conditional random fields: Probabilistic models for segmenting and labeling sequence data,” in International Conference on Machine Learning (ICML), 2001, pp. 282–289.