Signal Processing and Speech Communication Laboratory

Machine Learning Based Speech (or Music) Separation

Status
Open
Type
Master Thesis
Announcement date
10 Oct 2022

Short Description

Consider a single-channel recording of multiple (here, two) simultaneous speakers. Speech separation for such recordings can be formulated as a classification or regression problem in the time-frequency domain. We have recently used deep neural networks for this task. One goal of this thesis is to extend the existing system with recently developed neural network architectures such as DenseNets. The system will be evaluated on available data using common performance measures such as SIR, SDR, SAR, and PESQ. Similar approaches apply to music signals.
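
To illustrate the classification and regression formulations mentioned above, the following minimal sketch computes two commonly used time-frequency training targets for one speaker: an ideal binary mask (a per-bin classification target) and an ideal ratio mask (a per-bin regression target). It is not the lab's prototype system. The helper names `stft_mag` and `ideal_masks`, the frame length, the hop size, and the synthetic sinusoids standing in for the two speakers are all assumptions made for illustration.

```python
import numpy as np

def stft_mag(x, frame_len=256, hop=128):
    """Magnitude spectrogram from Hann-windowed frames (shape: frames x bins)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1))

def ideal_masks(mag_a, mag_b, eps=1e-8):
    """Targets for speaker A, given the clean magnitudes of both speakers."""
    # Classification target: 1 where speaker A dominates the bin, else 0.
    ibm = (mag_a > mag_b).astype(np.float32)
    # Regression target: share of speaker A in each bin, in [0, 1].
    irm = mag_a / (mag_a + mag_b + eps)
    return ibm, irm

# Toy stand-ins for two speakers (assumption: real speech would be used instead).
fs = 16000
t = np.arange(fs) / fs
speaker_a = np.sin(2 * np.pi * 220 * t)
speaker_b = 0.5 * np.sin(2 * np.pi * 1800 * t)
mixture = speaker_a + speaker_b

mag_a = stft_mag(speaker_a)
mag_b = stft_mag(speaker_b)
mag_mix = stft_mag(mixture)

ibm, irm = ideal_masks(mag_a, mag_b)

# Applying the ratio mask to the mixture magnitude estimates speaker A.
est_a = irm * mag_mix
```

In a learned system, a network would predict such a mask from the mixture spectrogram alone; the clean sources are only available when building training targets.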

Tasks

  • Extend the available prototype system.
  • Implement the extended models in TensorFlow.
  • Test the system on an available data set.
  • Empirically verify the algorithms.

Your Profile/Requirements

The candidate should be interested in machine learning, speech processing, and neural networks. Excellent programming skills in C++, Python, or similar languages are required. Interested candidates are encouraged to ask for further information. Supervision of your own project in one of the above-mentioned fields is also possible.

Contact

Franz Pernkopf (pernkopf@tugraz.at or 0316/873 4436)