Localization and Tracking of Speech Sources using Circular Microphone Array
- Status
- Student: Tania Habib
- Mentor: Harald Romsdorfer
- Research Areas
In today’s world, hands-free communication has become an essential part of day-to-day activities. It serves as the acoustic front end of telephony and speech dialog systems, to name a few. In practice, these systems operate in adverse acoustic environments with ambient noise; moreover, the distance between the speaker and the microphones lowers the level of the desired speech signal, resulting in poor-quality signal acquisition. Array signal processing techniques offer improved system performance and make it possible to design multiple-input systems of higher quality. They also make it possible to solve problems such as source localization and tracking, which cannot be solved with single-channel systems. Moreover, accurate detection, localization and tracking of speakers is essential for media processing tasks such as steering a video camera towards an active speaker, enhancing the active speech stream with microphone array beamforming for use in speech recognition, and providing accumulated information for speaker identification.
This thesis deals with localization and tracking tasks in meeting rooms equipped with multiple sensors for recording. A uniform circular microphone array is used to record the various events that take place in such an environment, e.g., spontaneous multi-party speech, speaker turns including short utterances, non-linear human motion, and multiple overlapping speakers and/or background noise sources. Different techniques originating from computational auditory scene analysis and statistical modeling have been investigated and combined to develop algorithms that can localize and track active speech sources in such scenarios.
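To make the role of the array geometry concrete, the following is a minimal sketch of how a uniform circular array relates a source direction to inter-microphone time delays, and how a direction can be recovered from those delays. It is not the algorithm developed in the thesis. It assumes a single far-field source in the array plane, noise-free delays, and a plain grid search; the number of microphones, the radius, the speed of sound, and all function names are illustrative assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def mic_positions(num_mics, radius):
    """Return (x, y) positions of microphones evenly spaced on a circle."""
    return [
        (radius * math.cos(2 * math.pi * m / num_mics),
         radius * math.sin(2 * math.pi * m / num_mics))
        for m in range(num_mics)
    ]


def far_field_delays(azimuth, positions):
    """Arrival time at each microphone relative to the array centre (seconds).

    azimuth is the direction the sound arrives from, in radians.
    Microphones nearer the source receive the wavefront earlier,
    so their delay is negative.
    """
    ux, uy = math.cos(azimuth), math.sin(azimuth)
    return [-(x * ux + y * uy) / SPEED_OF_SOUND for x, y in positions]


def estimate_azimuth(measured, positions, step_deg=1.0):
    """Grid search for the azimuth whose model delays best fit the measured ones."""
    best_az, best_err = 0.0, float("inf")
    for k in range(int(round(360 / step_deg))):
        az = math.radians(k * step_deg)
        model = far_field_delays(az, positions)
        err = sum((a - b) ** 2 for a, b in zip(measured, model))
        if err < best_err:
            best_az, best_err = az, err
    return best_az


# Hypothetical setup: 8 microphones on a circle of 10 cm radius.
positions = mic_positions(num_mics=8, radius=0.1)
true_az = math.radians(60.0)
delays = far_field_delays(true_az, positions)
estimate = estimate_azimuth(delays, positions)
```

In a real meeting room the delays would have to be estimated from noisy, reverberant recordings, and several speakers may be active at once, which is why a simple least-squares fit like this one is only a starting point.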