Acoustic Source Localization with a Circular Microphone Array

Project Type: Master/Diploma Thesis
Student: Ottowitz Lukas
Mentor: Gernot Kubin


Source localization using microphone arrays has become a common approach in a variety of speech applications such as teleconferencing, speaker tracking, voice capturing and many others. This thesis focuses on two state-of-the-art source localization strategies, pairwise time delay estimation (TDE) and steered beamforming, and on their performance in different speech scenarios and environments. Performance comparisons using a 16-channel circular microphone array show that the steered beamforming technique SRP-Phat outperforms the TDE-based GCC-Phat in terms of robustness and estimation accuracy in noisy and reverberant environments. Furthermore, this thesis investigates the recently proposed joint position and pitch estimation method known as PoPi. Speaker tracking applications can benefit from pitch as an additional feature, because it makes it possible to distinguish between multiple speakers in an acoustic scene. Performance comparisons show that the PoPi method is less sensitive to noise sources and therefore well suited to typical speech environments. To improve the performance of the PoPi method in multi-speaker scenarios in realistic environments, two modifications (PoPi-Phat and PoPi-filter) are proposed in this work. The proposed PoPi-filter method, which uses a preprocessing module, shows very promising estimation results in terms of robustness and multi-speaker performance.
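
As an illustration of the pairwise TDE idea mentioned above, the following is a minimal sketch of the textbook GCC-Phat delay estimate for a single microphone pair. It is not the implementation used in the thesis; the function name gcc_phat, its parameters, and the synthetic test signal are assumptions chosen for this example only.

```python
import numpy as np


def gcc_phat(sig, ref, fs, max_tau=None):
    """Estimate the delay (in seconds) of `sig` relative to `ref`.

    Hypothetical helper for illustration: generalized cross-correlation
    with phase transform (PHAT) weighting for one microphone pair.
    """
    n = len(sig) + len(ref)              # zero-pad to avoid circular wrap-around
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)
    cross = SIG * np.conj(REF)           # cross-power spectrum
    cross /= np.abs(cross) + 1e-12       # PHAT weighting: keep the phase only
    cc = np.fft.irfft(cross, n=n)        # generalized cross-correlation

    max_shift = n // 2
    if max_tau is not None:
        # Restrict the search to physically possible lags (at least 1 sample).
        max_shift = max(1, min(int(fs * max_tau), max_shift))

    # Reorder so that lags run from -max_shift to +max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(cc)) - max_shift
    return shift / fs


# Synthetic check: white noise delayed by 12 samples at an assumed 16 kHz rate.
fs = 16000
rng = np.random.default_rng(0)
ref = rng.standard_normal(4096)
delay = 12
sig = np.concatenate((np.zeros(delay), ref[:-delay]))
tau = gcc_phat(sig, ref, fs)
print(f"estimated delay: {tau * fs:.0f} samples")
```

For a microphone pair with spacing d, max_tau can be set to d divided by the speed of sound so that only physically possible delays are searched. A steered approach such as SRP-Phat instead accumulates these PHAT-weighted correlations over all microphone pairs for each candidate source position.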