Implementation of an Acoustic Joint Direction and Fundamental Frequency Estimator Based on Nonlinear Least Squares Methods

Project Type: Student Project
Student: Mattia Gabbrielli

jointdetection.png

Description

A distant speech recognition system is fundamental for voice-operated ambient assisted living facilities. A common purpose of such a system is to capture the wave field, locate a target, e.g., a speaker in a crowded room, set a focus (a virtual beam) on it, and enhance its speech. The same applies to close-talking systems, e.g., a speaker using a hand-held device in a crowded public transport vehicle.

In both cases, the system consists of several modules, one of which locates targets acoustically. In addition to estimating the location of a target, we would like to jointly estimate other characteristics of the target, e.g., a speaker’s fundamental frequency during voiced utterances. This enables us to estimate the location more precisely and to separate targets in a multi-target environment more accurately. Moreover, we can feed the fundamental frequency into, e.g., a word recognizer to increase the word accuracy rate.

We are currently working on distant- and close-talking speech recognition applications, where joint estimation of a target’s position and its fundamental frequency is an indispensable tool. So far, we have introduced several algorithms to jointly locate a target and estimate its fundamental frequency. We need to implement the algorithm named in the title so that we can carry out more comprehensive performance evaluations and apply more suitable algorithms to specific problems.

Tasks

  • Literature review of the joint direction and fundamental frequency estimator.
  • Implementation in MATLAB.
  • Evaluation of its performance (speech database and speech recognizer provided).
  • Report on the estimator, experiments, and findings (in English).

Your Profile / Requirements

The candidate should be interested in literature reviews (papers), spatial filtering (acoustic beamforming and target localization), digital linear and nonlinear signal processing, statistical signal processing, and MATLAB programming.

References 

Jensen, J. R., Christensen, M. G., Jensen, S. H., “Nonlinear Least Squares Methods for Joint DOA and Pitch Estimation,” IEEE Transactions on Audio, Speech, and Language Processing, 21(5):923-933, May 2013.

Info

If you would like to write a Master's thesis on this topic, please contact Hannes Pessentheiner.
