A Machine-Learning Approach to Recognition of Spoken German Variants

Project Type: Master/Diploma Thesis
Student: Dizarevic Vodran

 

A system for recognizing regional variants of German is implemented in MATLAB. Two classes are defined: Austrian German and German German. The system is intended to classify a spontaneously spoken German utterance automatically into one of these two classes, using only suprasegmental, prosodic information. The process is structured in three steps, each reflected in an independent module of the system. First, the fundamental frequency is extracted from the spoken utterance using the YIN pitch extraction algorithm. Second, the fundamental frequency contour is parameterized by calculating 28 features per utterance; these fall into three main categories: Fujisaki parameters, mean log intervals, and percentiles. Third, the utterance is classified on the basis of these features. A large database of Austrian German and German German, containing 1000 speakers per class, is used to obtain a large number of training and test parameter sets. The best cross-validated recognition rate is 69%, obtained using neural networks for classification with a set of 18 selected features. This is no improvement over earlier work by Hagmueller, who, however, used only 10 speakers for training and testing.
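The parameterization step can be illustrated with a minimal sketch. It is written in Python rather than MATLAB and is not the thesis implementation: the function names, the 100 Hz reference, the choice of the 10th, 50th and 90th percentiles, and the definition of the interval feature (mean absolute semitone step between consecutive voiced frames) are all assumptions made for illustration, since the abstract does not define the 28 features in detail. The Fujisaki parameters are not covered here.

```python
import math


def percentile(values, q):
    """Percentile of `values` for q in [0, 100], using linear interpolation."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("need at least one value")
    pos = (len(ordered) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return ordered[lo] * (1.0 - frac) + ordered[hi] * frac


def contour_features(f0_hz, ref_hz=100.0):
    """Summarise an F0 contour given as Hz per frame (0 marks an unvoiced frame).

    Hypothetical feature set, for illustration only.
    """
    # Keep voiced frames and convert to semitones relative to ref_hz,
    # so that intervals are measured on a logarithmic scale.
    semitones = [12.0 * math.log2(f / ref_hz) for f in f0_hz if f > 0]
    if len(semitones) < 2:
        raise ValueError("need at least two voiced frames")

    # Simplification: steps are taken between successive voiced frames,
    # so a step may span an unvoiced gap.
    steps = [abs(b - a) for a, b in zip(semitones, semitones[1:])]

    return {
        "p10": percentile(semitones, 10),
        "p50": percentile(semitones, 50),
        "p90": percentile(semitones, 90),
        "mean_abs_interval": sum(steps) / len(steps),
    }


if __name__ == "__main__":
    # Toy contour: 100 Hz, one unvoiced frame, then 200 Hz and 400 Hz.
    print(contour_features([100.0, 0.0, 200.0, 400.0]))
```

In a pipeline of the kind described above, a feature dictionary like this would be computed once per utterance and passed to the classifier; working in semitones makes the features less dependent on a speaker's absolute pitch range than raw Hz values would be.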