Research in Language Technologies at our lab has two foci. The first is mainly motivated by challenges in the back-end processing of automatic speech recognition, such as language modelling, modelling pronunciation variation, and text alignment. Our text analysis methods comprise customizable phonetic and semantic similarity measures, which have been evaluated on large industrial and scientific text collections with respect to large-vocabulary continuous speech recognition (dictation). In close cooperation with our speech synthesis efforts, we have investigated models for dialect transformations at both the sentence level, using grammatical transformations, and the word level, using pronunciation transformations. More recently, we have begun investigating language modelling methods for conversational speech, a speaking style which comes with the additional challenges of large intra- and inter-speaker variation and only small amounts of available data.
The second focus within language technologies is motivated by challenges in the field of digital humanities. For instance, we contribute methods from graph theory, network analysis, and sentiment analysis to the study of theatre plays and audio books, as well as to the field of gender studies.