Signal Processing and Speech Communication Laboratory

SPARC - Semantic and Phonetic Automatic ReConstruction of Medical Dictations

2005 — 2006
FIT-IT Semantic Systems, FFG, Project Nr. FIT-IT-809 258

The SPARC (Semantic and Phonetic Automatic ReConstruction) project aims to automatically reconstruct the original wording of a medical dictation from two sources: its formatted, corrected written form and the error-prone output of a speech recogniser. Neither text alone is normally sufficient to obtain a literal transcription, since the written report may contain reformulations of the original utterance and the recogniser output may contain misrecognitions. The SPARC approach combines both resources and performs a semantic and phonetic analysis of the texts to resolve the mismatches between them. In this way, the large available corpora of dictation audio recordings, draft recognitions, and corresponding final medical reports can be used to improve current text production systems based on automatic speech recognition. SPARC is also expected to give insights into the processes involved in the manual transcription of dictations, which may allow further automation in large-scale text production environments.

SPARC is a cooperation between Philips Speech Recognition Systems, the Austrian Research Institute for Artificial Intelligence (OFAI), and SPSC.

The task of SPSC in SPARC is to develop a phonetic similarity measure that allows the robust detection of corrections and reformulations on a phonetic level in a final medical report, given the recognised text. Current similarity measures do not take into account the special features of dictations, such as the reductions and assimilations that occur in fast speech. To model these phenomena appropriately, the new similarity measure will be trained on the available data with machine learning techniques. Because the measure is learned from data rather than hand-crafted, it should also be easier to apply to other languages and domains.
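
To illustrate the general idea of a phonetic similarity measure that tolerates fast-speech phenomena, the following is a minimal sketch and not the SPARC implementation. It assumes a weighted edit distance over phoneme sequences, in which deleting an easily reduced phoneme or substituting an acoustically close one costs less than an arbitrary edit. The cost tables `REDUCIBLE` and `SIMILAR`, the phoneme symbols, and the function names are all hypothetical and hand-picked here; in the project, such weights are meant to be learned from the available data.

```python
from typing import Sequence

# Hypothetical, hand-picked costs for illustration only.
# In a trained measure these weights would be estimated from data.
REDUCIBLE = {"@": 0.3, "t": 0.5, "d": 0.5}  # phonemes often dropped in fast speech
SIMILAR = {
    frozenset(("n", "m")): 0.4,  # e.g. place assimilation before a labial
    frozenset(("t", "d")): 0.4,
    frozenset(("s", "z")): 0.4,
}


def sub_cost(a: str, b: str) -> float:
    """Cost of substituting phoneme a by phoneme b (at most 1.0)."""
    if a == b:
        return 0.0
    return SIMILAR.get(frozenset((a, b)), 1.0)


def indel_cost(p: str) -> float:
    """Cost of inserting or deleting phoneme p (at most 1.0)."""
    return REDUCIBLE.get(p, 1.0)


def phonetic_distance(x: Sequence[str], y: Sequence[str]) -> float:
    """Weighted edit distance between two phoneme sequences."""
    n, m = len(x), len(y)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = d[i - 1][0] + indel_cost(x[i - 1])
    for j in range(1, m + 1):
        d[0][j] = d[0][j - 1] + indel_cost(y[j - 1])
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j] + indel_cost(x[i - 1]),                # delete from x
                d[i][j - 1] + indel_cost(y[j - 1]),                # insert from y
                d[i - 1][j - 1] + sub_cost(x[i - 1], y[j - 1]),    # match/substitute
            )
    return d[n][m]


def phonetic_similarity(x: Sequence[str], y: Sequence[str]) -> float:
    """Map the distance to a score in [0, 1]; 1.0 means identical."""
    longest = max(len(x), len(y))
    if longest == 0:
        return 1.0
    return 1.0 - phonetic_distance(x, y) / longest


# Canonical vs. reduced pronunciation (simplified phoneme symbols).
canonical = ["h", "a", "n", "d", "b", "a", "g"]
reduced = ["h", "a", "m", "b", "a", "g"]
unrelated = ["k", "a", "t"]

print(phonetic_similarity(canonical, reduced))
print(phonetic_similarity(canonical, unrelated))
```

With these assumed weights, the reduced pronunciation scores as much closer to the canonical form than an unrelated word does, which is the behaviour a plain, unweighted edit distance would capture less well.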