Signal Processing and Speech Communication Laboratory

GRASS – Orthographic Transcription and Forced Alignment

This article is part of GRASS: the Graz corpus of Read And Spontaneous Speech.

Transcription Protocol

  • Praat, separate tiers, short chunks of at most 6 s
  • hesitations, repetitions and disfluencies
  • laughter, breathing, smacking, singing, etc.
  • foreign and dialect words
  • manner of articulation, e.g., whispered, laughed speech
  • overlapping speech

Here you can find a complete set of symbols used for the creation of the orthographic transcriptions.
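
The 6 s chunk limit from the protocol can be checked mechanically once the intervals of a tier are available as start time, end time and text. The following is a minimal sketch under that assumption; the `Interval` type, the `overlong_chunks` function and the sample values are hypothetical and not part of the GRASS tooling, and reading the intervals from Praat files is left out.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    """One transcription chunk on a tier (hypothetical representation)."""
    start: float  # seconds
    end: float    # seconds
    text: str     # orthographic transcription of the chunk


MAX_CHUNK_SECONDS = 6.0  # limit stated in the transcription protocol


def overlong_chunks(intervals: List[Interval],
                    max_seconds: float = MAX_CHUNK_SECONDS) -> List[Interval]:
    """Return the non-empty chunks whose duration exceeds max_seconds."""
    return [
        iv for iv in intervals
        if iv.text.strip() and (iv.end - iv.start) > max_seconds
    ]


# Made-up sample tier: one chunk is too long, one is exactly at the limit,
# and one long interval is empty (silence) and therefore ignored.
tier = [
    Interval(0.0, 4.5, "ja genau"),
    Interval(4.5, 11.0, "und dann"),   # 6.5 s, too long
    Interval(11.0, 17.0, "na ja"),     # exactly 6.0 s, allowed
    Interval(17.0, 30.0, ""),          # empty interval, not a chunk
]

for iv in overlong_chunks(tier):
    print(f"chunk at {iv.start:.1f} s lasts {iv.end - iv.start:.1f} s: {iv.text!r}")
```

Empty intervals are skipped here on the assumption that they mark pauses rather than transcribed chunks; drop the `iv.text.strip()` condition if that does not hold for your data.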


Transcription Procedure

  1. Six transcribers participated in a training workshop
  2. Then, they transcribed one conversation
  3. Second workshop: mutual correction of transcription
  4. Transcription of other conversations
  5. Correction by one transcriber other than the one who made the first transcription

Throughout the transcription process, the transcribers kept adding to a transcription protocol and to a lexicon (for the spelling of non-standard words, particles and non-lexical items), which they shared online among themselves.

Forced alignments were done with Kaldi, fine-tuned on the corpus data. The alignments are annotated on three levels: the utterance level, the word level, and the phone level (in SAMPA for German).
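
To illustrate how the three annotation levels relate to each other, the sketch below models an utterance that contains words, which in turn contain phones, and checks that each level is nested in time inside the one above it. It is only an illustration: the class names, the `is_consistent` check, the timings and the sample SAMPA labels are assumptions, not the format in which the GRASS alignments are stored.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class Segment:
    """A labelled time span in seconds (used here for phones)."""
    label: str
    start: float
    end: float


@dataclass
class Word(Segment):
    phones: List[Segment] = field(default_factory=list)


@dataclass
class Utterance(Segment):
    words: List[Word] = field(default_factory=list)


def is_nested(parent: Segment, children: Sequence[Segment],
              tol: float = 1e-6) -> bool:
    """True if children are ordered, non-overlapping and inside parent."""
    prev_end = parent.start
    for child in children:
        if child.end < child.start:
            return False
        if child.start < prev_end - tol or child.end > parent.end + tol:
            return False
        prev_end = child.end
    return True


def is_consistent(utt: Utterance) -> bool:
    """Check word-in-utterance and phone-in-word nesting."""
    return is_nested(utt, utt.words) and all(
        is_nested(word, word.phones) for word in utt.words
    )


# Made-up example with illustrative SAMPA labels and timings.
example = Utterance(
    "ja genau", 0.0, 0.9,
    words=[
        Word("ja", 0.0, 0.3, phones=[
            Segment("j", 0.0, 0.1),
            Segment("a:", 0.1, 0.3),
        ]),
        Word("genau", 0.35, 0.9, phones=[
            Segment("g", 0.35, 0.45),
            Segment("@", 0.45, 0.5),
            Segment("n", 0.5, 0.6),
            Segment("aU", 0.6, 0.9),
        ]),
    ],
)

print(is_consistent(example))
```

A check of this kind is a cheap way to catch boundary errors after converting alignments between formats, since a phone that extends past its word or a word that extends past its utterance usually points to a conversion problem.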


Further Reading on GRASS