GRASS - Symbols for Orthographic Transcriptions
This article is part of GRASS: the Graz corpus of Read And Spontaneous Speech.
Symbols used for the orthographic transcriptions and their assigned lexica:
- ADABA: Lexicon of Austrian German
- ERG: Lexicon with additional German words
- DIAL: Lexicon with dialect words
- PART: List of small particles
- FSP: Foreign words
- MWEX: Multi-word expressions
Lexical Item | Example | Lexicon |
---|---|---|
Standard Austrian German words | ich gehe von zu Hause weg | ERG |
Dialect words | < * DIAL > Kretzn | DIAL |
High-frequency multi-word expressions | ja geh bitte | MWEX |
Spelling of letters | $G $K $K | – |
Abbreviations, letters not spoken separately | UNI | ERG |
Proper names of people, places, etc. | Sankt Michael | ERG |
Numbers not written with digits | #einhundertdreizehn | ERG |
Neologisms, invented by the speaker | Genussvermeider | ERG |
Foreign words | | FSP |
Hesitations and disfluencies | Example | Lexicon |
Repetition: word produced more than once | und dann hat \+ hat \+ er | |
Repetition: word group produced more than once | + \und dann \+ + \und dann \+ hat er | |
Slip of the tongue | kervehrt\v | PART |
Broken word | gebra\ | PART |
Other types of speech and non-speech | Example | Lexicon |
Imitation of accent or other person | und\i was\i hast\i du\i | |
Imitation of an animal, vehicle, etc. | tschu \L tschu \L | PART |
Whispering of an utterance | er hat eh \F schon \F wissen \F | |
Non-speech produced by the speakers’ vocal folds | <laughter>, <singing>, <sigh>, <cough>, <smack>, <breathingIN>, <breathingOUT> | |
Non-speech noise while producing a word | <laughter>und <laughter>dann hat er | |
Non-speech other than mentioned above | <noise> | |
Overlapping speech of two speakers | \\ja, hm, ja das \\ | |
Overlapping non-speech of two speakers | \\<laughter>\\ | |
Artifacts in the recordings | <# artefact> | |
Other noises not covered by the symbols above | <# noise> | |
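
The conventions in the table can be checked mechanically. The following is a minimal Python sketch, not part of the GRASS tooling: the function names `classify` and `label_tokens` and all label strings are hypothetical, and the tokenisation rule (angle-bracket tags may contain spaces, everything else is split on whitespace) is an assumption. It covers only the markers whose form is unambiguous in the table (`$`, `#`, `\v`, `\i`, a trailing `\`, and `<...>` tags). Repetition, imitation of sounds, whispering and overlap markers (`\+`, `\L`, `\F`, `\\`) are only flagged as `other_marker`, because their exact spacing cannot be recovered from the examples.

```python
from __future__ import annotations

import re

# Assumed tokenisation: an angle-bracket tag (which may contain spaces,
# e.g. "<# noise>"), optionally glued to a following word, or any run of
# non-whitespace characters.
TOKEN_RE = re.compile(r"<[^>]*>\S*|\S+")


def classify(token: str) -> str:
    """Return a coarse, hypothetical label for one transcription token."""
    if token.startswith("<"):
        close = token.find(">")
        if close == -1:
            return "word"                   # stray "<" without a closing ">"
        inner = token[1:close].strip()
        rest = token[close + 1:]
        if rest:
            return "word_with_nonspeech"    # e.g. <laughter>und
        if inner.startswith("#"):
            return "recording_noise"        # e.g. <# noise>, <# artefact>
        if inner.startswith("*"):
            return "lexicon_tag"            # e.g. < * DIAL >
        return "nonspeech"                  # e.g. <laughter>, <sigh>
    if token.startswith("$"):
        return "spelled_letter"             # e.g. $G
    if token.startswith("#"):
        return "number"                     # e.g. #einhundertdreizehn
    if token.endswith("\\v"):
        return "slip_of_tongue"             # e.g. kervehrt\v
    if token.endswith("\\i"):
        return "imitation"                  # e.g. und\i
    if token.endswith("\\") and not token.endswith("\\\\"):
        return "broken_word"                # e.g. gebra\
    if "\\" in token:
        return "other_marker"               # \+, \L, \F, \\ are not interpreted
    return "word"


def label_tokens(line: str) -> list[tuple[str, str]]:
    """Split a transcription line into tokens and label each one."""
    return [(tok, classify(tok)) for tok in TOKEN_RE.findall(line)]
```

For example, `label_tokens("$G $K $K")` labels all three tokens as `spelled_letter`, and `label_tokens("< * DIAL > Kretzn")` yields a `lexicon_tag` followed by a `word`. The sketch does not look words up in any of the lexica; it only separates symbols from plain words.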