GRASS - Symbols for Orthographic Transcriptions
This article is part of GRASS: the Graz corpus of Read And Spontaneous Speech.
Symbols used for the orthographic transcriptions and their assigned lexica:
- ADABA: Lexicon of Austrian German
- ERG: Lexicon with additional German words
- DIAL: Lexicon with dialect words
- ForeignWords: foreign words
- BrokenWords: broken words
- SpellingAlphabet: spelled letters
- MultiWordExpressions: multi-word expressions

Lexical Item | Example | Lexicon |
---|---|---|
Standard Austrian German words | ich gehe von zu Hause weg | ERG |
Dialect words | <*DIAL>Kretzn | DIAL |
High-frequency multi-word expressions with special pronunciation | wenn_du | MultiWordExpressions |
Spelling of letters | $G $K $K | SpellingAlphabet |
Proper names of people, places, etc. | Sankt Michael | ERG |
Numbers not written with digits | #einhundertdreizehn | ERG |
Neologisms, invented by the speaker | Genussvermeider | ERG |
Foreign words | | ForeignWords |

Hesitations and disfluencies | Example | Lexicon |
---|---|---|
Repetition: word (group) produced more than once | und dann (+ hat + hat +) er | |
| | (+ und dann + und dann +) hat er | |
Slip of the tongue | <&s>kervehrt | |
Misbuilt grammar | du <&m>kriegt | |
Broken word | <&b>gebra | BrokenWords |

Other types of speech and non-speech | Example | |
---|---|---|
Imitation of accent or other person | <&i>und <&i>was <&i>hast <&i>du | |
Onomatopoeia | <&o>tschu <&o>tschu | |
Whispered words | er hat eh <&w>schon <&w>wissen | |
Non-speech produced by the speakers’ vocal folds | <laughter>, <singing>, <sigh>, <cough>, <smack>, <breathingIN>, <breathingOUT> | |
Laughed words | <&L>und <&L>dann <&L>hat <&L>er | |
Non-speech other than that mentioned above | <noise> | |
Overlapping speech of two speakers | [ ja, hm, ja das ] | |
Artifacts in the recordings | <#artefact> | |
Other noises not covered by the symbols above | <#noise> | |
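
The symbols listed above lend themselves to simple automatic processing. The following is a minimal sketch in Python of how a transcription line might be split into labelled tokens. It is not part of the GRASS tooling: the function names `classify_token` and `classify`, the category labels, and the assumption that tokens are separated by whitespace are all hypothetical choices made for illustration. Only the markers themselves are taken from the tables.

```python
# Prefix markers copied from the tables; the labels on the right are
# illustrative names chosen for this sketch, not official GRASS categories.
PREFIX_LABELS = {
    "<*DIAL>": "dialect",
    "<&s>": "slip_of_tongue",
    "<&m>": "misbuilt_grammar",
    "<&b>": "broken_word",
    "<&i>": "imitation",
    "<&o>": "onomatopoeia",
    "<&w>": "whispered",
    "<&L>": "laughed",
}

# Stand-alone tags that mark non-speech events or noise.
NON_SPEECH = {
    "<laughter>", "<singing>", "<sigh>", "<cough>", "<smack>",
    "<breathingIN>", "<breathingOUT>", "<noise>", "<#artefact>", "<#noise>",
}

REPETITION_MARKERS = {"(+", "+", "+)"}
OVERLAP_MARKERS = {"[", "]"}


def classify_token(token):
    """Return a (label, text) pair for one whitespace-separated token."""
    if token in NON_SPEECH:
        return ("non_speech", token)
    if token in REPETITION_MARKERS:
        return ("repetition_marker", token)
    if token in OVERLAP_MARKERS:
        return ("overlap_marker", token)
    for prefix, label in PREFIX_LABELS.items():
        if token.startswith(prefix):
            return (label, token[len(prefix):])
    if token.startswith("$"):
        return ("spelled_letter", token[1:])
    if token.startswith("#"):
        return ("number", token[1:])
    if "_" in token:
        return ("multi_word", token)
    return ("word", token)


def classify(transcription):
    """Classify every token of a transcription line (assumes whitespace tokenization)."""
    return [classify_token(token) for token in transcription.split()]


if __name__ == "__main__":
    for label, text in classify("und dann (+ hat + hat +) er <laughter>"):
        print(label, text)
```

A lookup table of prefixes keeps the sketch easy to extend if further markers are needed. Punctuation inside overlap brackets, as in `[ ja, hm, ja das ]`, is not handled here and would stay attached to the word.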