GRASS: the Graz corpus of Read And Spontaneous Speech

We present the first large-scale speech database for Austrian German:

  • 38 speakers, male and female, from different social and regional backgrounds
  • read speech
    2 744 utterances, 19 510 words
  • read and elicited commands
    1 710 utterances, 3 853 words
  • spontaneous conversations
    48 960 utterances, 276 000 words
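
To put the three components in proportion, the figures listed above can be combined into totals and per-component averages. The following is a minimal sketch in Python; the names COMPONENTS, totals, words_per_utterance and word_share are illustrative and not part of any GRASS tooling, and the spontaneous word count may be a rounded figure, so the derived values should be read as approximate.

```python
# Corpus component sizes as listed above: (utterances, words).
# The dictionary and function names are hypothetical, chosen for illustration.
COMPONENTS = {
    "read speech": (2_744, 19_510),
    "read and elicited commands": (1_710, 3_853),
    "spontaneous conversations": (48_960, 276_000),
}


def totals(components):
    """Return (total utterances, total words) over all components."""
    utterances = sum(u for u, _ in components.values())
    words = sum(w for _, w in components.values())
    return utterances, words


def words_per_utterance(components):
    """Return the mean number of words per utterance for each component."""
    return {name: w / u for name, (u, w) in components.items()}


def word_share(components):
    """Return each component's fraction of the total word count."""
    _, total_words = totals(components)
    return {name: w / total_words for name, (_, w) in components.items()}


if __name__ == "__main__":
    n_utt, n_words = totals(COMPONENTS)
    print(f"total: {n_utt} utterances, {n_words} words")
    for name, avg in words_per_utterance(COMPONENTS).items():
        share = word_share(COMPONENTS)[name]
        print(f"{name}: {avg:.2f} words/utterance, {share:.1%} of all words")
```

On these figures, spontaneous conversations account for the large majority of the words in the corpus, while the command utterances are the shortest on average.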

GRASS is designed for linguistic and phonetic studies and for the development of an ASR system:

  • high-quality super-wideband recordings
    > simulation of different acoustic environments
  • detailed orthographic transcriptions
    > further (semi-)automatic annotation layers
  • sufficient read speech and commands
    > for ASR and dialogue systems
  • sufficient spontaneous speech
    > pronunciation modeling for ASR

SPSC Cross Reference

Research Area: Speech
Acronym: GRASS
Language: German
Native: Native
Gender: Female, Male
Content Type: Read Speech, Spontaneous Speech, Conversational Speech
Segmentation Level: Word, Utterance
Segmentation Method: Automatic, Manual
Laryngograph: Yes
Number of speakers: 38
Sampling Frequency: 48 kHz
Multi-channel: Yes
Number of channels: 5
Condition: Clean
Type: Database