Signal Processing and Speech Communication Laboratory

Distribution and Timing of Verbal Backchannels in Conversational Speech: A Quantitative Study

Published: Mon, Sep 01, 2025
Tags: rotm
Contact: rotm 09 2025

Human communication is a remarkably coordinated activity. Successful interaction relies not only on the words themselves, but also on how they are said, on subtle cues, and on fine-grained timing between conversational partners. Speakers continuously adjust to each other in real time, drawing on prosody, gestures and context as well as linguistic content. This dynamic behavior is especially evident in spontaneous conversation, where speakers rarely plan their turns in advance but instead co-construct utterances on the fly, together with their interlocutors. Understanding how this coordination unfolds, particularly in turn-taking, remains a central challenge for speech scientists and technologists who aim to understand and model human speech processing in spontaneous conversation.

This paper explores backchannels, short listener responses such as “mhm” that play an important role in managing turn-taking and grounding in spontaneous conversation. The study investigates whether and when backchannels occur, taking into account both the prosodic characteristics and the communicative functions of the interlocutor’s speech preceding them. Using a corpus of spontaneous dyadic conversations in Austrian German annotated with continuous turn-taking labels, we analyze the distribution of backchannels across different turn-taking contexts and examine which acoustic features affect their occurrence and timing. Our findings show that the turn-taking function of the interlocutor’s utterance is a significant predictor of whether a backchannel occurs: backchannels tend to occur most frequently after longer and syntactically complete utterances by the interlocutor. Moreover, prosodic features such as utterance duration, articulation rate variability, and rising or falling intensity affect the timing of listener responses, with significant differences across turn-taking functions. These results highlight the value of continuous turn-taking annotations for investigating conversational dynamics and demonstrate how turn-taking function and prosody jointly shape backchannel behavior in spontaneous conversation.

The paper was published in the Special Issue “Current Trends in Discourse Marker Research” of the MDPI journal Languages. If you are interested, have a look here.
