Speech repairs, intonational phrases, and discourse markers: Modeling speakers' utterances in spoken dialogue

Peter Heeman, James F. Allen

Research output: Contribution to journal › Article

82 Citations (Scopus)

Abstract

Interactive spoken dialogue provides many new challenges for natural language understanding systems. One of the most critical challenges is simply determining the speaker's intended utterances: both segmenting a speaker's turn into utterances and determining the intended words in each utterance. Even assuming perfect word recognition, the latter problem is complicated by the occurrence of speech repairs, which occur where speakers go back and change (or repeat) something they just said. The words that are replaced or repeated are no longer part of the intended utterance, and so need to be identified. Segmenting turns and resolving repairs are strongly intertwined with a third task: identifying discourse markers. Because of the interactions, and interactions with POS tagging and speech recognition, we need to address these tasks together and early on in the processing stream. This paper presents a statistical language model in which we redefine the speech recognition problem so that it includes the identification of POS tags, discourse markers, speech repairs, and intonational phrases. By solving these simultaneously, we obtain better results on each task than addressing them separately. Our model is able to identify 72% of turn-internal intonational boundaries with a precision of 71%, 97% of discourse markers with 96% precision, and detect and correct 66% of repairs with 74% precision.
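The figures at the end of the abstract pair a detection rate with a precision. The detection rate ("identify 72% of ...") reads most naturally as recall. The sketch below shows how the two measures are conventionally computed from counts. It is a minimal illustration, not the authors' evaluation code: the function name `recall_precision` and the counts in the usage line are invented for the example and do not come from the paper.

```python
from typing import Tuple


def recall_precision(num_correct: int, num_reference: int,
                     num_hypothesized: int) -> Tuple[float, float]:
    """Return (recall, precision) for a detection task.

    num_correct      -- hypothesized events that match a reference event
    num_reference    -- events present in the hand-annotated reference
    num_hypothesized -- events the model proposed
    All three counts are hypothetical inputs for illustration.
    """
    if num_reference <= 0 or num_hypothesized <= 0:
        raise ValueError("reference and hypothesized counts must be positive")
    recall = num_correct / num_reference
    precision = num_correct / num_hypothesized
    return recall, precision


# Made-up counts: 200 annotated events, 150 proposed, 120 of them correct.
r, p = recall_precision(120, 200, 150)
print(f"recall={r:.2f} precision={p:.2f}")  # recall=0.60 precision=0.80
```

Under this reading, a model can trade one measure against the other by proposing more or fewer events, which is why the abstract reports both for each task.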

Original language: English (US)
Pages (from-to): 527-571
Number of pages: 45
Journal: Computational Linguistics
Volume: 25
Issue number: 4
State: Published - Dec 1999

Fingerprint

Repair
dialogue
discourse
Speech recognition
interaction
language
Intonational Phrase
Utterance
Modeling
Discourse Markers
Processing

ASJC Scopus subject areas

  • Language and Linguistics
  • Computational Theory and Mathematics
  • Computer Science Applications
  • Linguistics and Language

Cite this

Speech repairs, intonational phrases, and discourse markers: Modeling speakers' utterances in spoken dialogue. / Heeman, Peter; Allen, James F.

In: Computational Linguistics, Vol. 25, No. 4, Dec 1999, p. 527-571.

@article{bdf92aa745e249739266fe96fa8dbb00,
title = "Speech repairs, intonational phrases, and discourse markers: Modeling speakers' utterances in spoken dialogue",
abstract = "Interactive spoken dialogue provides many new challenges for natural language understanding systems. One of the most critical challenges is simply determining the speaker's intended utterances: both segmenting a speaker's turn into utterances and determining the intended words in each utterance. Even assuming perfect word recognition, the latter problem is complicated by the occurrence of speech repairs, which occur where speakers go back and change (or repeat) something they just said. The words that are replaced or repeated are no longer part of the intended utterance, and so need to be identified. Segmenting turns and resolving repairs are strongly intertwined with a third task: identifying discourse markers. Because of the interactions, and interactions with POS tagging and speech recognition, we need to address these tasks together and early on in the processing stream. This paper presents a statistical language model in which we redefine the speech recognition problem so that it includes the identification of POS tags, discourse markers, speech repairs, and intonational phrases. By solving these simultaneously, we obtain better results on each task than addressing them separately. Our model is able to identify 72{\%} of turn-internal intonational boundaries with a precision of 71{\%}, 97{\%} of discourse markers with 96{\%} precision, and detect and correct 66{\%} of repairs with 74{\%} precision.",
author = "Heeman, Peter and Allen, James F.",
year = "1999",
month = dec,
language = "English (US)",
volume = "25",
pages = "527--571",
journal = "Computational Linguistics",
issn = "0891-2017",
publisher = "MIT Press Journals",
number = "4",
}

TY - JOUR
T1 - Speech repairs, intonational phrases, and discourse markers
T2 - Modeling speakers' utterances in spoken dialogue
AU - Heeman, Peter
AU - Allen, James F.
PY - 1999/12
Y1 - 1999/12
AB - Interactive spoken dialogue provides many new challenges for natural language understanding systems. One of the most critical challenges is simply determining the speaker's intended utterances: both segmenting a speaker's turn into utterances and determining the intended words in each utterance. Even assuming perfect word recognition, the latter problem is complicated by the occurrence of speech repairs, which occur where speakers go back and change (or repeat) something they just said. The words that are replaced or repeated are no longer part of the intended utterance, and so need to be identified. Segmenting turns and resolving repairs are strongly intertwined with a third task: identifying discourse markers. Because of the interactions, and interactions with POS tagging and speech recognition, we need to address these tasks together and early on in the processing stream. This paper presents a statistical language model in which we redefine the speech recognition problem so that it includes the identification of POS tags, discourse markers, speech repairs, and intonational phrases. By solving these simultaneously, we obtain better results on each task than addressing them separately. Our model is able to identify 72% of turn-internal intonational boundaries with a precision of 71%, 97% of discourse markers with 96% precision, and detect and correct 66% of repairs with 74% precision.

UR - http://www.scopus.com/inward/record.url?scp=0040958578&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0040958578&partnerID=8YFLogxK
M3 - Article
AN - SCOPUS:0040958578
VL - 25
SP - 527
EP - 571
JO - Computational Linguistics
JF - Computational Linguistics
SN - 0891-2017
IS - 4
ER -