Determining the relevance of different aspects of formant contours to intelligibility

Akiko Amano-Kusumoto, John Paul Hosom, Alexander Kain, Justin M. Aronoff

Research output: Contribution to journal › Article

3 Citations (Scopus)

Abstract

Previous studies have shown that "clear" speech, in which the speaker intentionally tries to enunciate, is more intelligible than "conversational" speech, which is produced in regular conversation. However, conversational and clear speech vary along a number of acoustic dimensions, and it is unclear which aspects of clear speech lead to better intelligibility. Previously, Kain et al. (2008) showed that a combination of short-term spectra and duration was responsible for the improved intelligibility of one speaker. This study investigates subsets of specific features of short-term spectra, including temporal aspects. As in Kain et al.'s study, hybrid stimuli were synthesized with a combination of features from clear speech and complementary features from conversational speech to determine which acoustic features cause the improved intelligibility of clear speech. Our results indicate that, although steady-state formant values of tense vowels contributed to the intelligibility of clear speech, neither the steady-state portion nor the formant transition was sufficient to yield intelligibility comparable to that of clear speech. In contrast, when the entire formant contour of conversational speech, including the phoneme duration, was replaced by that of clear speech, intelligibility was comparable to that of clear speech. This indicates that the combination of formant contour and duration information is relevant to the improved intelligibility of clear speech. The study provides a better understanding of the relevance of different aspects of formant contours to the improved intelligibility of clear speech.

Original language: English (US)
Pages (from-to): 1-9
Number of pages: 9
Journal: Speech Communication
Volume: 59
DOI: 10.1016/j.specom.2013.12.001
State: Published - Apr 2014

Fingerprint

Speech intelligibility
Speech
Relevance
Formants
Intelligibility
Acoustics

Keywords

  • Speech intelligibility
  • Speech synthesis
  • Vowel perception

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language
  • Communication
  • Software
  • Computer Vision and Pattern Recognition
  • Computer Science Applications
  • Modeling and Simulation

Cite this

Determining the relevance of different aspects of formant contours to intelligibility. / Amano-Kusumoto, Akiko; Hosom, John Paul; Kain, Alexander; Aronoff, Justin M.

In: Speech Communication, Vol. 59, 04.2014, p. 1-9.

Amano-Kusumoto, Akiko ; Hosom, John Paul ; Kain, Alexander ; Aronoff, Justin M. / Determining the relevance of different aspects of formant contours to intelligibility. In: Speech Communication. 2014 ; Vol. 59. pp. 1-9.
@article{5f760d38546f4c00a9f8a23c3148ed99,
title = "Determining the relevance of different aspects of formant contours to intelligibility",
abstract = "Previous studies have shown that {"}clear{"} speech, where the speaker intentionally tries to enunciate, has better intelligibility than {"}conversational{"} speech, which is produced in regular conversation. However, conversational and clear speech vary along a number of acoustic dimensions and it is unclear what aspects of clear speech lead to better intelligibility. Previously, Kain et al. (2008) showed that a combination of short-term spectra and duration was responsible for the improved intelligibility of one speaker. This study investigates subsets of specific features of short-term spectra including temporal aspects. Similar to Kain's study, hybrid stimuli were synthesized with a combination of features from clear speech and complementary features from conversational speech to determine which acoustic features cause the improved intelligibility of clear speech. Our results indicate that, although steady-state formant values of tense vowels contributed to the intelligibility of clear speech, neither the steady-state portion nor the formant transition was sufficient to yield comparable intelligibility to that of clear speech. In contrast, when the entire formant contour of conversational speech including the phoneme duration was replaced by that of clear speech, intelligibility was comparable to that of clear speech. It indicated that the combination of formant contour and duration information was relevant to the improved intelligibility of clear speech. The study provides a better understanding of the relevance of different aspects of formant contours to the improved intelligibility of clear speech.",
keywords = "Speech intelligibility, Speech synthesis, Vowel perception",
author = "Akiko Amano-Kusumoto and Hosom, {John Paul} and Alexander Kain and Aronoff, {Justin M.}",
year = "2014",
month = "4",
doi = "10.1016/j.specom.2013.12.001",
language = "English (US)",
volume = "59",
pages = "1--9",
journal = "Speech Communication",
issn = "0167-6393",
publisher = "Elsevier",

}

TY - JOUR

T1 - Determining the relevance of different aspects of formant contours to intelligibility

AU - Amano-Kusumoto, Akiko

AU - Hosom, John Paul

AU - Kain, Alexander

AU - Aronoff, Justin M.

PY - 2014/4

Y1 - 2014/4

AB - Previous studies have shown that "clear" speech, where the speaker intentionally tries to enunciate, has better intelligibility than "conversational" speech, which is produced in regular conversation. However, conversational and clear speech vary along a number of acoustic dimensions and it is unclear what aspects of clear speech lead to better intelligibility. Previously, Kain et al. (2008) showed that a combination of short-term spectra and duration was responsible for the improved intelligibility of one speaker. This study investigates subsets of specific features of short-term spectra including temporal aspects. Similar to Kain's study, hybrid stimuli were synthesized with a combination of features from clear speech and complementary features from conversational speech to determine which acoustic features cause the improved intelligibility of clear speech. Our results indicate that, although steady-state formant values of tense vowels contributed to the intelligibility of clear speech, neither the steady-state portion nor the formant transition was sufficient to yield comparable intelligibility to that of clear speech. In contrast, when the entire formant contour of conversational speech including the phoneme duration was replaced by that of clear speech, intelligibility was comparable to that of clear speech. It indicated that the combination of formant contour and duration information was relevant to the improved intelligibility of clear speech. The study provides a better understanding of the relevance of different aspects of formant contours to the improved intelligibility of clear speech.

KW - Speech intelligibility

KW - Speech synthesis

KW - Vowel perception

UR - http://www.scopus.com/inward/record.url?scp=84892694396&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84892694396&partnerID=8YFLogxK

U2 - 10.1016/j.specom.2013.12.001

DO - 10.1016/j.specom.2013.12.001

M3 - Article

VL - 59

SP - 1

EP - 9

JO - Speech Communication

JF - Speech Communication

SN - 0167-6393

ER -