Hybridizing conversational and clear speech to determine the degree of contribution of acoustic features to intelligibility

Alexander Kain, Akiko Amano-Kusumoto, John Paul Hosom

Research output: Contribution to journal, Article, peer-review

19 Scopus citations

Abstract

Speakers naturally adopt a special "clear" (CLR) speaking style in order to be better understood by listeners who are moderately impaired in their ability to understand speech due to a hearing impairment, the presence of background noise, or both. In contrast, speech intended for nonimpaired listeners in quiet environments is referred to as "conversational" (CNV). Studies have shown that the intelligibility of CLR speech is usually higher than that of CNV speech in adverse circumstances. It is not known which individual acoustic features or combinations of features cause the higher intelligibility of CLR speech. The objective of this study is to determine the contribution of some acoustic features to intelligibility for a single speaker. The proposed method creates "hybrid" (HYB) speech stimuli that selectively combine acoustic features of one sentence spoken in the CNV and CLR styles. The intelligibility of these stimuli is then measured in perceptual tests, using 96 phonetically balanced sentences. Results for one speaker show significant sentence-level intelligibility improvements over CNV speech when replacing certain combinations of short-term spectra, phoneme identities, and phoneme durations of CNV speech with those from CLR speech, but no improvements for combinations involving fundamental frequency, energy, or nonspeech events (pauses).
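The selection logic behind the hybrid stimuli can be illustrated with a minimal sketch. It assumes each style's rendition of a sentence is already reduced to one value per acoustic feature. The feature names follow the abstract, while `hybridize`, `all_conditions`, and the placeholder values are hypothetical and are not the authors' implementation. The sketch shows only which style each feature is drawn from, not the signal processing needed to resynthesize speech.

```python
from itertools import combinations

# Feature names follow the abstract; everything else here is an illustrative assumption.
FEATURES = ("spectrum", "phoneme_identity", "duration", "f0", "energy", "pauses")


def hybridize(cnv, clr, from_clr):
    """Build a hybrid feature set for one sentence.

    Features named in `from_clr` are taken from the clear (CLR) rendition,
    all remaining features from the conversational (CNV) rendition.
    """
    unknown = set(from_clr) - set(FEATURES)
    if unknown:
        raise ValueError(f"unknown features: {sorted(unknown)}")
    return {name: (clr[name] if name in from_clr else cnv[name]) for name in FEATURES}


def all_conditions(max_size=2):
    """Enumerate every feature combination of size 1..max_size as a candidate HYB condition."""
    return [set(c) for k in range(1, max_size + 1) for c in combinations(FEATURES, k)]


# Placeholder values stand in for real acoustic measurements.
cnv = {name: f"CNV:{name}" for name in FEATURES}
clr = {name: f"CLR:{name}" for name in FEATURES}

hyb = hybridize(cnv, clr, {"spectrum", "duration"})
```

In this sketch, `hyb` carries the CLR spectrum and durations while keeping the CNV fundamental frequency, energy, phoneme identities, and pauses. That is the kind of selective combination whose intelligibility the perceptual tests compare against plain CNV speech.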

Original language: English (US)
Pages (from-to): 2308-2319
Number of pages: 12
Journal: Journal of the Acoustical Society of America
Volume: 124
Issue number: 4
DOIs
State: Published - 2008

ASJC Scopus subject areas

  • Arts and Humanities (miscellaneous)
  • Acoustics and Ultrasonics
