Synthetic F0 can effectively convey speaker ID in delexicalized speech

Eric Morley, Esther Klabbers, Jan Van Santen, Alexander Kain, Seyed Hamidreza Mohammadi

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

7 Scopus citations

Abstract

We investigate the extent to which F0 can convey speaker ID in the absence of spectral, segmental, and durational information. We propose two methods of F0 synthesis based on the Linear Alignment Model (LAM) [2]: one parametric, the other corpus-based. Through a perceptual experiment, we show that F0 alone is able to convey information about speaker ID. We find that F0 synthesized with either LAM-based method conveys speaker ID almost as effectively as natural F0.

Original language: English (US)
Title of host publication: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
Pages: 434-437
Number of pages: 4
State: Published - 2012
Event: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012 - Portland, OR, United States
Duration: Sep 9 2012 to Sep 13 2012

Publication series

Name: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
Volume: 1

Other

Other: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
Country/Territory: United States
City: Portland, OR
Period: 9/9/12 to 9/13/12

Keywords

  • F0
  • Prosody
  • Recombinant synthesis
  • Speaker identity
  • Speech synthesis

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Communication
