Evaluation of speaker mimic technology for personalizing SGD voices

Esther Klabbers; Alexander Kain; Jan P.H. Van Santen

Institute on Development and Disability

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Scopus citations

Abstract

In this paper, we demonstrate the use of state-of-the-art speech technology to transform speech from a source speaker to mimic a particular target speaker with the intention of providing personalized voices to users of Speech Generating Devices (SGDs). This speaker mimicry (SM) capability allows us to use high-quality acoustic inventories from professional speakers and transform them to a different target speaker using a very limited set of sentences from that speaker. This technology targets future SGD users who still have a limited vocabulary or available previous recordings. The results of a perceptual study show that listeners can identify which SM voices most resemble their respective target voices.¹

Original language	English (US)
Title of host publication	Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010
Publisher	International Speech Communication Association
Pages	2154-2157
Number of pages	4
State	Published - 2010

Publication series

Name	Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010

Keywords

Perceptual evaluation
Prosody modeling
Speech synthesis
Voice transformation

ASJC Scopus subject areas

Language and Linguistics
Speech and Hearing
Human-Computer Interaction
Signal Processing
Software
Modeling and Simulation

Cite this

Klabbers, E., Kain, A., & Van Santen, J. P. H. (2010). Evaluation of speaker mimic technology for personalizing SGD voices. In Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010 (pp. 2154-2157). (Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010). International Speech Communication Association.

@inproceedings{42344b9ec6574f0796879c01f4cf98fe,

title = "Evaluation of speaker mimic technology for personalizing SGD voices",

abstract = "In this paper, we demonstrate the use of state-of-the-art speech technology to transform speech from a source speaker to mimic a particular target speaker with the intention of providing personalized voices to users of Speech Generating Devices (SGDs). This speaker mimicry (SM) capability allows us to use high-quality acoustic inventories from professional speakers and transform them to a different target speaker using a very limited set of sentences from that speaker. This technology targets future SGD users who still have a limited vocabulary or available previous recordings. The results of a perceptual study show that listeners can identify which SM voices most resemble their respective target voices.",

keywords = "Perceptual evaluation, Prosody modeling, Speech synthesis, Voice transformation",

author = "Esther Klabbers and Alexander Kain and {Van Santen}, {Jan P.H.}",

note = "Funding Information: This project was funded by NIH STTR grant 1R41DC008712-01: User Adaptation of AAC Voices and the Nancy Lurie Marks foundation project “In Your Own Voice: Personal Augmentative and Alternative Communication Voices for Minimally Verbal Children with Autism Spectrum Disorders” and was a collaboration between BioSpeech, Inc. and OHSU. OHSU and the authors have a significant financial interest in BioSpeech, Inc., a company that may have a commercial interest in the results of this research and technology. This potential conflict was reviewed and managed by OHSU and the Integrity Program Oversight Council.",

year = "2010",

language = "English (US)",

series = "Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010",

publisher = "International Speech Communication Association",

pages = "2154--2157",

booktitle = "Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010",

}

TY - GEN

T1 - Evaluation of speaker mimic technology for personalizing SGD voices

AU - Klabbers, Esther

AU - Kain, Alexander

AU - Van Santen, Jan P.H.

N1 - Funding Information: This project was funded by NIH STTR grant 1R41DC008712-01: User Adaptation of AAC Voices and the Nancy Lurie Marks foundation project “In Your Own Voice: Personal Augmentative and Alternative Communication Voices for Minimally Verbal Children with Autism Spectrum Disorders” and was a collaboration between BioSpeech, Inc. and OHSU. OHSU and the authors have a significant financial interest in BioSpeech, Inc., a company that may have a commercial interest in the results of this research and technology. This potential conflict was reviewed and managed by OHSU and the Integrity Program Oversight Council.

PY - 2010

Y1 - 2010

N2 - In this paper, we demonstrate the use of state-of-the-art speech technology to transform speech from a source speaker to mimic a particular target speaker with the intention of providing personalized voices to users of Speech Generating Devices (SGDs). This speaker mimicry (SM) capability allows us to use high-quality acoustic inventories from professional speakers and transform them to a different target speaker using a very limited set of sentences from that speaker. This technology targets future SGD users who still have a limited vocabulary or available previous recordings. The results of a perceptual study show that listeners can identify which SM voices most resemble their respective target voices.

KW - Perceptual evaluation

KW - Prosody modeling

KW - Speech synthesis

KW - Voice transformation

UR - http://www.scopus.com/inward/record.url?scp=79959840813&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79959840813&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:79959840813

T3 - Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010

SP - 2154

EP - 2157

BT - Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010

PB - International Speech Communication Association

ER -
