Transmutative voice conversion

Seyed Hamidreza Mohammadi; Alexander Kain

doi:10.1109/ICASSP.2013.6639003

Institute on Development and Disability

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

10 Scopus citations

Abstract

There are two types of voice conversion (VC) systems: generative and transmutative. A generative VC system typically uses a compact parametrization of speech and maps input parameters to output parameters directly; however, the relatively low dimensionality of the underlying speech model reduces quality. A transmutative VC system, on the other hand, modifies high-dimensional features of a high-fidelity speech model, leaving critical details unmodified. Two versions of the transmutative VC approach are implemented and compared to a generative VC approach. The results show that the implemented transmutative VC is significantly better than generative VC in terms of quality. The difference between the two VC methods in recognition scores is insignificant.
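The distinction drawn in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the features or mappings used, so the affine map, the linear frequency warp, and the names `generative_convert`, `warp_spectrum`, and `alpha` are all assumptions made for illustration. The generative function replaces a compact parameter vector outright, while the transmutative-style function only resamples a high-resolution spectrum along a warped frequency axis.

```python
import numpy as np


def generative_convert(params, weight, bias):
    """Generative-style step (illustrative assumption): map a compact
    source parameter vector directly to target parameters with an affine map."""
    return params @ weight + bias


def warp_spectrum(spectrum, alpha):
    """Transmutative-style step (illustrative assumption): resample a
    high-resolution magnitude spectrum along a linearly scaled frequency axis.

    Output bin at normalized frequency f reads the source at f / alpha,
    so alpha > 1 moves spectral peaks upward in frequency.
    """
    n = len(spectrum)
    freqs = np.linspace(0.0, 1.0, n)               # normalized frequency axis
    source_pos = np.clip(freqs / alpha, 0.0, 1.0)  # stay inside the valid band
    return np.interp(source_pos, freqs, spectrum)  # linear interpolation
```

With `alpha = 1.0` the warp leaves the spectrum unchanged, which reflects the idea of modifying the signal only as much as needed; the generative map, by contrast, always regenerates every output parameter from the input.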

Original language	English (US)
Title of host publication	2013 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013 - Proceedings
Pages	6920-6924
Number of pages	5
DOIs	https://doi.org/10.1109/ICASSP.2013.6639003
State	Published - Oct 18 2013
Event	2013 38th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013 - Vancouver, BC, Canada Duration: May 26 2013 → May 31 2013

Publication series

Name	ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print)	1520-6149

Other

Other	2013 38th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013
Country/Territory	Canada
City	Vancouver, BC
Period	5/26/13 → 5/31/13

Keywords

frequency warping
speech transformation
voice conversion

ASJC Scopus subject areas

Software
Signal Processing
Electrical and Electronic Engineering

Cite this

Transmutative voice conversion. / Mohammadi, Seyed Hamidreza; Kain, Alexander.
2013 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013 - Proceedings. 2013. p. 6920-6924 6639003 (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings).

Mohammadi, SH & Kain, A 2013, Transmutative voice conversion. in 2013 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013 - Proceedings., 6639003, ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, pp. 6920-6924, 2013 38th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013, Vancouver, BC, Canada, 5/26/13. https://doi.org/10.1109/ICASSP.2013.6639003

@inproceedings{e3da3f70813747cd802ea1b0abeb77f3,

title = "Transmutative voice conversion",

abstract = "There are two types of voice conversion (VC) systems: generative and transmutative. A generative VC system typically uses a compact parametrization of speech and maps input parameters to output parameters directly; however, the relatively low dimensionality of the underlying speech model reduces quality. A transmutative VC system, on the other hand, modifies high-dimensional features of a high-fidelity speech model, leaving critical details unmodified. Two versions of the transmutative VC approach are implemented and compared to a generative VC approach. The results show that the implemented transmutative VC is significantly better than generative VC in terms of quality. The difference between the two VC methods in recognition scores is insignificant.",

keywords = "frequency warping, speech transformation, voice conversion",

author = "Mohammadi, {Seyed Hamidreza} and Kain, Alexander",

year = "2013",

month = oct,

day = "18",

doi = "10.1109/ICASSP.2013.6639003",

language = "English (US)",

isbn = "9781479903566",

series = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",

pages = "6920--6924",

booktitle = "2013 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013 - Proceedings",

note = "2013 38th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013 ; Conference date: 26-05-2013 Through 31-05-2013",

}

TY - GEN

T1 - Transmutative voice conversion

AU - Mohammadi, Seyed Hamidreza

AU - Kain, Alexander

PY - 2013/10/18

Y1 - 2013/10/18

AB - There are two types of voice conversion (VC) systems: generative and transmutative. A generative VC system typically uses a compact parametrization of speech and maps input parameters to output parameters directly; however, the relatively low dimensionality of the underlying speech model reduces quality. A transmutative VC system, on the other hand, modifies high-dimensional features of a high-fidelity speech model, leaving critical details unmodified. Two versions of the transmutative VC approach are implemented and compared to a generative VC approach. The results show that the implemented transmutative VC is significantly better than generative VC in terms of quality. The difference between the two VC methods in recognition scores is insignificant.

KW - frequency warping

KW - speech transformation

KW - voice conversion

UR - http://www.scopus.com/inward/record.url?scp=84890475857&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84890475857&partnerID=8YFLogxK

U2 - 10.1109/ICASSP.2013.6639003

DO - 10.1109/ICASSP.2013.6639003

M3 - Conference contribution

AN - SCOPUS:84890475857

SN - 9781479903566

T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

SP - 6920

EP - 6924

BT - 2013 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013 - Proceedings

T2 - 2013 38th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013

Y2 - 26 May 2013 through 31 May 2013

ER -
