A novel pitch decomposition method for the generalized linear alignment model

Mahsa Sadat Elyasi Langarani; Esther Klabbers; Jan Van Santen

doi:10.1109/ICASSP.2014.6854067

Institute on Development and Disability

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

9 Scopus citations

Abstract

Superpositional models of intonation typically propose decomposing fundamental frequency (F₀) contours into phrase curves and accent curves, aligned with phrases and left-headed feet, respectively. Extracting these component curves from F₀ contours without making undue assumptions is challenging. We propose a novel method for decomposing pitch curves, based on the assumption that accent curves can be described by combining skewed normal distributions and sigmoid functions. In contrast to an earlier pitch decomposition algorithm ('PRISM'), this allows for simple joint optimization of phrase and accent curve parameters, using fewer parameters. The proposed method was evaluated on three speech corpora containing: (1) synthetically generated pitch curves, (2) all-sonorant utterances, and (3) utterances containing both sonorant and non-sonorant speech sounds. The root weighted mean squared error is small, and, on the corpus for which comparable data are available, is significantly smaller than for PRISM.
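The abstract names the building blocks (skewed normal distributions, sigmoid functions, a phrase curve, and a root weighted mean squared error) but not their exact parameterization. The following is a minimal sketch under stated assumptions, not the authors' implementation: the linear phrase curve, the use of a sigmoid as a phrase-final rise, the error formula, and every function and parameter name here are hypothetical choices made for illustration.

```python
import math


def skew_normal(t, loc, scale, alpha):
    """Skew-normal density at time t (standard Azzalini form)."""
    z = (t - loc) / scale
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(alpha * z / math.sqrt(2.0)))
    return 2.0 / scale * pdf * cdf


def sigmoid(t, mid, slope):
    """Logistic function rising from 0 to 1 around t = mid."""
    return 1.0 / (1.0 + math.exp(-slope * (t - mid)))


def model_f0(t, phrase, accents, boundary=None):
    """Superpose a phrase curve and accent curves at time t.

    phrase   -- (start_hz, end_hz, duration_s); a straight line is assumed
    accents  -- list of (amp, loc, scale, alpha) skew-normal accent curves
    boundary -- optional (amp, mid, slope) sigmoid, assumed here to model
                a phrase-final rise
    """
    start_hz, end_hz, duration = phrase
    value = start_hz + (end_hz - start_hz) * t / duration
    for amp, loc, scale, alpha in accents:
        value += amp * skew_normal(t, loc, scale, alpha)
    if boundary is not None:
        amp, mid, slope = boundary
        value += amp * sigmoid(t, mid, slope)
    return value


def weighted_rmse(observed, predicted, weights):
    """Root weighted mean squared error (assumed form: sqrt(sum(w*e^2)/sum(w)))."""
    num = sum(w * (o - p) ** 2 for o, p, w in zip(observed, predicted, weights))
    return math.sqrt(num / sum(weights))


# Usage: synthesize a 2-second contour with one accent and a final rise,
# then score a candidate fit against it (here the contour itself).
times = [i * 0.01 for i in range(200)]
f0 = [model_f0(t, (120.0, 90.0, 2.0), [(15.0, 0.5, 0.15, 3.0)], (20.0, 1.9, 40.0))
      for t in times]
weights = [1.0] * len(times)
print(weighted_rmse(f0, f0, weights))
```

Because all curve parameters enter a single `model_f0` function, phrase and accent parameters could be handed together to one optimizer that minimizes `weighted_rmse`, which is one way to read the "joint optimization" the abstract describes. Setting a frame's weight to zero, for example in non-sonorant regions, removes it from the error.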

Original language	English (US)
Title of host publication	2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	2584-2588
Number of pages	5
ISBN (Print)	9781479928927
DOIs	https://doi.org/10.1109/ICASSP.2014.6854067
State	Published - 2014
Event	2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014 - Florence, Italy
Duration	May 4 2014 → May 9 2014

Publication series

Name	ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print)	1520-6149

Other

Other	2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014
Country/Territory	Italy
City	Florence
Period	5/4/14 → 5/9/14

Keywords

prosody modeling
superpositional model
text-to-speech synthesis

ASJC Scopus subject areas

Software
Signal Processing
Electrical and Electronic Engineering

Cite this

Langarani, M. S. E., Klabbers, E., & Van Santen, J. (2014). A novel pitch decomposition method for the generalized linear alignment model. In 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014 (pp. 2584-2588). Article 6854067 (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings). Institute of Electrical and Electronics Engineers Inc.. https://doi.org/10.1109/ICASSP.2014.6854067

@inproceedings{c58f4d52ad214435b7165c66e8ac8409,
title = "A novel pitch decomposition method for the generalized linear alignment model",
abstract = "Superpositional models of intonation typically propose decomposing fundamental frequency (F0) contours into phrase curves and accent curves, aligned with phrases and left-headed feet, respectively. Extracting these component curves from F0 contours without making undue assumptions is challenging. We propose a novel method for decomposing pitch curves, based on the assumption that accent curves can be described by combining skewed normal distributions and sigmoid functions. In contrast to an earlier pitch decomposition algorithm ('PRISM'), this allows for simple joint optimization of phrase and accent curve parameters, using fewer parameters. The proposed method was evaluated on three speech corpora containing: (1) synthetically generated pitch curves, (2) all-sonorant utterances, and (3) utterances containing both sonorant and non-sonorant speech sounds. The root weighted mean squared error is small, and, on the corpus for which comparable data are available, is significantly smaller than for PRISM.",
keywords = "prosody modeling, superpositional model, text-to-speech synthesis",
author = "Langarani, {Mahsa Sadat Elyasi} and Klabbers, Esther and {Van Santen}, Jan",
year = "2014",
doi = "10.1109/ICASSP.2014.6854067",
language = "English (US)",
isbn = "9781479928927",
series = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "2584--2588",
booktitle = "2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014",
note = "2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014 ; Conference date: 04-05-2014 Through 09-05-2014",
}

TY - GEN
T1 - A novel pitch decomposition method for the generalized linear alignment model
AU - Langarani, Mahsa Sadat Elyasi
AU - Klabbers, Esther
AU - Van Santen, Jan
PY - 2014
Y1 - 2014
AB - Superpositional models of intonation typically propose decomposing fundamental frequency (F0) contours into phrase curves and accent curves, aligned with phrases and left-headed feet, respectively. Extracting these component curves from F0 contours without making undue assumptions is challenging. We propose a novel method for decomposing pitch curves, based on the assumption that accent curves can be described by combining skewed normal distributions and sigmoid functions. In contrast to an earlier pitch decomposition algorithm ('PRISM'), this allows for simple joint optimization of phrase and accent curve parameters, using fewer parameters. The proposed method was evaluated on three speech corpora containing: (1) synthetically generated pitch curves, (2) all-sonorant utterances, and (3) utterances containing both sonorant and non-sonorant speech sounds. The root weighted mean squared error is small, and, on the corpus for which comparable data are available, is significantly smaller than for PRISM.
KW - prosody modeling
KW - superpositional model
KW - text-to-speech synthesis
UR - http://www.scopus.com/inward/record.url?scp=84905229466&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84905229466&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2014.6854067
DO - 10.1109/ICASSP.2014.6854067
M3 - Conference contribution
AN - SCOPUS:84905229466
SN - 9781479928927
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 2584
EP - 2588
BT - 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014
Y2 - 4 May 2014 through 9 May 2014
ER -
