A novel pitch decomposition method for the generalized linear alignment model

Mahsa Sadat Elyasi Langarani, Esther Klabbers, Jan Van Santen

Research output: Chapter in Book/Report/Conference proceedingConference contribution

8 Scopus citations

Abstract

Superpositional models of intonation typically propose decomposing fundamental frequency (F0) contours into phrase curves and accent curves, aligned with phrases and left-headed feet, respectively. Extracting these component curves from F0 contours without making undue assumptions is challenging. We propose a novel method for decomposing pitch curves, based on the assumption that accent curves can be described by combining skewed normal distributions and sigmoid functions. In contrast to an earlier pitch decomposition algorithm ('PRISM'), this allows for simple joint optimization of phrase and accent curve parameters, using fewer parameters. The proposed method was evaluated on three speech corpora containing: (1) synthetically generated pitch curves, (2) all-sonorant utterances, and (3) utterances containing both sonorant and non-sonorant speech sounds. The root weighted mean squared error is small, and, on the corpus for which comparable data are available, is significantly smaller than for PRISM.

Original languageEnglish (US)
Title of host publication2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages2584-2588
Number of pages5
ISBN (Print)9781479928927
DOIs
StatePublished - Jan 1 2014
Event2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014 - Florence, Italy
Duration: May 4 2014May 9 2014

Publication series

NameICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print)1520-6149

Other

Other2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014
CountryItaly
CityFlorence
Period5/4/145/9/14

    Fingerprint

Keywords

  • prosody modeling
  • superpositional model
  • text-to-speech synthesis

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

Cite this

Langarani, M. S. E., Klabbers, E., & Van Santen, J. (2014). A novel pitch decomposition method for the generalized linear alignment model. In 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014 (pp. 2584-2588). [6854067] (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings). Institute of Electrical and Electronics Engineers Inc.. https://doi.org/10.1109/ICASSP.2014.6854067