Perceptual cost function for cross-fading based concatenation

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In earlier research, we applied a linear weighted cross-fading function to ensure smooth concatenation. However, this can cause unnaturally shaped spectral trajectories. We propose context-sensitive cross-fading. To train this system, a perceptually validated cost function is needed, which is the focus of this paper. A corpus was designed to generate a variety of formant trajectory shapes. A perceptual experiment was performed and a multiple linear regression model was applied to predict perceptual quality ratings from various distances between cross-faded and natural trajectories. Results show that perceptual quality could be predicted well from the proposed distance measures.
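
As a rough illustration of the two ingredients the abstract names, the sketch below blends two formant trajectories with a linear weighted cross-fade and then computes two candidate distances between the cross-faded and a natural trajectory. It is a minimal sketch, not the authors' implementation: the function names, the choice of RMS and maximum absolute distance, and all F2 values are assumptions made up for illustration. In the paper's setup, distances of this kind would serve as predictors in a multiple linear regression against perceptual ratings.

```python
import math


def linear_crossfade(left, right):
    """Blend two equal-length trajectories with linearly changing weights."""
    if len(left) != len(right):
        raise ValueError("trajectories must have equal length")
    n = len(left)
    if n < 2:
        raise ValueError("need at least two frames")
    blended = []
    for i in range(n):
        w = i / (n - 1)  # weight of the right unit: 0 at the first frame, 1 at the last
        blended.append((1.0 - w) * left[i] + w * right[i])
    return blended


def rms_distance(a, b):
    """Root-mean-square difference between two equal-length trajectories."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def max_abs_distance(a, b):
    """Largest frame-wise absolute difference between two trajectories."""
    return max(abs(x - y) for x, y in zip(a, b))


# Hypothetical F2 values in Hz, invented for this example only.
left_unit = [1200.0, 1250.0, 1300.0, 1350.0, 1400.0]
right_unit = [1500.0, 1520.0, 1540.0, 1560.0, 1580.0]
natural = [1200.0, 1300.0, 1420.0, 1510.0, 1580.0]

faded = linear_crossfade(left_unit, right_unit)
# Candidate predictors for a regression on perceptual ratings (assumed feature set).
features = (rms_distance(faded, natural), max_abs_distance(faded, natural))
print(faded)
print(features)
```

A context-sensitive cross-fade, as proposed in the abstract, would replace the fixed linear weight `w` with a weight that depends on the surrounding context; the sketch does not attempt that.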

Original language: English (US)
Title of host publication: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Pages: 732-735
Number of pages: 4
State: Published - 2009
Event: 10th Annual Conference of the International Speech Communication Association, INTERSPEECH 2009 - Brighton, United Kingdom
Duration: Sep 6 2009 to Sep 10 2009

Keywords

  • Concatenation errors
  • Cross-fading function
  • Formant frequency
  • Perceptual score

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Sensory Systems

Cite this

Miao, Q., Kain, A., & Van Santen, J. (2009). Perceptual cost function for cross-fading based concatenation. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (pp. 732-735).

@inproceedings{042b6e47fd8a4c7eb23c184f2aead947,
title = "Perceptual cost function for cross-fading based concatenation",
keywords = "Concatenation errors, Cross-fading function, Formant frequency, Perceptual score",
author = "Qi Miao and Alexander Kain and {Van Santen}, Jan",
year = "2009",
language = "English (US)",
pages = "732--735",
booktitle = "Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH",

}

TY - GEN
T1 - Perceptual cost function for cross-fading based concatenation
AU - Miao, Qi
AU - Kain, Alexander
AU - Van Santen, Jan
PY - 2009
Y1 - 2009
KW - Concatenation errors
KW - Cross-fading function
KW - Formant frequency
KW - Perceptual score
UR - http://www.scopus.com/inward/record.url?scp=70450161987&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70450161987&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:70450161987
SP - 732
EP - 735
BT - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
ER -