F0 range and peak alignment across speakers and emotions

Eric Morley; Jan Van Santen; Esther Klabbers; Alexander Kain

doi:10.1109/ICASSP.2011.5947467

Institute on Development and Disability

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Scopus citations

Abstract

We present an analysis of F₀ range and peak alignment in emotional speech from a heterogeneous group of speakers varying in age and gender. Both speaker and emotion had a strong effect on F₀ range. Despite these large changes in the F₀ trajectory, peak alignment was remarkably stable. Using the Linear Alignment Model (LAM) [1], we show that the effects on alignment of emotion and speaker differences, although statistically significant, are small. This stability leads to the conclusion that peak alignment, unlike F₀ range, does not appear to carry much information about speaker identity or emotional state. The LAM is effective in that it explains 42% of the variance in peak location on average, and furthermore it predicts the time of F₀ peaks with an average RMS error of 12 ms.
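
The two figures of merit quoted in the abstract, proportion of variance explained and RMS error, can be illustrated with a minimal sketch. This record does not give the functional form of the LAM, so the code below assumes a generic least-squares linear model that predicts peak time from three hypothetical segment durations. The function names, the number of predictors, and the data are all assumptions made for illustration; the data are synthetic and do not reproduce the paper's results.

```python
import numpy as np


def fit_linear_alignment(durations, peak_times):
    """Least-squares fit of peak_time ~ durations @ weights + intercept.

    `durations` is an (n, k) array of segment durations in seconds
    (hypothetical predictors); `peak_times` is an (n,) array in seconds.
    Returns k weights followed by the intercept.
    """
    X = np.column_stack([durations, np.ones(len(durations))])
    coef, *_ = np.linalg.lstsq(X, peak_times, rcond=None)
    return coef


def predict(durations, coef):
    """Predicted peak times for the given durations and coefficients."""
    X = np.column_stack([durations, np.ones(len(durations))])
    return X @ coef


def variance_explained(y, y_hat):
    """Proportion of variance explained: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return float(1.0 - ss_res / ss_tot)


def rms_error(y, y_hat):
    """Root-mean-square prediction error, in the units of y."""
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))


# Synthetic data only: three made-up segment durations per token,
# with 12 ms of Gaussian noise added to the simulated peak times.
rng = np.random.default_rng(0)
durations = rng.uniform(0.03, 0.25, size=(200, 3))
true_coef = np.array([0.9, 0.5, 0.1, 0.01])  # assumed weights + intercept
peak_times = predict(durations, true_coef) + rng.normal(0.0, 0.012, size=200)

coef = fit_linear_alignment(durations, peak_times)
y_hat = predict(durations, coef)
r2 = variance_explained(peak_times, y_hat)
rmse = rms_error(peak_times, y_hat)
print(f"variance explained: {r2:.2f}, RMS error: {rmse * 1000:.1f} ms")
```

The two measures answer different questions: variance explained is relative to how much peak location varies in the data, while RMS error is an absolute timing error, so a model can have a small RMS error and still explain a moderate share of the variance when the peak locations themselves vary little.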

Original language	English (US)
Title of host publication	2011 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011 - Proceedings
Pages	4952-4955
Number of pages	4
DOIs	https://doi.org/10.1109/ICASSP.2011.5947467
State	Published - 2011
Event	36th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011 - Prague, Czech Republic Duration: May 22 2011 → May 27 2011

Publication series

Name	ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print)	1520-6149

Other

Other	36th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011
Country/Territory	Czech Republic
City	Prague
Period	5/22/11 → 5/27/11

Keywords

emotion recognition
human voice
speech analysis
speech synthesis

ASJC Scopus subject areas

Software
Signal Processing
Electrical and Electronic Engineering

Cite this

Morley, E., Van Santen, J., Klabbers, E., & Kain, A. (2011). F0 range and peak alignment across speakers and emotions. In 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011 - Proceedings (pp. 4952-4955). Article 5947467 (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings). https://doi.org/10.1109/ICASSP.2011.5947467

@inproceedings{611087d4aa714a5cb78658e94095496a,

title = "F0 range and peak alignment across speakers and emotions",

abstract = "We present an analysis of F0 range and peak alignment in emotional speech from a heterogeneous group of speakers varying in age and gender. Both speaker and emotion had a strong effect on F0 range. Despite these large changes in the F0 trajectory, peak alignment was remarkably stable. Using the Linear Alignment Model (LAM) [1], we show that the effects on alignment of emotion and speaker differences, although statistically significant, are small. This stability results in a conclusion that peak alignment, unlike F0 range, does not appear to carry much information about speaker identity or emotional state. The LAM is effective in that it explains 42% of the variance in peak location on average, and furthermore it predicts the time of F0 peaks with an average RMS error of 12ms.",

keywords = "emotion recognition, human voice, speech analysis, speech synthesis",

author = "Eric Morley and {Van Santen}, Jan and Esther Klabbers and Alexander Kain",

year = "2011",

doi = "10.1109/ICASSP.2011.5947467",

language = "English (US)",

isbn = "9781457705397",

series = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",

pages = "4952--4955",

booktitle = "2011 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011 - Proceedings",

note = "36th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011 ; Conference date: 22-05-2011 Through 27-05-2011",

}

TY - GEN

T1 - F0 range and peak alignment across speakers and emotions

AU - Morley, Eric

AU - Van Santen, Jan

AU - Klabbers, Esther

AU - Kain, Alexander

PY - 2011

Y1 - 2011

N2 - We present an analysis of F0 range and peak alignment in emotional speech from a heterogeneous group of speakers varying in age and gender. Both speaker and emotion had a strong effect on F0 range. Despite these large changes in the F0 trajectory, peak alignment was remarkably stable. Using the Linear Alignment Model (LAM) [1], we show that the effects on alignment of emotion and speaker differences, although statistically significant, are small. This stability results in a conclusion that peak alignment, unlike F0 range, does not appear to carry much information about speaker identity or emotional state. The LAM is effective in that it explains 42% of the variance in peak location on average, and furthermore it predicts the time of F0 peaks with an average RMS error of 12ms.

KW - emotion recognition

KW - human voice

KW - speech analysis

KW - speech synthesis

UR - http://www.scopus.com/inward/record.url?scp=80051615146&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=80051615146&partnerID=8YFLogxK

U2 - 10.1109/ICASSP.2011.5947467

DO - 10.1109/ICASSP.2011.5947467

M3 - Conference contribution

AN - SCOPUS:80051615146

SN - 9781457705397

T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

SP - 4952

EP - 4955

BT - 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011 - Proceedings

T2 - 36th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011

Y2 - 22 May 2011 through 27 May 2011

ER -
