Abstract
Accurate and robust estimation of pitch plays a central role in speech processing. Various methods in time, frequency and cepstral domain have been proposed for generating pitch candidates. Most algorithms excel when the background noise is minimal or for specific types of background noise. In this work, our aim is to improve the robustness and accuracy of pitch estimation across a wide variety of background noise conditions. For this we have chosen to adopt, the harmonic model of speech, a model that has gained considerable attention recently. We address two major weakness of this model. The problem of pitch halving and doubling, and the need to specify the number of harmonics. We exploit the energy of frequency in the neighborhood to alleviate halving and doubling. Using a model complexity term with a BIC criterion, we chose the optimal number of harmonics. We evaluated our proposed pitch estimation method with other state of the art techniques on Keele data set in terms of gross pitch error and fine pitch error. Through extensive experiments on several noisy conditions, we demonstrate that the proposed improvements provide substantial gains over other popular methods under different noise levels and environments.
Original language | English (US) |
---|---|
Pages (from-to) | 1936-1940 |
Number of pages | 5 |
Journal | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH |
State | Published - 2013 |
Event | 14th Annual Conference of the International Speech Communication Association, INTERSPEECH 2013 - Lyon, France Duration: Aug 25 2013 → Aug 29 2013 |
Keywords
- Fundamental frequency estimation
- Robust pitch estimation
ASJC Scopus subject areas
- Language and Linguistics
- Human-Computer Interaction
- Signal Processing
- Software
- Modeling and Simulation