Robust recognition of cellular telephone speech by adaptive vector quantization

Mustafa (Kemal) Sonmez, Raja Rajasekaran, John S. Baras

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

Performance degradation caused by acoustical environment mismatch remains an important practical problem in speech recognition. The problem is more significant in applications over telecommunication channels, especially with the wider use of personal communications systems such as cellular phones, which invariably present challenging acoustical conditions. In this work, we introduce a vector quantization (VQ) based compensation technique that both uses a priori information about likely acoustical environments and adapts to the test environment to improve recognition. The technique is progressive and requires neither simultaneously recorded speech from the training and testing environments nor EM-type batch iterations. Instead of relying on simultaneously recorded data, the technique maintains the integrity of the updated VQ codebooks with respect to acoustical classes by endowing the codebooks with a topology and using transformations that preserve the topology of the reference environment. We report results on the McCaw Cellular Corpus, where the technique decreases the word error rate for continuous ten-digit recognition of cellular hands-free microphone speech with land-line-trained models from 23.8% to 13.6%, and the speaker-dependent voice-calling sentence error rate from 16.5% to 10.6%.
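The abstract does not spell out the transformations used, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes the simplest topology-preserving transformation, a single translation shared by all codewords, and estimates it one frame at a time without paired recordings or batch iterations. The codebook values, the offset, the learning rate and the names `nearest` and `adapt_step` are all hypothetical.

```python
def nearest(codebook, shift, x):
    """Index of the shifted codeword closest to frame x (squared Euclidean)."""
    def dist(k):
        return sum((c + s - v) ** 2 for c, s, v in zip(codebook[k], shift, x))
    return min(range(len(codebook)), key=dist)


def adapt_step(codebook, shift, x, rate=0.1):
    """Quantize x against the shifted codebook, then nudge the shared shift
    toward the quantization residual. Returns (winner index, new shift)."""
    k = nearest(codebook, shift, x)
    new_shift = [s + rate * (v - (c + s)) for c, s, v in zip(codebook[k], shift, x)]
    return k, new_shift


# Hypothetical reference-environment codebook; the index stands for the acoustic class.
reference = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Hypothetical test environment: every class is displaced by the same offset.
offset = [0.3, -0.2]
classes = [0, 1, 2, 3] * 50
frames = [[c + o for c, o in zip(reference[k], offset)] for k in classes]

# Progressive adaptation: one frame at a time, no stored batch.
shift = [0.0, 0.0]
labels = []
for x in frames:
    k, shift = adapt_step(reference, shift, x)
    labels.append(k)
```

Because every codeword moves by the same vector, neighbor relations among codewords are unchanged, so each test frame keeps the class label of its reference codeword while `shift` converges toward the environment offset. A richer transformation family would need an explicit constraint to retain that property.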

Original language: English (US)
Title of host publication: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Publisher: IEEE
Pages: 503-506
Number of pages: 4
Volume: 1
State: Published - 1996
Externally published: Yes
Event: Proceedings of the 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP. Part 1 (of 6) - Atlanta, GA, USA
Duration: May 7 1996 - May 10 1996

Fingerprint

Cellular telephones
Vector quantization
Telephones
Topology
Personal communication systems
Telephone lines
Microphones
Telecommunication
Speech recognition
Sentences
Digits
Degradation
Integrity
Testing
Iteration

ASJC Scopus subject areas

  • Signal Processing
  • Electrical and Electronic Engineering
  • Acoustics and Ultrasonics

Cite this

Sonmez, M. K., Rajasekaran, R., & Baras, J. S. (1996). Robust recognition of cellular telephone speech by adaptive vector quantization. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings (Vol. 1, pp. 503-506). IEEE.

@inproceedings{32cb7045393246929ef5101bded54a72,
title = "Robust recognition of cellular telephone speech by adaptive vector quantization",
author = "Sonmez, {Mustafa (Kemal)} and Rajasekaran, Raja and Baras, {John S.}",
year = "1996",
language = "English (US)",
volume = "1",
pages = "503--506",
booktitle = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",
publisher = "IEEE",

}

TY - GEN

T1 - Robust recognition of cellular telephone speech by adaptive vector quantization

AU - Sonmez, Mustafa (Kemal)

AU - Rajasekaran, Raja

AU - Baras, John S.

PY - 1996

Y1 - 1996

UR - http://www.scopus.com/inward/record.url?scp=0029750934&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0029750934&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:0029750934

VL - 1

SP - 503

EP - 506

BT - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

PB - IEEE

ER -