Reconciliation of human and machine speech recognition performance

Misha Pavel, Malcolm Slaney, Hynek Hermansky

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

This paper focuses on resolving a number of issues that appear when the performance of human speech recognition is compared to that of automatic speech recognition. In particular, human experimental data suggest that the resulting error is the product of the errors of the individual streams. On the other hand, Bayesian combination requires a multiplication of the estimates of prior probabilities and likelihoods. We show that, in principle, there is no discrepancy. The product of errors is a performance measure, and human and machine performance may be consistent with this empirically established regularity. The product of probabilities is a step in an algorithm to achieve the performance, which may or may not be consistent with the product of errors. The main problem is that most prior discussions failed to distinguish the performance measures from the estimates of the parameters used in the algorithm.
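
The distinction drawn in the abstract can be illustrated with a minimal sketch. It is not the authors' method. The function names, the two-class setup, and all numbers are hypothetical, and the streams are assumed to be conditionally independent given the class. The first function computes a performance measure (combined error as a product of per-stream error rates); the second performs an algorithmic step (a posterior proportional to the prior times the product of per-stream likelihoods).

```python
import math


def product_of_errors(stream_errors):
    """Performance measure: combined error rate as the product of
    the per-stream error rates."""
    return math.prod(stream_errors)


def bayes_posterior(priors, stream_likelihoods):
    """Algorithm step: posterior over classes, proportional to the prior
    times the product of per-stream likelihoods (streams assumed
    conditionally independent given the class)."""
    unnormalized = [
        prior * math.prod(stream[k] for stream in stream_likelihoods)
        for k, prior in enumerate(priors)
    ]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Hypothetical numbers, chosen only for illustration.
combined_error = product_of_errors([0.2, 0.1])  # error rates of two streams
posterior = bayes_posterior(
    priors=[0.5, 0.5],
    stream_likelihoods=[[0.8, 0.2], [0.6, 0.4]],  # P(x_s | class) per stream
)
```

The two quantities live at different levels: `combined_error` summarizes how often a recognizer is wrong, while `posterior` is an intermediate value inside a recognizer. Whether a system built on the second rule produces error rates that obey the first is an empirical question.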

Original language: English (US)
Title of host publication: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Pages: 1669-1672
Number of pages: 4
DOIs: https://doi.org/10.1109/ICASSP.2009.4959922
State: Published - 2009
Event: 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2009 - Taipei, Taiwan, Province of China
Duration: Apr 19 2009 to Apr 24 2009

Other

Other: 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2009
Country: Taiwan, Province of China
City: Taipei
Period: 4/19/09 to 4/24/09

Fingerprint

Speech recognition

Keywords

  • Pattern recognition
  • Speech recognition

ASJC Scopus subject areas

  • Signal Processing
  • Software
  • Electrical and Electronic Engineering

Cite this

Pavel, M., Slaney, M., & Hermansky, H. (2009). Reconciliation of human and machine speech recognition performance. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings (pp. 1669-1672). [4959922] https://doi.org/10.1109/ICASSP.2009.4959922

Reconciliation of human and machine speech recognition performance. / Pavel, Misha; Slaney, Malcolm; Hermansky, Hynek.

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. 2009. p. 1669-1672 4959922.

Pavel, M, Slaney, M & Hermansky, H 2009, Reconciliation of human and machine speech recognition performance. in ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings., 4959922, pp. 1669-1672, 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2009, Taipei, Taiwan, Province of China, 4/19/09. https://doi.org/10.1109/ICASSP.2009.4959922
Pavel M, Slaney M, Hermansky H. Reconciliation of human and machine speech recognition performance. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. 2009. p. 1669-1672. 4959922 https://doi.org/10.1109/ICASSP.2009.4959922
Pavel, Misha ; Slaney, Malcolm ; Hermansky, Hynek. / Reconciliation of human and machine speech recognition performance. ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. 2009. pp. 1669-1672
@inproceedings{2148d6189301446e8861b4d7794143f8,
title = "Reconciliation of human and machine speech recognition performance",
abstract = "This paper focuses on resolving a number of issues that appear when the performance of human speech recognition is compared to that of automatic speech recognition. In particular, human experimental data suggest that the resulting error is the product of the errors of the individual streams. On the other hand, Bayesian combination requires a multiplication of the estimates of prior probabilities and likelihoods. We show that, in principle, there is no discrepancy. The product of errors is a performance measure, and human and machine performance may be consistent with this empirically established regularity. The product of probabilities is a step in an algorithm to achieve the performance, which may or may not be consistent with the product of errors. The main problem is that most prior discussions failed to distinguish the performance measures from the estimates of the parameters used in the algorithm.",
keywords = "Pattern recognition, Speech recognition",
author = "Misha Pavel and Malcolm Slaney and Hynek Hermansky",
year = "2009",
doi = "10.1109/ICASSP.2009.4959922",
language = "English (US)",
isbn = "9781424423545",
pages = "1669--1672",
booktitle = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",

}

TY - GEN

T1 - Reconciliation of human and machine speech recognition performance

AU - Pavel, Misha

AU - Slaney, Malcolm

AU - Hermansky, Hynek

PY - 2009

Y1 - 2009

N2 - This paper focuses on resolving a number of issues that appear when the performance of human speech recognition is compared to that of automatic speech recognition. In particular, human experimental data suggest that the resulting error is the product of the errors of the individual streams. On the other hand, Bayesian combination requires a multiplication of the estimates of prior probabilities and likelihoods. We show that, in principle, there is no discrepancy. The product of errors is a performance measure, and human and machine performance may be consistent with this empirically established regularity. The product of probabilities is a step in an algorithm to achieve the performance, which may or may not be consistent with the product of errors. The main problem is that most prior discussions failed to distinguish the performance measures from the estimates of the parameters used in the algorithm.

AB - This paper focuses on resolving a number of issues that appear when the performance of human speech recognition is compared to that of automatic speech recognition. In particular, human experimental data suggest that the resulting error is the product of the errors of the individual streams. On the other hand, Bayesian combination requires a multiplication of the estimates of prior probabilities and likelihoods. We show that, in principle, there is no discrepancy. The product of errors is a performance measure, and human and machine performance may be consistent with this empirically established regularity. The product of probabilities is a step in an algorithm to achieve the performance, which may or may not be consistent with the product of errors. The main problem is that most prior discussions failed to distinguish the performance measures from the estimates of the parameters used in the algorithm.

KW - Pattern recognition

KW - Speech recognition

UR - http://www.scopus.com/inward/record.url?scp=70349202182&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=70349202182&partnerID=8YFLogxK

U2 - 10.1109/ICASSP.2009.4959922

DO - 10.1109/ICASSP.2009.4959922

M3 - Conference contribution

SN - 9781424423545

SP - 1669

EP - 1672

BT - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

ER -