Reconciliation of human and machine speech recognition performance

Misha Pavel; Malcolm Slaney; Hynek Hermansky

doi:10.1109/ICASSP.2009.4959922

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Scopus citation

Abstract

This paper focuses on resolving a number of issues that arise when the performance of human speech recognition is compared to that of automatic speech recognition. In particular, human experimental data suggest that the resulting error is the product of the errors of the individual streams. On the other hand, Bayesian combination requires multiplying the estimates of prior probabilities and likelihoods. We show that, in principle, there is no discrepancy. The product of errors is a performance measure, and both human and machine performance may be consistent with this empirically established regularity. The product of probabilities is a step in an algorithm for achieving a level of performance that may or may not be consistent with the product of errors. The main problem is that most prior discussions failed to distinguish the performance measures from the estimates of the parameters used in the algorithm.
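The distinction drawn in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the two-stream setup, and all numeric values are hypothetical, and the Bayesian step assumes the streams are conditionally independent given the class. The first function computes a performance measure (a combined error rate), while the second performs an algorithmic step (multiplying a prior by per-stream likelihoods).

```python
from math import prod


def product_of_errors(stream_errors):
    """Performance measure: combined error rate modeled as the
    product of the per-stream error rates."""
    return prod(stream_errors)


def bayes_posterior(priors, likelihoods_per_stream):
    """Algorithm step: posterior over classes obtained by multiplying
    each class prior by that class's likelihood in every stream,
    then normalizing. Assumes conditionally independent streams."""
    scores = []
    for k, prior in enumerate(priors):
        score = prior
        for stream in likelihoods_per_stream:
            score *= stream[k]
        scores.append(score)
    total = sum(scores)
    return [s / total for s in scores]


# Hypothetical numbers, for illustration only.
combined_error = product_of_errors([0.2, 0.1])
posterior = bayes_posterior(
    priors=[0.5, 0.5],
    likelihoods_per_stream=[[0.8, 0.2], [0.6, 0.4]],
)
```

With these made-up inputs the combined error is about 0.02 and the posterior favors the first class. The two quantities answer different questions: one describes how well a recognizer performs, the other is an intermediate value inside a decision procedure.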

Original language	English (US)
Title of host publication	2009 IEEE International Conference on Acoustics, Speech, and Signal Processing - Proceedings, ICASSP 2009
Pages	1669-1672
Number of pages	4
DOIs	https://doi.org/10.1109/ICASSP.2009.4959922
State	Published - 2009
Externally published	Yes
Event	2009 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2009 - Taipei, Taiwan, Province of China Duration: Apr 19 2009 → Apr 24 2009

Publication series

Name	ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print)	1520-6149

Other

Other	2009 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2009
Country/Territory	Taiwan, Province of China
City	Taipei
Period	4/19/09 → 4/24/09

Keywords

Pattern recognition
Speech recognition

ASJC Scopus subject areas

Software
Signal Processing
Electrical and Electronic Engineering

Access to Document

10.1109/ICASSP.2009.4959922

Cite this

Pavel, M., Slaney, M., & Hermansky, H. (2009). Reconciliation of human and machine speech recognition performance. In 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing - Proceedings, ICASSP 2009 (pp. 1669-1672). Article 4959922 (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings). https://doi.org/10.1109/ICASSP.2009.4959922

Reconciliation of human and machine speech recognition performance. / Pavel, Misha; Slaney, Malcolm; Hermansky, Hynek.
2009 IEEE International Conference on Acoustics, Speech, and Signal Processing - Proceedings, ICASSP 2009. 2009. p. 1669-1672 4959922 (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings).

Pavel, M, Slaney, M & Hermansky, H 2009, Reconciliation of human and machine speech recognition performance. in 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing - Proceedings, ICASSP 2009., 4959922, ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, pp. 1669-1672, 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2009, Taipei, Taiwan, Province of China, 4/19/09. https://doi.org/10.1109/ICASSP.2009.4959922

@inproceedings{2148d6189301446e8861b4d7794143f8,

title = "Reconciliation of human and machine speech recognition performance",

abstract = "This paper focuses on resolving a number of issues that arise when the performance of human speech recognition is compared to that of automatic speech recognition. In particular, human experimental data suggest that the resulting error is the product of the errors of the individual streams. On the other hand, Bayesian combination requires multiplying the estimates of prior probabilities and likelihoods. We show that, in principle, there is no discrepancy. The product of errors is a performance measure, and both human and machine performance may be consistent with this empirically established regularity. The product of probabilities is a step in an algorithm for achieving a level of performance that may or may not be consistent with the product of errors. The main problem is that most prior discussions failed to distinguish the performance measures from the estimates of the parameters used in the algorithm.",

keywords = "Pattern recognition, Speech recognition",

author = "Misha Pavel and Malcolm Slaney and Hynek Hermansky",

note = "Copyright: Copyright 2012 Elsevier B.V., All rights reserved.; 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2009 ; Conference date: 19-04-2009 Through 24-04-2009",

year = "2009",

doi = "10.1109/ICASSP.2009.4959922",

language = "English (US)",

isbn = "9781424423545",

series = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",

pages = "1669--1672",

booktitle = "2009 IEEE International Conference on Acoustics, Speech, and Signal Processing - Proceedings, ICASSP 2009",

}

TY - GEN

T1 - Reconciliation of human and machine speech recognition performance

AU - Pavel, Misha

AU - Slaney, Malcolm

AU - Hermansky, Hynek

PY - 2009

Y1 - 2009

N2 - This paper focuses on resolving a number of issues that arise when the performance of human speech recognition is compared to that of automatic speech recognition. In particular, human experimental data suggest that the resulting error is the product of the errors of the individual streams. On the other hand, Bayesian combination requires multiplying the estimates of prior probabilities and likelihoods. We show that, in principle, there is no discrepancy. The product of errors is a performance measure, and both human and machine performance may be consistent with this empirically established regularity. The product of probabilities is a step in an algorithm for achieving a level of performance that may or may not be consistent with the product of errors. The main problem is that most prior discussions failed to distinguish the performance measures from the estimates of the parameters used in the algorithm.

AB - This paper focuses on resolving a number of issues that arise when the performance of human speech recognition is compared to that of automatic speech recognition. In particular, human experimental data suggest that the resulting error is the product of the errors of the individual streams. On the other hand, Bayesian combination requires multiplying the estimates of prior probabilities and likelihoods. We show that, in principle, there is no discrepancy. The product of errors is a performance measure, and both human and machine performance may be consistent with this empirically established regularity. The product of probabilities is a step in an algorithm for achieving a level of performance that may or may not be consistent with the product of errors. The main problem is that most prior discussions failed to distinguish the performance measures from the estimates of the parameters used in the algorithm.

KW - Pattern recognition

KW - Speech recognition

UR - http://www.scopus.com/inward/record.url?scp=70349202182&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=70349202182&partnerID=8YFLogxK

U2 - 10.1109/ICASSP.2009.4959922

DO - 10.1109/ICASSP.2009.4959922

M3 - Conference contribution

AN - SCOPUS:70349202182

SN - 9781424423545

T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

SP - 1669

EP - 1672

BT - 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing - Proceedings, ICASSP 2009

T2 - 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2009

Y2 - 19 April 2009 through 24 April 2009

ER -
