Speech recognition as feature extraction for speaker recognition

A. Stolcke; E. Shriberg; L. Ferrer; S. Kajarekar; K. Sonmez; G. Tur

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Information from speech recognition can be used in various ways in state-of-the-art speaker recognition systems. This includes the obvious use of recognized words to enable the use of text-dependent speaker modeling techniques when the words spoken are not given. Furthermore, it has been shown that the choice of words and phones itself can be a useful indicator of speaker identity. Also, recognizer output enables higher-level features, in particular those related to prosodic properties of speech. Finally, we discuss the use of mere byproducts of word recognition, such as subword unit alignments, pronunciations, and speaker adaptation transforms to derive powerful nonstandard features for speaker modeling. We present specific techniques and results from SRI's NIST speaker recognition evaluation system.

Original language	English (US)
Title of host publication	Proceedings - SAFE 2007
Subtitle of host publication	Workshop on Signal Processing Applications for Public Security and Forensics
Publisher	Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic)	1424412269, 9781424412266
State	Published - 2007
Externally published	Yes
Event	Workshop on Signal Processing Applications for Public Security and Forensics, SAFE 2007 - Washington, United States; Duration: Apr 11 2007 → Apr 13 2007

Publication series

Name	Proceedings - SAFE 2007: Workshop on Signal Processing Applications for Public Security and Forensics

Other

Other	Workshop on Signal Processing Applications for Public Security and Forensics, SAFE 2007
Country/Territory	United States
City	Washington
Period	4/11/07 → 4/13/07

Keywords

High-level features
Prosody
Speaker adaptation
Speaker recognition
Speech recognition

ASJC Scopus subject areas

Law
Computer Vision and Pattern Recognition
Signal Processing

Cite this

Stolcke, A., Shriberg, E., Ferrer, L., Kajarekar, S., Sonmez, K., & Tur, G. (2007). Speech recognition as feature extraction for speaker recognition. In Proceedings - SAFE 2007: Workshop on Signal Processing Applications for Public Security and Forensics Article 4218939 (Proceedings - SAFE 2007: Workshop on Signal Processing Applications for Public Security and Forensics). Institute of Electrical and Electronics Engineers Inc.

Speech recognition as feature extraction for speaker recognition. / Stolcke, A.; Shriberg, E.; Ferrer, L. et al.
Proceedings - SAFE 2007: Workshop on Signal Processing Applications for Public Security and Forensics. Institute of Electrical and Electronics Engineers Inc., 2007. 4218939 (Proceedings - SAFE 2007: Workshop on Signal Processing Applications for Public Security and Forensics).

Stolcke, A, Shriberg, E, Ferrer, L, Kajarekar, S, Sonmez, K & Tur, G 2007, Speech recognition as feature extraction for speaker recognition. in Proceedings - SAFE 2007: Workshop on Signal Processing Applications for Public Security and Forensics, 4218939, Proceedings - SAFE 2007: Workshop on Signal Processing Applications for Public Security and Forensics, Institute of Electrical and Electronics Engineers Inc., Workshop on Signal Processing Applications for Public Security and Forensics, SAFE 2007, Washington, United States, 4/11/07.

Stolcke A, Shriberg E, Ferrer L, Kajarekar S, Sonmez K, Tur G. Speech recognition as feature extraction for speaker recognition. In Proceedings - SAFE 2007: Workshop on Signal Processing Applications for Public Security and Forensics. Institute of Electrical and Electronics Engineers Inc. 2007. 4218939. (Proceedings - SAFE 2007: Workshop on Signal Processing Applications for Public Security and Forensics).

Stolcke, A. ; Shriberg, E. ; Ferrer, L. et al. / Speech recognition as feature extraction for speaker recognition. Proceedings - SAFE 2007: Workshop on Signal Processing Applications for Public Security and Forensics. Institute of Electrical and Electronics Engineers Inc., 2007. (Proceedings - SAFE 2007: Workshop on Signal Processing Applications for Public Security and Forensics).

@inproceedings{1231b254646941b0a73b71039d87aaf8,
  title = "Speech recognition as feature extraction for speaker recognition",
  abstract = "Information from speech recognition can be used in various ways in state-of-the-art speaker recognition systems. This includes the obvious use of recognized words to enable the use of text-dependent speaker modeling techniques when the words spoken are not given. Furthermore, it has been shown that the choice of words and phones itself can be a useful indicator of speaker identity. Also, recognizer output enables higher-level features, in particular those related to prosodic properties of speech. Finally, we discuss the use of mere byproducts of word recognition, such as subword unit alignments, pronunciations, and speaker adaptation transforms to derive powerful nonstandard features for speaker modeling. We present specific techniques and results from SRI's NIST speaker recognition evaluation system.",
  keywords = "High-level features, Prosody, Speaker adaptation, Speaker recognition, Speech recognition",
  author = "A. Stolcke and E. Shriberg and L. Ferrer and S. Kajarekar and K. Sonmez and G. Tur",
  note = "Publisher Copyright: {\textcopyright} 2007 IEEE.; Workshop on Signal Processing Applications for Public Security and Forensics, SAFE 2007 ; Conference date: 11-04-2007 Through 13-04-2007",
  year = "2007",
  language = "English (US)",
  series = "Proceedings - SAFE 2007: Workshop on Signal Processing Applications for Public Security and Forensics",
  publisher = "Institute of Electrical and Electronics Engineers Inc.",
  booktitle = "Proceedings - SAFE 2007",
}

TY - GEN
T1 - Speech recognition as feature extraction for speaker recognition
AU - Stolcke, A.
AU - Shriberg, E.
AU - Ferrer, L.
AU - Kajarekar, S.
AU - Sonmez, K.
AU - Tur, G.
PY - 2007
Y1 - 2007
N2 - Information from speech recognition can be used in various ways in state-of-the-art speaker recognition systems. This includes the obvious use of recognized words to enable the use of text-dependent speaker modeling techniques when the words spoken are not given. Furthermore, it has been shown that the choice of words and phones itself can be a useful indicator of speaker identity. Also, recognizer output enables higher-level features, in particular those related to prosodic properties of speech. Finally, we discuss the use of mere byproducts of word recognition, such as subword unit alignments, pronunciations, and speaker adaptation transforms to derive powerful nonstandard features for speaker modeling. We present specific techniques and results from SRI's NIST speaker recognition evaluation system.
AB - Information from speech recognition can be used in various ways in state-of-the-art speaker recognition systems. This includes the obvious use of recognized words to enable the use of text-dependent speaker modeling techniques when the words spoken are not given. Furthermore, it has been shown that the choice of words and phones itself can be a useful indicator of speaker identity. Also, recognizer output enables higher-level features, in particular those related to prosodic properties of speech. Finally, we discuss the use of mere byproducts of word recognition, such as subword unit alignments, pronunciations, and speaker adaptation transforms to derive powerful nonstandard features for speaker modeling. We present specific techniques and results from SRI's NIST speaker recognition evaluation system.
KW - High-level features
KW - Prosody
KW - Speaker adaptation
KW - Speaker recognition
KW - Speech recognition
UR - http://www.scopus.com/inward/record.url?scp=84969260568&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84969260568&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84969260568
T3 - Proceedings - SAFE 2007: Workshop on Signal Processing Applications for Public Security and Forensics
BT - Proceedings - SAFE 2007
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - Workshop on Signal Processing Applications for Public Security and Forensics, SAFE 2007
Y2 - 11 April 2007 through 13 April 2007
ER -
