Abstract
Information from speech recognition can be used in various ways in state-of-the-art speaker recognition systems. This includes the obvious use of recognized words to enable the use of text-dependent speaker modeling techniques when the words spoken are not given. Furthermore, it has been shown that the choice of words and phones itself can be a useful indicator of speaker identity. Also, recognizer output enables higher-level features, in particular those related to prosodic properties of speech. Finally, we discuss the use of mere byproducts of word recognition, such as subword unit alignments, pronunciations, and speaker adaptation transforms, to derive powerful nonstandard features for speaker modeling. We present specific techniques and results from SRI's NIST speaker recognition evaluation system.
Original language | English (US) |
---|---|
Title of host publication | Proceedings - SAFE 2007: Workshop on Signal Processing Applications for Public Security and Forensics |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
ISBN (Print) | 1424412269, 9781424412266 |
State | Published - 2007 |
Externally published | Yes |
Event | Workshop on Signal Processing Applications for Public Security and Forensics, SAFE 2007, Washington, United States; Duration: Apr 11 2007 → Apr 13 2007 |
Keywords
- High-level features
- Prosody
- Speaker adaptation
- Speaker recognition
- Speech recognition
ASJC Scopus subject areas
- Law
- Computer Vision and Pattern Recognition
- Signal Processing