Abstract
We describe a speaker tracking and detection system for Switchboard conversations. The system uses a hidden Markov model (HMM) with two speaker states and a silence state, a minimum state duration constraint, and Gaussian mixture model (GMM) state distributions adapted from a single gender- and handset-independent imposter model distribution. Speaker tracking segments the speakers for detection, which is carried out by averaging the frame scores along the Viterbi path and normalizing them with a novel parameter-interpolation extension of HNORM that handles files of arbitrary length. We also introduce duration statistics that augment the acoustic scores through a nonlinear combination function. Results are reported on the NIST 1998 Multispeaker development evaluation dataset.
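The tracking and scoring steps named in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes per-frame log scores are already available for three states (0 and 1 for the two speakers, 2 for silence), enforces the minimum duration by expanding each state into a chain of counter sub-states, and uses a fixed, hypothetical `switch_penalty` in place of trained transition probabilities. The function names are illustrative only, and HNORM and the duration-statistics combination are not covered.

```python
NEG_INF = float("-inf")


def min_duration_viterbi(frame_scores, min_dur, switch_penalty=-2.0):
    """Best state path in which a state is left only after min_dur frames.

    frame_scores[t][s] is the log score of frame t under state s.
    The last segment may be cut short by the end of the file.
    """
    n_states = len(frame_scores[0])
    n_exp = n_states * min_dur  # expanded index = s * min_dur + d
    delta = [NEG_INF] * n_exp
    for s in range(n_states):
        delta[s * min_dur] = frame_scores[0][s]  # every path starts at d = 0

    backptrs = []
    for t in range(1, len(frame_scores)):
        new = [NEG_INF] * n_exp
        bp = [-1] * n_exp
        for s in range(n_states):
            for d in range(min_dur):
                src = s * min_dur + d
                if delta[src] == NEG_INF:
                    continue
                # Stay in s and advance the duration counter (capped).
                dst = s * min_dur + min(d + 1, min_dur - 1)
                if delta[src] > new[dst]:
                    new[dst], bp[dst] = delta[src], src
                # Leave s only once the minimum duration has been met.
                if d == min_dur - 1:
                    for s2 in range(n_states):
                        if s2 == s:
                            continue
                        dst = s2 * min_dur
                        cand = delta[src] + switch_penalty
                        if cand > new[dst]:
                            new[dst], bp[dst] = cand, src
        # Add the emission score of frame t to every reachable sub-state.
        for j in range(n_exp):
            if new[j] != NEG_INF:
                new[j] += frame_scores[t][j // min_dur]
        delta = new
        backptrs.append(bp)

    # Backtrace from the best final sub-state.
    best = max(range(n_exp), key=lambda j: delta[j])
    path = [best]
    for bp in reversed(backptrs):
        path.append(bp[path[-1]])
    path.reverse()
    return [j // min_dur for j in path]


def average_path_scores(frame_scores, path):
    """Average, per state, the scores of the frames assigned to that state."""
    totals, counts = {}, {}
    for scores, s in zip(frame_scores, path):
        totals[s] = totals.get(s, 0.0) + scores[s]
        counts[s] = counts.get(s, 0) + 1
    return {s: totals[s] / counts[s] for s in totals}


# Toy input: speaker 0 talks for six frames (with one noisy frame that
# briefly favors speaker 1), then speaker 1 talks for three frames.
toy_scores = (
    [[0.0, -5.0, -5.0]] * 3
    + [[-1.0, 0.0, -5.0]]
    + [[0.0, -5.0, -5.0]] * 2
    + [[-5.0, 0.0, -5.0]] * 3
)
toy_path = min_duration_viterbi(toy_scores, min_dur=3)
toy_averages = average_path_scores(toy_scores, toy_path)
```

On the toy input, the one-frame glitch at frame 3 stays assigned to speaker 0, because switching to speaker 1 there would force at least three consecutive frames in that state. The per-state averages returned by `average_path_scores` are the kind of quantity a normalization step such as HNORM would then operate on.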
Original language | English (US) |
---|---|
Pages | 2219-2222 |
Number of pages | 4 |
State | Published - 1999 |
Externally published | Yes |
Event | 6th European Conference on Speech Communication and Technology, EUROSPEECH 1999 - Budapest, Hungary (Duration: Sep 5 1999 → Sep 9 1999) |
Conference
Conference | 6th European Conference on Speech Communication and Technology, EUROSPEECH 1999 |
---|---|
Country/Territory | Hungary |
City | Budapest |
Period | 9/5/99 → 9/9/99 |
ASJC Scopus subject areas
- Computer Science Applications
- Software
- Linguistics and Language
- Communication