Abstract
We study a nonlinear discriminant analysis (NLDA) technique that extracts a speaker-discriminant feature set. Our approach is to train a multilayer perceptron (MLP) to maximize the separation between speakers by nonlinearly projecting a large set of acoustic features (e.g., several frames) to a lower-dimensional feature set. The extracted features are optimized to discriminate between speakers and to be robust to mismatched training and testing conditions. We train the MLP on a development set and apply it to the training and testing utterances. Our results show that combining the NLDA-based system with a state-of-the-art cepstrum-based system improves speaker verification performance on the 1997 NIST Speaker Recognition Evaluation set by 15% on average compared with our cepstrum-only system.
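The idea of projecting several stacked frames through an MLP trained on speaker labels can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the single tanh bottleneck layer, the layer sizes, the plain gradient-descent training loop, the synthetic two-speaker data, and the names `stack_frames`, `train_mlp`, and `extract_features`.

```python
import numpy as np


def stack_frames(frames, context):
    """Concatenate each frame with `context` neighbours on each side.

    Edges are padded by repeating the first and last frame.
    Input shape (T, D); output shape (T, D * (2 * context + 1)).
    """
    padded = np.pad(frames, ((context, context), (0, 0)), mode="edge")
    n = frames.shape[0]
    return np.hstack([padded[i:i + n] for i in range(2 * context + 1)])


def train_mlp(x, labels, n_speakers, bottleneck_dim, epochs=200, lr=0.5, seed=0):
    """Train input -> tanh bottleneck -> softmax over speakers (cross-entropy).

    Returns only the first-layer parameters, which define the projection.
    """
    rng = np.random.default_rng(seed)
    w1 = rng.normal(0.0, 0.1, (x.shape[1], bottleneck_dim))
    b1 = np.zeros(bottleneck_dim)
    w2 = rng.normal(0.0, 0.1, (bottleneck_dim, n_speakers))
    b2 = np.zeros(n_speakers)
    targets = np.eye(n_speakers)[labels]  # one-hot speaker labels
    for _ in range(epochs):
        h = np.tanh(x @ w1 + b1)                      # bottleneck activations
        logits = h @ w2 + b2
        logits = logits - logits.max(axis=1, keepdims=True)
        p = np.exp(logits)
        p = p / p.sum(axis=1, keepdims=True)          # speaker posteriors
        d_logits = (p - targets) / len(x)             # cross-entropy gradient
        d_h = (d_logits @ w2.T) * (1.0 - h ** 2)      # backprop through tanh
        w2 -= lr * (h.T @ d_logits)
        b2 -= lr * d_logits.sum(axis=0)
        w1 -= lr * (x.T @ d_h)
        b1 -= lr * d_h.sum(axis=0)
    return w1, b1


def extract_features(x, w1, b1):
    """Project stacked frames to the low-dimensional bottleneck features."""
    return np.tanh(x @ w1 + b1)


# Synthetic "development set": two hypothetical speakers, 50 frames each,
# 4 acoustic coefficients per frame, drawn around different means.
rng = np.random.default_rng(1)
spk_a = rng.normal(+0.5, 1.0, (50, 4))
spk_b = rng.normal(-0.5, 1.0, (50, 4))

# Stack each speaker's frames separately so context never crosses speakers.
x_dev = np.vstack([stack_frames(spk_a, 2), stack_frames(spk_b, 2)])
y_dev = np.array([0] * 50 + [1] * 50)

w1, b1 = train_mlp(x_dev, y_dev, n_speakers=2, bottleneck_dim=3)
features = extract_features(x_dev, w1, b1)  # shape (100, 3)
```

In this sketch the softmax layer is used only during training. The bottleneck activations are the extracted features, and the same `extract_features` projection would then be applied to training and testing utterances before a separate verification back end.
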
Original language | English (US) |
---|---|
Pages | 72-75 |
Number of pages | 4 |
State | Published - 2020 |
Event | Workshop on Speaker Recognition and its Commercial and Forensic Applications, RLA2C 1998 - Avignon, France; Duration: Apr 20 1998 → Apr 23 1998 |
Conference
Conference | Workshop on Speaker Recognition and its Commercial and Forensic Applications, RLA2C 1998 |
---|---|
Country/Territory | France |
City | Avignon |
Period | 4/20/98 → 4/23/98 |
ASJC Scopus subject areas
- Human-Computer Interaction
- Signal Processing
- Software