Nonlinear discriminant feature extraction for robust text-independent speaker recognition

Yochai Konig; Larry Heck; Mitch Weintraub; Kemal Sonmez

Research output: Contribution to conference › Paper › peer-review

Abstract

We study a nonlinear discriminant analysis (NLDA) technique that extracts a speaker-discriminant feature set. Our approach is to train a multilayer perceptron (MLP) to maximize the separation between speakers by nonlinearly projecting a large set of acoustic features (e.g., several frames) to a lower-dimensional feature set. The extracted features are optimized to discriminate between speakers and to be robust to mismatched training and testing conditions. We train the MLP on a development set and apply it to the training and testing utterances. Our results show that by combining the NLDA-based system with a state-of-the-art cepstrum-based system, we improve the speaker verification performance on the 1997 NIST Speaker Recognition Evaluation set by 15% on average compared with our cepstrum-only system.
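
The projection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the layer sizes, the linear bottleneck, the tanh hidden layer, the plain gradient-descent training loop, the synthetic "cepstral" data, and all function names (stack_frames, train_bottleneck_mlp, extract_features) are assumptions chosen for illustration. The sketch stacks several neighbouring frames into one input vector, trains a small MLP to classify speakers, and then reads the low-dimensional bottleneck activations as the extracted features.

```python
import numpy as np

def stack_frames(frames, context):
    """Concatenate each frame with `context` neighbours on either side."""
    n_frames = frames.shape[0]
    padded = np.pad(frames, ((context, context), (0, 0)), mode="edge")
    return np.hstack([padded[i:i + n_frames] for i in range(2 * context + 1)])

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_bottleneck_mlp(X, y, n_speakers, n_hidden=16, n_bottleneck=3,
                         lr=0.1, epochs=300, seed=0):
    """Train input -> tanh hidden -> linear bottleneck -> softmax over speakers.

    All sizes and the optimizer are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.1, (X.shape[1], n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, n_bottleneck)); b2 = np.zeros(n_bottleneck)
    W3 = rng.normal(0.0, 0.1, (n_bottleneck, n_speakers)); b3 = np.zeros(n_speakers)
    targets = np.eye(n_speakers)[y]  # one-hot speaker labels
    losses = []
    for _ in range(epochs):
        # Forward pass
        H = np.tanh(X @ W1 + b1)
        B = H @ W2 + b2
        P = softmax(B @ W3 + b3)
        losses.append(float(-np.mean(np.log(P[np.arange(len(y)), y] + 1e-12))))
        # Backward pass for the mean cross-entropy loss
        dZ = (P - targets) / len(y)
        dB = dZ @ W3.T
        dH = (dB @ W2.T) * (1.0 - H ** 2)
        # Full-batch gradient-descent updates
        W3 -= lr * (B.T @ dZ); b3 -= lr * dZ.sum(axis=0)
        W2 -= lr * (H.T @ dB); b2 -= lr * dB.sum(axis=0)
        W1 -= lr * (X.T @ dH); b1 -= lr * dH.sum(axis=0)
    return (W1, b1, W2, b2), losses

def extract_features(X, params):
    """Project stacked frames to the bottleneck layer (the reduced feature set)."""
    W1, b1, W2, b2 = params
    return np.tanh(X @ W1 + b1) @ W2 + b2

# Synthetic stand-in for a development set: 3 hypothetical speakers,
# 40 frames each, 5 coefficients per frame, 2 context frames per side.
rng = np.random.default_rng(1)
n_speakers, frames_per_speaker, n_coeffs, context = 3, 40, 5, 2
X_parts, y_parts = [], []
for spk in range(n_speakers):
    utt = rng.normal(loc=spk - 1.0, scale=0.5, size=(frames_per_speaker, n_coeffs))
    X_parts.append(stack_frames(utt, context))  # stack within each utterance
    y_parts.append(np.full(frames_per_speaker, spk))
X_dev = np.vstack(X_parts)
y_dev = np.concatenate(y_parts)

params, losses = train_bottleneck_mlp(X_dev, y_dev, n_speakers)
feats = extract_features(X_dev, params)  # 25-dim stacked input -> 3-dim features
```

In the workflow the abstract describes, a function like extract_features would then be applied to both the training and the testing utterances, and the resulting features passed to a speaker verification system alongside the cepstrum-based one.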

Original language	English (US)
Pages	72-75
Number of pages	4
State	Published - 2020
Event	Workshop on Speaker Recognition and its Commercial and Forensic Applications, RLA2C 1998, Avignon, France. Duration: Apr 20 1998 → Apr 23 1998

Conference

Conference	Workshop on Speaker Recognition and its Commercial and Forensic Applications, RLA2C 1998
Country/Territory	France
City	Avignon
Period	4/20/98 → 4/23/98

ASJC Scopus subject areas

Human-Computer Interaction
Signal Processing
Software

Cite this

@conference{1f0ae48d02f44db69a90845699902f9d,

title = "Nonlinear discriminant feature extraction for robust text-independent speaker recognition",

abstract = "We study a nonlinear discriminant analysis (NLDA) technique that extracts a speaker-discriminant feature set. Our approach is to train a multilayer perceptron (MLP) to maximize the separation between speakers by nonlinearly projecting a large set of acoustic features (e.g., several frames) to a lower-dimensional feature set. The extracted features are optimized to discriminate between speakers and to be robust to mismatched training and testing conditions. We train the MLP on a development set and apply it to the training and testing utterances. Our results show that by combining the NLDA-based system with a state-of-the-art cepstrum-based system, we improve the speaker verification performance on the 1997 NIST Speaker Recognition Evaluation set by 15% on average compared with our cepstrum-only system.",

author = "Yochai Konig and Larry Heck and Mitch Weintraub and Kemal Sonmez",

note = "Publisher Copyright: Copyright {\textcopyright} RLA2C 1998 - Speaker Recognition and its Commercial and Forensic Applications. All rights reserved.; Workshop on Speaker Recognition and its Commercial and Forensic Applications, RLA2C 1998; Conference date: 20-04-1998 Through 23-04-1998",

year = "2020",

language = "English (US)",

pages = "72--75",

}

TY - CONF

T1 - Nonlinear discriminant feature extraction for robust text-independent speaker recognition

AU - Konig, Yochai

AU - Heck, Larry

AU - Weintraub, Mitch

AU - Sonmez, Kemal

PY - 2020

Y1 - 2020

N2 - We study a nonlinear discriminant analysis (NLDA) technique that extracts a speaker-discriminant feature set. Our approach is to train a multilayer perceptron (MLP) to maximize the separation between speakers by nonlinearly projecting a large set of acoustic features (e.g., several frames) to a lower-dimensional feature set. The extracted features are optimized to discriminate between speakers and to be robust to mismatched training and testing conditions. We train the MLP on a development set and apply it to the training and testing utterances. Our results show that by combining the NLDA-based system with a state-of-the-art cepstrum-based system, we improve the speaker verification performance on the 1997 NIST Speaker Recognition Evaluation set by 15% on average compared with our cepstrum-only system.

AB - We study a nonlinear discriminant analysis (NLDA) technique that extracts a speaker-discriminant feature set. Our approach is to train a multilayer perceptron (MLP) to maximize the separation between speakers by nonlinearly projecting a large set of acoustic features (e.g., several frames) to a lower-dimensional feature set. The extracted features are optimized to discriminate between speakers and to be robust to mismatched training and testing conditions. We train the MLP on a development set and apply it to the training and testing utterances. Our results show that by combining the NLDA-based system with a state-of-the-art cepstrum-based system, we improve the speaker verification performance on the 1997 NIST Speaker Recognition Evaluation set by 15% on average compared with our cepstrum-only system.

UR - http://www.scopus.com/inward/record.url?scp=85031612718&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85031612718&partnerID=8YFLogxK

M3 - Paper

AN - SCOPUS:85031612718

SP - 72

EP - 75

T2 - Workshop on Speaker Recognition and its Commercial and Forensic Applications, RLA2C 1998

Y2 - 20 April 1998 through 23 April 1998

ER -
