Refining automatic speech recognition system for older adults

Liu Chen, Meysam Asgari

Research output: Contribution to journal › Conference article › peer-review

Abstract

Building a high-quality automatic speech recognition (ASR) system with limited training data is challenging, particularly for a narrow target population. Open-source ASR systems, trained on ample data from adults, degrade on seniors’ speech because of the acoustic mismatch between adults and seniors. With 12 hours of training data, we attempt to develop an ASR system for socially isolated seniors (80+ years old) with possible cognitive impairments. We show experimentally that ASR built for the adult population performs poorly on our target population and that transfer learning (TL) can boost the system’s performance. Building on the fundamental idea of TL, tuning model parameters, we further improve the system by using an attention mechanism to exploit the model’s intermediate information. Our approach achieves a 1.58% absolute improvement over the TL model.

Original language: English (US)
Pages (from-to): 7003-7007
Number of pages: 5
Journal: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2021-June
State: Published - 2021
Event: 2021 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2021 - Virtual, Toronto, Canada
Duration: Jun 6 2021 to Jun 11 2021

Keywords

  • Attention mechanism
  • Automatic speech recognition
  • Senior population
  • Small training data
  • Transfer learning

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
