Improvements to harmonic model for extracting better speech features in clinical applications

Meysam Asgari; Izhak Shafran

doi:10.1016/j.csl.2017.08.005

Institute on Development and Disability

Research output: Contribution to journal › Article › peer-review

8 Scopus citations

Abstract

Acoustic properties of speech samples can provide important cues in the assessment of voice pathology and cognitive function. The goal of this study is to develop novel algorithms for robust and accurate estimation of speech features and employ them to build probabilistic speech models for characterizing and analyzing clinical speech. Toward this goal, we adopt a harmonic model (HM) of speech. We overcome certain drawbacks of this model and introduce an improved version of HM that leads us to accurate and reliable estimation of voiced segments, fundamental frequency, HNR, jitter, and shimmer. We evaluate the performance of our improved HM in the context of voicing detection and pitch estimation with other state-of-the-art techniques on the Keele data set. Through extensive experiments on several noisy conditions, we demonstrate that the proposed improvements provide substantial gains over other popular methods under different noise levels and environments. Next, we investigate the utility of developed measures on the speech-based assessment of cognitive impairments including clinical depression and autism spectrum disorder (ASD). Our preliminary results on two clinical tasks demonstrate the promise of our improved HM features in practical applications.

Original language	English (US)
Pages (from-to)	298-313
Number of pages	16
Journal	Computer Speech and Language
Volume	47
DOIs	https://doi.org/10.1016/j.csl.2017.08.005
State	Published - Jan 2018

Keywords

Modified harmonic model
Pitch tracking
Voice activity detection

ASJC Scopus subject areas

Software
Theoretical Computer Science
Human-Computer Interaction

Cite this

@article{a34b1ec3d10b491ca98dc970e2dadeeb,
  title = "Improvements to harmonic model for extracting better speech features in clinical applications",
  abstract = "Acoustic properties of speech samples can provide important cues in the assessment of voice pathology and cognitive function. The goal of this study is to develop novel algorithms for robust and accurate estimation of speech features and employ them to build probabilistic speech models for characterizing and analyzing clinical speech. Toward this goal, we adopt a harmonic model (HM) of speech. We overcome certain drawbacks of this model and introduce an improved version of HM that leads us to accurate and reliable estimation of voiced segments, fundamental frequency, HNR, jitter, and shimmer. We evaluate the performance of our improved HM in the context of voicing detection and pitch estimation with other state-of-the-art techniques on the Keele data set. Through extensive experiments on several noisy conditions, we demonstrate that the proposed improvements provide substantial gains over other popular methods under different noise levels and environments. Next, we investigate the utility of developed measures on the speech-based assessment of cognitive impairments including clinical depression and autism spectrum disorder (ASD). Our preliminary results on two clinical tasks demonstrate the promise of our improved HM features in practical applications.",
  keywords = "Modified harmonic model, Pitch tracking, Voice activity detection",
  author = "Meysam Asgari and Izhak Shafran",
  note = "Publisher Copyright: {\textcopyright} 2018",
  year = "2018",
  month = jan,
  doi = "10.1016/j.csl.2017.08.005",
  language = "English (US)",
  volume = "47",
  pages = "298--313",
  journal = "Computer Speech and Language",
  issn = "0885-2308",
  publisher = "Academic Press Inc.",
}

TY  - JOUR
T1  - Improvements to harmonic model for extracting better speech features in clinical applications
AU  - Asgari, Meysam
AU  - Shafran, Izhak
PY  - 2018/1
Y1  - 2018/1
N2  - Acoustic properties of speech samples can provide important cues in the assessment of voice pathology and cognitive function. The goal of this study is to develop novel algorithms for robust and accurate estimation of speech features and employ them to build probabilistic speech models for characterizing and analyzing clinical speech. Toward this goal, we adopt a harmonic model (HM) of speech. We overcome certain drawbacks of this model and introduce an improved version of HM that leads us to accurate and reliable estimation of voiced segments, fundamental frequency, HNR, jitter, and shimmer. We evaluate the performance of our improved HM in the context of voicing detection and pitch estimation with other state-of-the-art techniques on the Keele data set. Through extensive experiments on several noisy conditions, we demonstrate that the proposed improvements provide substantial gains over other popular methods under different noise levels and environments. Next, we investigate the utility of developed measures on the speech-based assessment of cognitive impairments including clinical depression and autism spectrum disorder (ASD). Our preliminary results on two clinical tasks demonstrate the promise of our improved HM features in practical applications.
AB  - Acoustic properties of speech samples can provide important cues in the assessment of voice pathology and cognitive function. The goal of this study is to develop novel algorithms for robust and accurate estimation of speech features and employ them to build probabilistic speech models for characterizing and analyzing clinical speech. Toward this goal, we adopt a harmonic model (HM) of speech. We overcome certain drawbacks of this model and introduce an improved version of HM that leads us to accurate and reliable estimation of voiced segments, fundamental frequency, HNR, jitter, and shimmer. We evaluate the performance of our improved HM in the context of voicing detection and pitch estimation with other state-of-the-art techniques on the Keele data set. Through extensive experiments on several noisy conditions, we demonstrate that the proposed improvements provide substantial gains over other popular methods under different noise levels and environments. Next, we investigate the utility of developed measures on the speech-based assessment of cognitive impairments including clinical depression and autism spectrum disorder (ASD). Our preliminary results on two clinical tasks demonstrate the promise of our improved HM features in practical applications.
KW  - Modified harmonic model
KW  - Pitch tracking
KW  - Voice activity detection
UR  - http://www.scopus.com/inward/record.url?scp=85029516900&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85029516900&partnerID=8YFLogxK
U2  - 10.1016/j.csl.2017.08.005
DO  - 10.1016/j.csl.2017.08.005
M3  - Article
AN  - SCOPUS:85029516900
SN  - 0885-2308
VL  - 47
SP  - 298
EP  - 313
JO  - Computer Speech and Language
JF  - Computer Speech and Language
ER  -
