TY - JOUR
T1 - Improvements to harmonic model for extracting better speech features in clinical applications
AU - Asgari, Meysam
AU - Shafran, Izhak
N1 - Publisher Copyright: © 2018. Copyright 2017 Elsevier B.V., All rights reserved.
PY - 2018/1
Y1 - 2018/1
AB - Acoustic properties of speech samples can provide important cues in the assessment of voice pathology and cognitive function. The goal of this study is to develop novel algorithms for robust and accurate estimation of speech features and to employ them in building probabilistic speech models for characterizing and analyzing clinical speech. Toward this goal, we adopt a harmonic model (HM) of speech. We overcome certain drawbacks of this model and introduce an improved version of HM that yields accurate and reliable estimates of voiced segments, fundamental frequency, harmonic-to-noise ratio (HNR), jitter, and shimmer. We compare the performance of our improved HM in voicing detection and pitch estimation against other state-of-the-art techniques on the Keele data set. Through extensive experiments under several noisy conditions, we demonstrate that the proposed improvements provide substantial gains over other popular methods across different noise levels and environments. Next, we investigate the utility of the developed measures for speech-based assessment of cognitive impairments, including clinical depression and autism spectrum disorder (ASD). Our preliminary results on two clinical tasks demonstrate the promise of our improved HM features in practical applications.
KW - Modified harmonic model
KW - Pitch tracking
KW - Voice activity detection
UR - http://www.scopus.com/inward/record.url?scp=85029516900&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85029516900&partnerID=8YFLogxK
U2 - 10.1016/j.csl.2017.08.005
DO - 10.1016/j.csl.2017.08.005
M3 - Article
AN - SCOPUS:85029516900
VL - 47
SP - 298
EP - 313
JO - Computer Speech and Language
JF - Computer Speech and Language
SN - 0885-2308
ER -