Reconstructing speech from human auditory cortex

Brian N. Pasley, Stephen David, Nima Mesgarani, Adeen Flinker, Shihab A. Shamma, Nathan E. Crone, Robert T. Knight, Edward F. Chang

Research output: Contribution to journal (Article)

279 Citations (Scopus)

Abstract

How the human auditory system extracts perceptually relevant acoustic features of speech is unknown. To address this question, we used intracranial recordings from nonprimary auditory cortex in the human superior temporal gyrus to determine what acoustic information in speech sounds can be reconstructed from population neural activity. We found that slow and intermediate temporal fluctuations, such as those corresponding to syllable rate, were accurately reconstructed using a linear model based on the auditory spectrogram. However, reconstruction of fast temporal fluctuations, such as syllable onsets and offsets, required a nonlinear sound representation based on temporal modulation energy. Reconstruction accuracy was highest within the range of spectro-temporal fluctuations that have been found to be critical for speech intelligibility. The decoded speech representations allowed readout and identification of individual words directly from brain activity during single trial sound presentations. These findings reveal neural encoding mechanisms of speech acoustic parameters in higher order human auditory cortex.
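The abstract describes two steps that a small example can make concrete: a linear model that maps population neural activity back to a spectrogram, and identification of a word from the reconstruction. It does not give the model equations, so the following is only a minimal sketch on synthetic data, not the authors' implementation. Every name and number in it is a hypothetical choice made for illustration (`fit_linear_decoder`, `simulate_response`, the array sizes, the ridge penalty, and the noise level). It also assumes an instantaneous mapping with no time lags, and a plain correlation score for identification.

```python
import random

rng = random.Random(0)      # fixed seed so the toy data are repeatable
N_FREQ, N_ELEC = 4, 8       # hypothetical sizes, far smaller than real recordings

def matmul(A, B):
    """Plain-Python matrix product of two matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [ra[:] + rb[:] for ra, rb in zip(A, B)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [v - f * w for v, w in zip(M[r], M[c])]
    return [row[n:] for row in M]

def fit_linear_decoder(R, S, ridge=1e-3):
    """Ridge-regularised least squares: G = (R^T R + ridge * I)^-1 R^T S."""
    Rt = [list(col) for col in zip(*R)]
    gram = matmul(Rt, R)
    for i in range(len(gram)):
        gram[i][i] += ridge
    return solve(gram, matmul(Rt, S))

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def flatten(S):
    """Concatenate all time-frequency bins into one vector."""
    return [v for row in S for v in row]

def random_spectrogram(n_time):
    """Synthetic stand-in for an auditory spectrogram (time x frequency)."""
    return [[rng.random() for _ in range(N_FREQ)] for _ in range(n_time)]

# Assumed forward model: each electrode is a noisy linear mix of frequency bands.
mixing = [[rng.gauss(0, 1) for _ in range(N_ELEC)] for _ in range(N_FREQ)]

def simulate_response(S, noise=0.05):
    """Simulated neural activity (time x electrodes) for spectrogram S."""
    return [[v + rng.gauss(0, noise) for v in row] for row in matmul(S, mixing)]

# Fit the decoder on training sound, then reconstruct a held-out "word".
train_S = random_spectrogram(200)
decoder = fit_linear_decoder(simulate_response(train_S), train_S)

words = [random_spectrogram(40) for _ in range(3)]   # candidate word spectrograms
true_word = 1
reconstruction = matmul(simulate_response(words[true_word]), decoder)

# Identify the word whose spectrogram correlates best with the reconstruction.
scores = [pearson(flatten(reconstruction), flatten(w)) for w in words]
identified = max(range(len(words)), key=lambda i: scores[i])
print(identified, [round(s, 2) for s in scores])
```

The decoder here is fit in the stimulus-reconstruction direction (neural activity to sound), which matches the abstract's framing. The nonlinear modulation-energy representation that the abstract reports as necessary for fast temporal fluctuations is not modelled in this sketch.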

Original language: English (US)
Article number: e1001251
Journal: PLoS Biology
Volume: 10
Issue number: 1
DOIs: https://doi.org/10.1371/journal.pbio.1001251
State: Published - Jan 2012
Externally published: Yes

Fingerprint

Auditory Cortex
cortex
Speech Acoustics
Acoustics
Acoustic waves
Speech Intelligibility
Phonetics
Temporal Lobe
Linear Models
Brain
Modulation
Population
energy
extracts

ASJC Scopus subject areas

  • Agricultural and Biological Sciences(all)
  • Biochemistry, Genetics and Molecular Biology(all)
  • Immunology and Microbiology(all)
  • Neuroscience(all)

Cite this

Pasley, B. N., David, S., Mesgarani, N., Flinker, A., Shamma, S. A., Crone, N. E., ... Chang, E. F. (2012). Reconstructing speech from human auditory cortex. PLoS Biology, 10(1), [e1001251]. https://doi.org/10.1371/journal.pbio.1001251

@article{4bbbd93a4aa04a6abb45973ce795cdc5,
title = "Reconstructing speech from human auditory cortex",
abstract = "How the human auditory system extracts perceptually relevant acoustic features of speech is unknown. To address this question, we used intracranial recordings from nonprimary auditory cortex in the human superior temporal gyrus to determine what acoustic information in speech sounds can be reconstructed from population neural activity. We found that slow and intermediate temporal fluctuations, such as those corresponding to syllable rate, were accurately reconstructed using a linear model based on the auditory spectrogram. However, reconstruction of fast temporal fluctuations, such as syllable onsets and offsets, required a nonlinear sound representation based on temporal modulation energy. Reconstruction accuracy was highest within the range of spectro-temporal fluctuations that have been found to be critical for speech intelligibility. The decoded speech representations allowed readout and identification of individual words directly from brain activity during single trial sound presentations. These findings reveal neural encoding mechanisms of speech acoustic parameters in higher order human auditory cortex.",
author = "Pasley, {Brian N.} and Stephen David and Nima Mesgarani and Adeen Flinker and Shamma, {Shihab A.} and Crone, {Nathan E.} and Knight, {Robert T.} and Chang, {Edward F.}",
year = "2012",
month = "1",
doi = "10.1371/journal.pbio.1001251",
language = "English (US)",
volume = "10",
journal = "PLoS Biology",
issn = "1544-9173",
publisher = "Public Library of Science",
number = "1",

}

TY - JOUR

T1 - Reconstructing speech from human auditory cortex

AU - Pasley, Brian N.

AU - David, Stephen

AU - Mesgarani, Nima

AU - Flinker, Adeen

AU - Shamma, Shihab A.

AU - Crone, Nathan E.

AU - Knight, Robert T.

AU - Chang, Edward F.

PY - 2012/1

Y1 - 2012/1

AB - How the human auditory system extracts perceptually relevant acoustic features of speech is unknown. To address this question, we used intracranial recordings from nonprimary auditory cortex in the human superior temporal gyrus to determine what acoustic information in speech sounds can be reconstructed from population neural activity. We found that slow and intermediate temporal fluctuations, such as those corresponding to syllable rate, were accurately reconstructed using a linear model based on the auditory spectrogram. However, reconstruction of fast temporal fluctuations, such as syllable onsets and offsets, required a nonlinear sound representation based on temporal modulation energy. Reconstruction accuracy was highest within the range of spectro-temporal fluctuations that have been found to be critical for speech intelligibility. The decoded speech representations allowed readout and identification of individual words directly from brain activity during single trial sound presentations. These findings reveal neural encoding mechanisms of speech acoustic parameters in higher order human auditory cortex.

UR - http://www.scopus.com/inward/record.url?scp=84856467482&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84856467482&partnerID=8YFLogxK

U2 - 10.1371/journal.pbio.1001251

DO - 10.1371/journal.pbio.1001251

M3 - Article

C2 - 22303281

AN - SCOPUS:84856467482

VL - 10

JO - PLoS Biology

JF - PLoS Biology

SN - 1544-9173

IS - 1

M1 - e1001251

ER -