TY - GEN
T1 - Towards direct speech synthesis from ECoG
T2 - 38th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC 2016
AU - Herff, Christian
AU - Johnson, Garett
AU - Diener, Lorenz
AU - Shih, Jerry
AU - Krusienski, Dean
AU - Schultz, Tanja
N1 - Publisher Copyright:
© 2016 IEEE.
PY - 2016/10/13
Y1 - 2016/10/13
N2 - Most current Brain-Computer Interfaces (BCIs) achieve high information transfer rates using spelling paradigms based on stimulus-evoked potentials. Despite the success of these interfaces, this mode of communication can be cumbersome and unnatural. Direct synthesis of speech from neural activity represents a more natural mode of communication that would enable users to convey verbal messages in real-time. In this pilot study with one participant, we demonstrate that electrocorticographic (ECoG) intracranial activity from temporal areas can be used to resynthesize speech in real-time. This is accomplished by reconstructing the audio magnitude spectrogram from neural activity and subsequently creating the audio waveform from these reconstructed spectrograms. We show that significant correlations between the original and reconstructed spectrograms and temporal waveforms can be achieved. While this pilot study uses audibly spoken speech for the models, it represents a first step towards speech synthesis from speech imagery.
AB - Most current Brain-Computer Interfaces (BCIs) achieve high information transfer rates using spelling paradigms based on stimulus-evoked potentials. Despite the success of these interfaces, this mode of communication can be cumbersome and unnatural. Direct synthesis of speech from neural activity represents a more natural mode of communication that would enable users to convey verbal messages in real-time. In this pilot study with one participant, we demonstrate that electrocorticographic (ECoG) intracranial activity from temporal areas can be used to resynthesize speech in real-time. This is accomplished by reconstructing the audio magnitude spectrogram from neural activity and subsequently creating the audio waveform from these reconstructed spectrograms. We show that significant correlations between the original and reconstructed spectrograms and temporal waveforms can be achieved. While this pilot study uses audibly spoken speech for the models, it represents a first step towards speech synthesis from speech imagery.
UR - http://www.scopus.com/inward/record.url?scp=85009089427&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85009089427&partnerID=8YFLogxK
U2 - 10.1109/EMBC.2016.7591004
DO - 10.1109/EMBC.2016.7591004
M3 - Conference contribution
AN - SCOPUS:85009089427
T3 - Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS
SP - 1540
EP - 1543
BT - 2016 38th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC 2016
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 16 August 2016 through 20 August 2016
ER -