Knowledge Vault 3/8 - GTEC BCI & Neurotechnology Spring School 2024 - Day 1
Speech decoding and synthesis from intracranial signals
Dean Krusienski, Virginia Commonwealth University (USA)

Concept Graph & Resume using Claude 3 Opus | ChatGPT-4 | Llama 3:

graph LR
classDef krusienski fill:#f9d4d4, font-weight:bold, font-size:14px;
classDef ecog fill:#d4f9d4, font-weight:bold, font-size:14px;
classDef seeg fill:#d4d4f9, font-weight:bold, font-size:14px;
classDef decoding fill:#f9f9d4, font-weight:bold, font-size:14px;
classDef future fill:#f9d4f9, font-weight:bold, font-size:14px;
A[Dean Krusienski] --> B[Krusienski: speech decoding, synthesis from brain signals. 1]
A --> C[ECoG: cortical surface electrodes. 2]
A --> D[sEEG: deeper penetrating electrodes. 2]
A --> E[Gamma activity for speech decoding. 3]
A --> F[Goal: brain-actuated speech synthesizer. 4]
F --> G[Imagined speech challenges without acoustic measurements. 4]
A --> H[2016 study: ECoG during speech. 5]
A --> I[Herff, Angrick decoded ECoG speech. 6]
A --> J[sEEG decoding performance, electrode positioning questions. 7]
J --> K[Imagined speech decoding feasibility with sEEG. 7]
A --> L[Sadeghian: imagined speech electrodes also relevant for overt, mouthed speech. 8]
A --> M[Models trained on overt speech perform worse on imagined. 9]
A --> N[Models trained on imagined speech maintain overt, mouthed performance. 10]
A --> O[sEEG: depth information, consistent imagined speech activation. 11]
A --> P[Transformer model: above-chance accuracy on sEEG tasks. 12]
A --> Q[Brain activity differences: cued, spontaneous speech tasks. 13]
Q --> R[Frontal activity higher in spontaneous vs cued speech. 14]
A --> S[sEEG insights into prosody, affect, deeper structures. 15]
A --> T[Short trials, overt speech timing for imagined speech. 16]
A --> U[Balanced accuracy for classification evaluation. 17]
A --> V[Short trials limit imagined speech speed variations. 18]
V --> W[Dynamic time warping for imagined speech speed. 18]
A --> X[Future: integrate sEEG with high-resolution cortical recordings. 19]
A --> Y[sEEG accesses deeper regions vs micro-electrodes, micro-ECoG. 20]
A --> Z[Multi-institution collaborations crucial for unique datasets. 21]
A --> AA[Data sharing, open science important for progress. 22]
A --> AB[Inner speech decoding challenging, recent papers discuss approaches. 23]
A --> AC[Limit session durations, counterbalance trials for participant fatigue. 24]
A --> AD[Decoding accuracy depends on task, chance level performance. 25]
A --> AE[Dynamic time warping, alignment for imagined speech variations. 26]
A --> AF[sEEG studies speech in deeper inaccessible structures. 27]
A --> AG[Integrate sEEG with micro-ECoG, microelectrodes for comprehensive view. 28]
A --> AH[Models must account for overt, mouthed, imagined speech differences. 29]
A --> AI[sEEG imagined speech research advances BCI communication. 30]
class A,B krusienski;
class C,H,I ecog;
class D,J,K,L,O,AF,AG seeg;
class E,F,G,M,N,P,Q,R,S,T,U,V,W,AB,AD,AE,AH decoding;
class X,Y,Z,AA,AC,AI future;

Resume:

1.-Dean Krusienski presented work on speech decoding and synthesis from intracranial signals, focusing on electrocorticography (ECoG) and stereotactic EEG (sEEG).

2.-ECoG involves electrodes implanted on the cortical surface, while sEEG uses penetrating depth electrodes that reach deeper structures but sample the cortical surface more sparsely.

3.-Broadband gamma activity (70-200 Hz) is often used for speech decoding, showing robust correlations with sensory, motor, and cognitive processes.
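Extracting a broadband gamma feature typically means band-pass filtering each channel and taking the amplitude envelope. A minimal sketch of that pipeline, assuming a 1 kHz sampling rate and SciPy (the function name and parameters here are illustrative, not from the talk):

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def gamma_envelope(ecog, fs=1000.0, band=(70.0, 200.0)):
    """Broadband gamma amplitude envelope for one channel (illustrative)."""
    # 4th-order Butterworth band-pass, applied forward-backward (zero phase)
    sos = butter(4, band, btype="bandpass", fs=fs, output="sos")
    filtered = sosfiltfilt(sos, ecog)
    # Magnitude of the analytic signal = instantaneous amplitude envelope
    return np.abs(hilbert(filtered))

# Synthetic check: 1 s of low-amplitude noise with a 120 Hz burst in the middle
fs = 1000.0
t = np.arange(int(fs)) / fs
rng = np.random.default_rng(0)
signal = 0.1 * rng.standard_normal(t.size)
signal[400:600] += np.sin(2 * np.pi * 120.0 * t[400:600])  # in-band burst
env = gamma_envelope(signal, fs)
# The envelope should be clearly larger during the burst than outside it
print(env[450:550].mean() > env[:200].mean())
```

Real pipelines add per-channel normalization and artifact rejection, but the filter-then-envelope step is the core of the gamma feature.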

4.-The goal is a brain-actuated speech synthesizer by decoding speech from brain activity, but imagined speech poses challenges without acoustic measurements.

5.-A 2016 study characterized ECoG activity during continuous speech, revealing spatial-temporal patterns in frontal, motor, and auditory regions.

6.-Studies by Herff and Angrick decoded and synthesized speech from ECoG using different approaches like unit selection and neural networks.

7.-Transitioning from ECoG to sEEG raises questions about decoding performance, electrode positioning, and imagined speech decoding feasibility.

8.-A study by Sadeghian found a nested hierarchy where imagined speech-relevant electrodes were also relevant for mouthed and overt speech.

9.-Decoding models trained on overt speech performed worse on mouthed speech and even more poorly on imagined speech.

10.-Models trained on imagined speech maintained performance on overt and mouthed speech, possibly due to the nested hierarchy of relevant features.

11.-sEEG allows studying depth information, with imagined speech showing more consistent activation across depths compared to overt speech.

12.-A transformer model using unlabeled sEEG signals achieved above-chance accuracy on behavior recognition, speech detection, and word classification tasks.

13.-Recent work examines brain activity differences between cued speech recitation, visual scene description, and free-response spontaneous speech tasks.

14.-Frontal activity remained higher throughout spontaneous speech tasks compared to cued recitation, which dropped after speech onset.

15.-Despite challenges competing with micro-ECoG arrays, sEEG may provide insights into prosody, affect, and deeper speech/language structures.

16.-The data was collected with short trials and timing based on overt speech, assuming similar durations for imagined speech.

17.-Balanced accuracy was used for classification evaluation; it accounts for class proportions so that chance-level performance can be estimated fairly even with imbalanced classes.
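Balanced accuracy is just the mean of the per-class recalls, so its chance level is 1/n_classes regardless of class proportions. A self-contained sketch (not the speaker's actual evaluation code):

```python
import numpy as np

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall; chance level is 1/n_classes
    no matter how imbalanced the class proportions are."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    recalls = [np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]
    return float(np.mean(recalls))

# Imbalanced example: a degenerate classifier that always predicts "speech"
y_true = ["speech"] * 9 + ["rest"]
y_pred = ["speech"] * 10
print(balanced_accuracy(y_true, y_pred))  # 0.5, unlike the misleading 0.9 plain accuracy
```

This is why a high plain accuracy on imbalanced trial counts can mask a classifier that has learned nothing.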

18.-Potential variations in imagined speech speed were limited by using short trials, but could be addressed by dynamic time warping.

19.-The future of sEEG for speech decoding involves integrating high-resolution cortical recordings with unique information from deeper structures.

20.-While micro-electrodes and micro-ECoG arrays have advantages for speech envelope details, sEEG offers access to deeper brain regions.

21.-Collaborations across multiple institutions have been crucial for advancing this research using unique patient populations and recording opportunities.

22.-Data sharing and open science practices are important, with some datasets already publicly available and others planned for release.

23.-Inner speech decoding remains challenging, but recent papers discuss considerations, best practices, and potential applications for invasive and non-invasive approaches.

24.-Participant fatigue is mitigated by limiting session durations, counterbalancing trials, and leveraging patient engagement during hospital stays.

25.-Acceptable decoding accuracy depends on the specific task and chance level performance, which can be estimated using balanced accuracy metrics.

26.-Variations in imagined speech speed pose challenges, but can potentially be addressed through dynamic time warping or other alignment techniques.
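Dynamic time warping compares two sequences by finding the monotonic alignment that minimizes cumulative cost, which absorbs differences in speaking rate. A minimal textbook implementation (not code from the talk):

```python
import numpy as np

def dtw_distance(a, b):
    """Classic O(len(a)*len(b)) dynamic time warping distance for 1-D sequences."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)  # cumulative-cost matrix
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: diagonal match, insertion, deletion
            D[i, j] = cost + min(D[i - 1, j - 1], D[i - 1, j], D[i, j - 1])
    return D[n, m]

# A pattern and a time-stretched (slower) copy of it
fast = np.sin(np.linspace(0, 2 * np.pi, 20))
slow = np.sin(np.linspace(0, 2 * np.pi, 40))
# The warped distance to the stretched copy stays small compared to a
# genuinely different signal, e.g. the inverted pattern
print(dtw_distance(fast, slow) < dtw_distance(fast, -slow))
```

Applied to imagined-speech features, the same idea lets a slow imagined production be aligned to a faster overt template before comparison.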

27.-Stereotactic EEG provides unique opportunities to study speech-related activity in deeper structures inaccessible with surface recording arrays.

28.-Integrating sEEG with other modalities like micro-ECoG and microelectrodes may provide a more comprehensive view of speech and language networks.

29.-Decoding models must account for differences between overt, mouthed, and imagined speech representations to optimize performance and generalizability.

30.-Continued research on imagined speech decoding with sEEG can advance brain-computer interfaces for communication in individuals with speech impairments.

Knowledge Vault built by David Vivancos 2024