EEG-GPT: Exploring Capabilities of Large Language Models for EEG Classification and Interpretation

Abstract number : 3.176
Submission category : 3. Neurophysiology / 3G. Computational Analysis & Modeling of EEG
Year : 2023
Submission ID : 1138
Source : www.aesnet.org
Presentation date : 12/4/2023 12:00:00 AM
Published date :

Authors :
Presenting Author: Jonathan Kim, BS – University of Illinois Urbana-Champaign

Danilo Bernardo, MD – University of California San Francisco

Rationale:
Traditional machine learning (ML) applications in EEG have been narrow in scope, concentrating on distinct phenomena across disparate temporal scales (from transient spikes lasting milliseconds to seizures lasting minutes) and spatial scales (from localized high-frequency oscillations to global sleep activity). This siloed approach limits the development of interpretable and trustworthy EEG ML models that exhibit multi-scale electrophysiological understanding and classification capabilities. We therefore propose EEG-GPT, a unifying approach to EEG classification that leverages advances in large language models (LLMs). LLMs have recently outperformed other deep learning models in multiple domains, particularly in classification tasks with limited data, an advantage well suited to sparsely annotated EEG data. We hypothesized that an LLM-based system can provide accurate EEG classification and interpretation in a verifiable, step-by-step manner.
Methods:
We developed EEG-GPT, a fine-tuned OpenAI generative pretrained transformer (GPT) base model that uses chain-of-thought reasoning (Wei et al.) and has access to specialist ML tools for the detection of spikes, seizures, and slowing, in addition to quantitative EEG (QEEG) analysis (Figure 1). We evaluated its performance on the Temple University Abnormal EEG Corpus (TUAB), comprising 2,329 subjects with a total of 2,993 hours of annotated EEG data (Obeid et al.). We used a few-shot approach, fine-tuning on only 100 EEG records (~2% of the training data). We report AUROC on the standard TUAB evaluation set (553 records) and demonstrate EEG-GPT's ability to provide step-by-step reasoning and to use EEG tools.
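As a concrete illustration of the evaluation metric used above, the sketch below computes AUROC for a binary normal-vs-abnormal classifier from per-record scores. This is not the authors' code: the record labels and scores are toy values, and the rank-based formulation is a generic way to compute AUROC, shown here only to make the reported metric explicit.

```python
def auroc(labels, scores):
    """AUROC for binary labels (1 = abnormal, 0 = normal).

    Equals the probability that a randomly chosen abnormal record
    receives a higher score than a randomly chosen normal record
    (ties count as half a win).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example (made-up scores, not the paper's data): here every
# abnormal record outscores every normal one, so AUROC is 1.0.
labels = [1, 0, 1, 0, 1, 0]
scores = [0.9, 0.2, 0.7, 0.4, 0.8, 0.1]
print(auroc(labels, scores))  # -> 1.0
```

An AUROC of 0.87 on the 553-record evaluation set, as reported below, means an abnormal record outscores a normal one in roughly 87% of such pairings.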

Results:
EEG-GPT demonstrated excellent performance on the TUAB evaluation dataset while utilizing only 2% of the TUAB training data, achieving an AUROC of 0.87 (Figure 2). Despite the sparse training data, this is comparable to the AUROC of 0.90 achieved by convolutional neural networks (Wagh et al.). The following is an example conversational stream from EEG-GPT when asked to interpret whether an EEG file is normal or abnormal.

Thought: I need to use the available tools to determine whether the EEG is normal or abnormal.

Action: Run convolutional neural network seizure detection agent. 

Observation: The seizure detector did not find any times with high chance of seizure.

Thought: I need to use the QEEG feature similarity tool.

Action: Generate QEEG features and compare to normative QEEG values.

Observation: QEEG features are within the normative range, with a >99.5% similarity score.

         [… EEG-GPT goes on to access its EEG tools and parses their results…]

Final Answer: The EEG is normal, as there were no seizures detected, no spikes detected, no slowing detected, and the QEEG feature similarity indicated readings within normative range.
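The Thought/Action/Observation stream above can be sketched as a simple tool-use loop. The sketch below is a minimal, hypothetical stand-in, not the authors' implementation: the two tool functions are placeholders for the specialist detectors and QEEG comparison described in the Methods, and the 99.5% similarity threshold is taken from the example trace for illustration only.

```python
def seizure_detector(record):
    # Hypothetical specialist tool: returns detected seizure intervals.
    # In this toy record, no seizures are found (empty list).
    return []

def qeeg_similarity(record):
    # Hypothetical QEEG comparison against normative values,
    # returning a similarity score in [0, 1].
    return 0.997

TOOLS = {"seizure_detector": seizure_detector,
         "qeeg_similarity": qeeg_similarity}

def run_agent(record):
    """Iterate over the tools (Action), collect their results
    (Observation), then form a Final Answer from the observations."""
    observations = {}
    for name, tool in TOOLS.items():       # Action: invoke each specialist
        observations[name] = tool(record)  # Observation: record its result
    # Final Answer: normal if no seizures were detected and QEEG
    # features fall within the normative range (>99.5% similarity).
    if not observations["seizure_detector"] and observations["qeeg_similarity"] > 0.995:
        return "normal", observations
    return "abnormal", observations

label, obs = run_agent({"id": "toy_record"})
print(label)  # with the toy tools above, this prints "normal"
```

In the actual system, an LLM chooses which tool to call at each step and verbalizes its reasoning (the "Thought" lines); the fixed loop here only illustrates the Action/Observation bookkeeping.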



Conclusions:
EEG-GPT achieves performance comparable to current state-of-the-art deep learning methods in distinguishing normal from abnormal EEG while utilizing only 2% of the training data. Furthermore, it offers the distinct advantages of providing intermediate reasoning steps and coordinating specialist EEG tools in its operation; this transparent, step-by-step verification promotes interpretability and trustworthiness in clinical contexts.

Funding: None