Episemogpt: A Fine-tuned Large Language Model for Epileptogenic Zone Localization Based on Seizure Semiology with a Performance Comparable to Epileptologists

Abstract number : 3.02
Submission category : 1. Basic Mechanisms / 1B. Epileptogenesis of genetic epilepsies
Year : 2024
Submission ID : 200
Source : www.aesnet.org
Presentation date : 12/9/2024 12:00:00 AM
Published date :

Authors :
Presenting Author: Feng Liu, PhD – Stevens Institute of Technology

Shihao Yang, MS – Stevens Institute of Technology
Meng Jiao, MS – Stevens Institute of Technology
Yaxi Luo, BS – Stevens Institute of Technology
Neel Fotedar, MD – Case Western Reserve University
Felix Rosenow, MD – Goethe-University Frankfurt
Hai Sun, MD, PhD – Rutgers University

Rationale: Seizure semiology, the study of the clinical manifestations during a seizure episode, contains valuable information for inferring the epileptogenic zones (EZs). Given its descriptive nature and recent advances in large language models (LLMs), it is important to design a fine-tuned LLM specifically for predicting EZs by interpreting seizure semiology, and to compare its performance against popular LLMs, such as different versions of ChatGPT, and against epileptologists.


Methods: In this study, the first fine-tuned LLM for this task, termed EpiSemoGPT, is introduced, built on Mistral-7b-instruct as the foundation model. A total of 865 cases with descriptions of seizure semiology paired with validated EZs were derived from 189 publications. The training split of these semiology records and their corresponding EZs was used to fine-tune the foundation model to improve prediction of the most likely EZs. To test the fine-tuned EpiSemoGPT, 100 well-defined cases were evaluated by analyzing the responses from EpiSemoGPT and a panel of five epileptologists. The responses from EpiSemoGPT and the epileptologists were graded using the rectified reliability score (rRS) and the regional accuracy rate (RAR). In addition, the performance of EpiSemoGPT was compared with its backbone model, Mistral-7b-instruct, as well as with different versions of ChatGPT as representative LLMs.
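The abstract does not spell out how RAR is computed; a minimal sketch, assuming RAR is the per-region fraction of test cases whose predicted general region matches the validated EZ (the function name and example data below are hypothetical illustrations, not the authors' code):

```python
from collections import defaultdict

def regional_accuracy_rate(true_regions, predicted_regions):
    """For each general region, the fraction of cases with a validated EZ
    in that region for which the predicted region matches."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for truth, pred in zip(true_regions, predicted_regions):
        totals[truth] += 1
        if pred == truth:
            correct[truth] += 1
    return {region: correct[region] / totals[region] for region in totals}

# Hypothetical example over three general regions
truth = ["temporal", "temporal", "frontal", "insular", "temporal"]
pred  = ["temporal", "frontal",  "frontal", "insular", "temporal"]
rar = regional_accuracy_rate(truth, pred)
# e.g. rar["temporal"] is 2/3 here: two of three temporal cases were correct
```

Under this assumption, each region's score is independent of how many cases the other regions contain, which matches the per-region percentages reported in the Results.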

Results: EpiSemoGPT can provide valuable presurgical evaluations by identifying the most likely EZs given a description of seizure semiology. In the comparison between EpiSemoGPT and the panel of epileptologists, the RAR score achieved by EpiSemoGPT in each general region with a zero-shot prompt is 53.57% for the frontal lobe, 75.00% for the temporal lobe, 57.89% for the occipital lobe, 62.50% for the parietal lobe, 55.56% for the insular cortex, and 0.00% for the cingulate cortex.

Comparatively, the RAR score achieved by the epileptologists is 64.83% for the frontal lobe, 52.22% for the temporal lobe, 60.00% for the occipital lobe, 42.50% for the parietal lobe, 46.00% for the insular cortex, and 8.57% for the cingulate cortex. The fine-tuned EpiSemoGPT outperformed its foundation model Mistral-7b-instruct and ChatGPT, especially for EZs in the insular cortex.

Conclusions: EpiSemoGPT demonstrates performance comparable to epileptologists in EZ inference, showing its value in presurgical assessment given a patient's seizure semiology. EpiSemoGPT outperformed the epileptologists in interpreting seizure semiology with EZs originating from the temporal and parietal lobes as well as the insular cortex, whereas the epileptologists outperformed EpiSemoGPT for the frontal and occipital lobes as well as the cingulate cortex. Its improvement over the foundation model shows the effectiveness of fine-tuning LLMs on high-quality, highly domain-specific samples.

Funding: N/A
