Abstracts

Prediction of Epilepsy Using Machine Learning on Integrated Polygenic Risk Scores and Perinatal Data: A Three-Generation Birth Cohort Study

Abstract number : 3.098
Submission category : 12. Genetics / 12A. Human Studies
Year : 2025
Submission ID : 816
Source : www.aesnet.org
Presentation date : 12/8/2025 12:00:00 AM
Published date :

Authors :
Presenting Author: Takafumi Kubota, MD – Tohoku University Graduate School of Medicine

Hisashi Ohseto, MD, PhD – Tohoku University Graduate School of Medicine
Yuki Kashiwada, MS – Tohoku University Graduate School of Medicine
Taku Obara, PhD – Tohoku University Graduate School of Medicine
Kazutoshi Konomatsu, MD, PhD – Tohoku University Graduate School of Medicine
Masashi Aoki, MD, PhD – Tohoku University Graduate School of Medicine
Kazutaka Jin, MD, PhD – Tohoku University Graduate School of Medicine

Rationale: Early identification of infants at high risk for epilepsy could help prevent exposure to additional acquired risk factors for epilepsy and enable earlier diagnosis, averting developmental deterioration. Although both genetic and perinatal factors contribute to epilepsy, their combined use for prediction remains limited. We therefore developed a birth-cohort-based machine learning (ML) model that integrates polygenic risk scores (PRS) with perinatal data to predict epilepsy from birth.

Methods: Neonates enrolled in the Tohoku Medical Megabank Three-Generation Cohort Study were included; those with congenital brain abnormalities or chromosomal disorders were excluded. A total of 143 explanatory variables were used, including perinatal clinical information (e.g., maternal lipid levels, birth weight, delivery complications) and PRS for epilepsy. The outcome was defined as epilepsy occurring between 6 months and 11 years of age. Predictive performance, including the area under the receiver operating characteristic curve (ROC AUC), was evaluated by five-fold cross-validation for LightGBM, Random Forest, XGBoost, and Logistic Regression models. The importance of each variable was assessed with the SHapley Additive exPlanations (SHAP) method. This study received approval from the Ethics Committee of the Tohoku University Graduate School of Medicine.
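The evaluation scheme described in the Methods (five-fold cross-validation scored by ROC AUC) can be illustrated with a minimal sketch. This is not the study's pipeline: the data are synthetic, the function names (`roc_auc`, `stratified_folds`, `centroid_scorer`, `cross_validated_auc`) are hypothetical, and a toy centroid-difference scorer stands in for LightGBM, Random Forest, XGBoost, and Logistic Regression so that the example needs only the Python standard library. Stratifying the folds by outcome is also an assumption; the abstract does not say how folds were formed.

```python
import random

def roc_auc(labels, scores):
    """ROC AUC as the fraction of (case, non-case) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def stratified_folds(labels, k=5, seed=0):
    """Assign every index to one of k folds, spreading each class evenly across folds."""
    rng = random.Random(seed)
    fold_of = [0] * len(labels)
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for position, i in enumerate(idx):
            fold_of[i] = position % k
    return fold_of

def centroid_scorer(X, y):
    """Toy stand-in for a classifier: project rows onto (case mean - non-case mean)."""
    d = len(X[0])
    sums = {0: [0.0] * d, 1: [0.0] * d}
    counts = {0: 0, 1: 0}
    for row, label in zip(X, y):
        counts[label] += 1
        for j, value in enumerate(row):
            sums[label][j] += value
    w = [sums[1][j] / counts[1] - sums[0][j] / counts[0] for j in range(d)]
    return lambda row: sum(wj * v for wj, v in zip(w, row))

def cross_validated_auc(X, y, k=5, seed=0):
    """Fit on k-1 folds, score the held-out fold, and return one ROC AUC per fold."""
    fold_of = stratified_folds(y, k, seed)
    aucs = []
    for fold in range(k):
        train = [i for i in range(len(y)) if fold_of[i] != fold]
        test = [i for i in range(len(y)) if fold_of[i] == fold]
        score = centroid_scorer([X[i] for i in train], [y[i] for i in train])
        aucs.append(roc_auc([y[i] for i in test], [score(X[i]) for i in test]))
    return aucs

# Synthetic cohort (assumption): rare outcome, one weakly informative feature, one noise feature.
rng = random.Random(42)
n, n_cases = 23020, 69
y = [1] * n_cases + [0] * (n - n_cases)
X = [[rng.gauss(0.5 * label, 1.0), rng.gauss(0.0, 1.0)] for label in y]

fold_aucs = cross_validated_auc(X, y)
print("per-fold ROC AUC:", [round(a, 3) for a in fold_aucs])
print("mean ROC AUC:", round(sum(fold_aucs) / len(fold_aucs), 3))
```

With an outcome this rare, each held-out fold contains only a handful of cases, so per-fold ROC AUC values can vary considerably; reporting the spread across folds alongside the mean is one way to convey that uncertainty.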

Results: Among 23,020 participants, 69 (0.3%) developed epilepsy. The ROC AUC for predicting epilepsy was 0.64 for XGBoost, 0.57 for Random Forest, 0.56 for LightGBM, and 0.53 for Logistic Regression. In the XGBoost model, the five most important predictors (mean absolute SHAP values) were maternal total cholesterol (1.26), maternal triglyceride level (0.77), maternal history of epilepsy (0.53), birth weight (0.32), and maternal genistein level (0.32).
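The ranking above aggregates per-participant SHAP values into one mean absolute value per predictor. The sketch below illustrates only that aggregation step, under stated assumptions: it uses a hypothetical linear model with independent features, for which the SHAP value of feature j for one row has the closed form w_j * (x_j - mean(x_j)), rather than the tree-based XGBoost model used in the study. The weights, data, and function names are invented for illustration.

```python
def linear_shap(weights, X):
    """SHAP values for a linear model with independent features: w_j * (x_ij - mean_j)."""
    d = len(weights)
    means = [sum(row[j] for row in X) / len(X) for j in range(d)]
    return [[weights[j] * (row[j] - means[j]) for j in range(d)] for row in X]

def mean_abs_shap(shap_rows):
    """Average the absolute SHAP value of each feature over all rows."""
    n = len(shap_rows)
    d = len(shap_rows[0])
    return [sum(abs(row[j]) for row in shap_rows) / n for j in range(d)]

# Hypothetical two-feature example (values are illustrative, not study data).
weights = [0.5, -1.0]
X = [[1.0, 10.0], [3.0, 10.0], [5.0, 16.0]]

shap_rows = linear_shap(weights, X)
importance = mean_abs_shap(shap_rows)

# Rank features from most to least important by mean absolute SHAP value.
ranking = sorted(range(len(importance)), key=lambda j: importance[j], reverse=True)
print("mean |SHAP| per feature:", importance)
print("feature ranking (indices):", ranking)
```

Because the absolute value is taken before averaging, a predictor's rank reflects the size of its contributions but not their direction; the sign of the effect has to be read from the per-participant SHAP values.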

Conclusions: Our findings suggest that an ML-based approach integrating PRS and clinical data can modestly predict epilepsy risk from the perinatal stage. Although predictive accuracy remains limited, this approach could still aid in the preliminary identification of high-risk infants.

Funding:

This research was supported by the 2024 Onuma Award, Japan.


