Abstracts

Validation of the International League Against Epilepsy (ILAE) Risk of Bias Tool Against the Newcastle-Ottawa Scale in Epilepsy Research

Abstract number : 2.125
Submission category : 16. Epidemiology
Year : 2025
Submission ID : 1001
Source : www.aesnet.org
Presentation date : 12/7/2025 12:00:00 AM
Published date :

Authors :
Presenting Author: Churl-Su Kwon, M.D., M.P.H., FRSPH – Columbia University Irving Medical Center

Ali Rafati, MD, MPH – Johns Hopkins University
Nathalie Jette, MSc, MD, FRCPC – Alberta Health Services, Cumming School of Medicine, University of Calgary
Charles Newton, MD, FRCPCH – University of Oxford

Rationale:

Assessing study quality and risk of bias in observational epilepsy research is critical for the validity of systematic reviews and meta-analyses (SRMAs). While the Newcastle-Ottawa Scale (NOS) is widely used, it lacks epilepsy-specific criteria. The International League Against Epilepsy (ILAE) Commission on Epidemiology recently developed a novel tool tailored to epilepsy research. This study aims to evaluate the reliability and validity of the ILAE tool compared to NOS.



Methods:

We evaluated 43 observational studies drawn from three published SRMAs on psychiatric comorbidities in people with epilepsy (PwE). Two raters independently assessed each study using both the NOS (scored 0–9) and the ILAE tool (ordinal rating: High, Good, Fair, or Poor). The NOS, which served as the benchmark, evaluates three domains: selection, comparability, and outcome/exposure assessment. We categorized NOS scores as High (7–9), Moderate (4–6), or Low (0–3), consistent with previously established thresholds. The ILAE tool comprises six domains specifically tailored to epilepsy epidemiology: (1) Source of Study Population; (2) Completeness (Sensitivity) of Epilepsy Case-Finding; (3) Sensitivity of Comorbidity Determination; (4) Accuracy of Epilepsy Diagnosis; (5) Accuracy of Comorbidity Diagnosis; and (6) Representativeness of Study Sample. Final ILAE judgments were assigned to one of four levels (High, Good, Fair, or Poor), with the lowest-rated domain determining the overall rating. Interrater reliability was calculated using intraclass correlation coefficients (ICC). Concordance between NOS and ILAE ratings was assessed with Spearman’s rank correlation and Cohen’s weighted kappa. Bland-Altman analysis was used to evaluate systematic scoring differences between the two tools.
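The two scoring rules described above (the NOS score bands and the ILAE "lowest domain decides" rule) can be illustrated with a minimal Python sketch. The names `Rating`, `ILAE_DOMAINS`, `categorize_nos`, and `overall_ilae` are hypothetical and introduced only for illustration, as is the numeric ordering of the ILAE levels (Poor < Fair < Good < High); this is not the authors' implementation.

```python
from enum import IntEnum


class Rating(IntEnum):
    """ILAE ordinal levels; the numeric ordering is an assumption."""
    POOR = 0
    FAIR = 1
    GOOD = 2
    HIGH = 3


# Hypothetical keys for the six ILAE domains listed in the Methods.
ILAE_DOMAINS = (
    "source_of_study_population",
    "completeness_of_epilepsy_case_finding",
    "sensitivity_of_comorbidity_determination",
    "accuracy_of_epilepsy_diagnosis",
    "accuracy_of_comorbidity_diagnosis",
    "representativeness_of_study_sample",
)


def categorize_nos(score: int) -> str:
    """Map a 0-9 NOS score to High (7-9), Moderate (4-6), or Low (0-3)."""
    if not 0 <= score <= 9:
        raise ValueError(f"NOS score must be between 0 and 9, got {score}")
    if score >= 7:
        return "High"
    if score >= 4:
        return "Moderate"
    return "Low"


def overall_ilae(domain_ratings: dict) -> Rating:
    """Return the overall ILAE rating: the lowest rating across all six domains."""
    missing = [d for d in ILAE_DOMAINS if d not in domain_ratings]
    if missing:
        raise ValueError(f"Missing ILAE domain ratings: {missing}")
    return min(domain_ratings[d] for d in ILAE_DOMAINS)
```

For example, a study rated High on five domains and Fair on accuracy of epilepsy diagnosis would receive an overall rating of Fair under this rule.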



Results:

The NOS and ILAE tools demonstrated strong interrater reliability (ICC = 0.72, 95% CI: 0.56–0.82). The two tools were very strongly correlated (Spearman’s rho = 0.81, p < 0.001; Figure 1) and showed substantial agreement on binary quality ratings (Cohen’s weighted kappa = 0.71, p < 0.001). Bland-Altman analysis showed minimal systematic difference in scoring (mean difference = 0.08; Figure 2), although the NOS tended to slightly overestimate study quality relative to the ILAE tool.
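For readers unfamiliar with the Bland-Altman mean difference reported above, a minimal Python sketch follows. It assumes both tools' ratings have already been mapped to a common numeric scale, a step the abstract does not describe. The function name `bland_altman` and the example values are hypothetical and are not study data.

```python
from statistics import mean, stdev


def bland_altman(scores_a, scores_b):
    """Return (mean difference, lower limit, upper limit) for paired scores.

    The mean difference (bias) is the average of a - b. The 95% limits of
    agreement are bias +/- 1.96 * SD of the paired differences.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("Both score lists must have the same length")
    if len(scores_a) < 2:
        raise ValueError("At least two paired scores are required")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Made-up illustrative ratings on an assumed common 1-3 scale (not study data).
nos_levels = [3, 2, 1, 3, 2]
ilae_levels = [3, 2, 1, 2, 2]
bias, lower, upper = bland_altman(nos_levels, ilae_levels)
print(f"mean difference = {bias:.2f}, limits of agreement = ({lower:.2f}, {upper:.2f})")
```

With the first argument taken as the NOS, a positive mean difference indicates that the NOS rates studies higher on average than the ILAE tool, which is the direction described in the Results.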
