Automated Tool to Identify Rare Epilepsies in Electronic Medical Records
Abstract number :
2.304
Submission category :
11. Behavior/Neuropsychology/Language / 11B. Pediatrics
Year :
2021
Submission ID :
1826574
Source :
www.aesnet.org
Presentation date :
12/5/2021 12:00:00 PM
Published date :
Nov 22, 2021, 06:54 AM
Authors :
Kristen Barbour, MD - Weill Cornell Medicine; Elissa Yozawitz - Albert Einstein College of Medicine; Steven Wolf - Boston Children’s Health Physicians & New York Medical College; Patricia McGoldrick - Boston Children’s Health Physicians & New York Medical College; Tristan Sands - Columbia University; Aaron Nelson - NYU Langone Medical Center; Natasha Basma - Weill Cornell Medicine; Zachary Grinspan - Weill Cornell Medicine
Rationale: Rare epilepsies are a heterogeneous group of epilepsies often associated with refractory seizures and developmental disabilities. Their rarity makes it difficult to study them and to answer important questions about disease course, comorbidities, and disease-specific treatments. Automated methods are needed to assemble larger cohorts for research. The current study evaluates the performance of disease-related keywords for identifying rare epilepsies in the text of clinical notes.
Methods: Data included the text of clinical notes for patients with ICD-9 codes for seizures, epilepsy, or convulsions, from six hospitals in NYC. We used published disease-related keywords for 32 rare epilepsies (Grinspan et al., 2018; Epilepsia Open) and searched the text of clinical notes for these words. We narrowed the list of keywords by removing those with too few matches in clinical text (not useful) or too many matches (not specific enough). We tested performance by comparing keyword search against the gold standard of manual chart review. Two independent reviewers manually reviewed clinical notes to determine whether the patient had the diagnosis (yes, no, uncertain). A third reviewer resolved disagreements. For the keyword search, we defined patients as having a rare epilepsy if at least one disease-related keyword was present in the text of their clinical notes. We counted true positive patients (TP), false positive patients (FP), and patients with an uncertain diagnosis. For each rare epilepsy, we evaluated every possible combination of disease-related keywords to select the highest performing combination. For the highest performing combination, we measured TP, FP, uncertain patients, estimated false negative patients (eFN = TP for all keywords - TP for keyword combination), positive predictive value (PPV = TP / [TP + FP + Uncertain]), sensitivity (TP / [TP + eFN]), and F-score (2 × [PPV × sensitivity] / [PPV + sensitivity]).
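The keyword-combination evaluation described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the study's implementation: the function names (`score`, `best_combination`), the input layout (a dict from keyword to the set of matching patient IDs, and a dict from patient ID to chart-review label), and the choice of F-score as the criterion for "highest performing" are all assumptions, since the abstract does not specify them. The formulas for eFN, PPV, sensitivity, and F-score follow the definitions given in the Methods.

```python
from itertools import combinations


def score(matched, labels, tp_all):
    """Score one keyword combination.

    matched: set of patient IDs flagged by the combination.
    labels:  dict of patient ID -> "yes" / "no" / "uncertain" (chart review).
    tp_all:  true positives found when ALL keywords are used.
    """
    tp = sum(1 for p in matched if labels[p] == "yes")
    fp = sum(1 for p in matched if labels[p] == "no")
    uncertain = sum(1 for p in matched if labels[p] == "uncertain")
    efn = tp_all - tp  # estimated false negatives
    flagged = tp + fp + uncertain
    ppv = tp / flagged if flagged else 0.0
    sensitivity = tp / (tp + efn) if (tp + efn) else 0.0
    denom = ppv + sensitivity
    f_score = 2 * ppv * sensitivity / denom if denom else 0.0
    return {"tp": tp, "fp": fp, "uncertain": uncertain, "efn": efn,
            "ppv": ppv, "sensitivity": sensitivity, "f_score": f_score}


def best_combination(keyword_hits, labels):
    """Try every non-empty keyword subset; keep the one with the best F-score.

    keyword_hits: dict of keyword -> set of patient IDs whose notes match it.
    Selecting by F-score is an assumption made for this sketch.
    """
    all_matched = set().union(*keyword_hits.values())
    tp_all = sum(1 for p in all_matched if labels[p] == "yes")
    best = None
    for r in range(1, len(keyword_hits) + 1):
        for combo in combinations(sorted(keyword_hits), r):
            matched = set().union(*(keyword_hits[k] for k in combo))
            result = score(matched, labels, tp_all)
            if best is None or result["f_score"] > best[1]["f_score"]:
                best = (combo, result)
    return best
```

Because the search enumerates every subset, its cost grows as 2^n in the number of keywords per disease, which is practical only for short keyword lists such as those left after the narrowing step.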
Results: Data included clinical notes from 77,924 patients. Keyword search identified 5,842 patients potentially having a rare epilepsy. Manual chart review confirmed that 2,006 of these patients (34%) were true positive cases. When all keywords were used in the search, PPVs were low (median = 0.29, interquartile range [IQR] = 0.14 – 0.42). After selecting the optimal keyword combination, performance improved: PPVs were adequate (median = 0.64, IQR = 0.48 – 0.84), sensitivity was high (median = 0.96, IQR = 0.80 – 1.00), and F-scores were adequate (median = 0.71, IQR = 0.62 – 0.88).
Conclusions: Keyword search is a feasible method to identify patients with rare epilepsies through automated analysis of large sets of clinical notes. PPVs were low when all keywords were used, but performance improved when the data were analyzed to identify the optimal combination of keywords. We created regular expressions that others can use to implement keyword search for patients with rare epilepsies.
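As a rough illustration of how a regular-expression keyword search over clinical notes might look, the sketch below compiles one case-insensitive pattern per disease and flags patients whose notes contain any keyword. The keyword lists, the names `KEYWORDS`, `compile_patterns`, and `flag_patients`, and the `(patient_id, text)` input format are hypothetical; they are not the published keyword lists or the regular expressions created in this study.

```python
import re

# Hypothetical keyword lists, for illustration only (not the published lists).
KEYWORDS = {
    "Dravet syndrome": ["dravet", "SCN1A"],
    "Lennox-Gastaut syndrome": ["lennox-gastaut", "lennox gastaut"],
}


def compile_patterns(keywords):
    """Build one case-insensitive, whole-word pattern per disease."""
    return {
        disease: re.compile(
            r"\b(?:" + "|".join(re.escape(term) for term in terms) + r")\b",
            re.IGNORECASE,
        )
        for disease, terms in keywords.items()
    }


def flag_patients(notes, patterns):
    """Return disease -> set of patient IDs with at least one keyword match.

    notes: iterable of (patient_id, note_text) pairs.
    """
    flagged = {disease: set() for disease in patterns}
    for patient_id, text in notes:
        for disease, pattern in patterns.items():
            if pattern.search(text):
                flagged[disease].add(patient_id)
    return flagged
```

A plain keyword match of this kind also flags negated or incidental mentions (for example, a note reading "SCN1A negative"), which is one plausible way false positives arise and why flagged patients still need comparison against chart review.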
Funding: CDC.