Machine Learning Algorithms Applied to Phenotypic Data in the International SCN8A Registry Reveals Distinct Subgroups of Patients with Pathogenic Gain of Function Variants
Abstract number :
1.505
Submission category :
4. Clinical Epilepsy / 4A. Classification and Syndromes
Year :
2023
Submission ID :
1307
Source :
www.aesnet.org
Presentation date :
12/2/2023
Published date :
Authors :
Presenting Author: Joshua Hack, MS – University of Arizona
Joshua Hack, MS – Data Scientist, BIO5 Institute, University of Arizona; John Schreiber, MD – Director, Associate Professor, Neurology, Epilepsy Program, Children's National; Joseph Watkins, PhD – Professor, Mathematics, University of Arizona
Rationale:
Distinguishing phenotypic subgroups among individuals with pathogenic variants underlying monogenic spectrum disorders is essential for developing precision therapies. Due to the wide variety of clinical presentations, individuals with gain-of-function (GOF) variants in the SCN8A gene have historically been viewed as falling along an ordered spectrum ranging from mild to severe. The large variability among individuals in seizure profile, developmental delay, variant location and type, and comorbidities presents enormous challenges for clinicians seeking to identify disease endotypes within this wide disease spectrum.
Methods:
Towards the goals of improving prognosis and clinical management, we applied two machine learning algorithms to analyze genotypic and phenotypic data collected by the International SCN8A Patient Registry. Two approaches, supervised and unsupervised, were applied to determine whether there was statistical support for subgroups with relatively homogeneous clinical features. The supervised approach categorized individuals utilizing features with severity cutoffs that were determined by clinical conventions. The unsupervised approach used a data-driven strategy to independently identify subtypes without the bias of prior clinical interpretation.
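To make the contrast between the two approaches concrete, here is a minimal Python sketch, not the authors' implementation. The abstract does not name the algorithms, features scales, or cutoffs used, so everything below is an assumption for illustration: the cutoff values, the placeholder group labels, the function names, and the choice of k-means as the data-driven method. The first function shows only the cutoff-based labeling step that a supervised model could be trained on; the second groups individuals from the data alone.

```python
# Hypothetical cutoffs for illustration only; the abstract does not state them.
ONSET_CUTOFF_MONTHS = 6.0
DQ_CUTOFF = 50.0


def label_by_cutoffs(onset_months, dq):
    """Assign a group from predefined cutoffs (group names are placeholders)."""
    early_onset = onset_months < ONSET_CUTOFF_MONTHS
    low_dq = dq < DQ_CUTOFF
    if early_onset and low_dq:
        return "severe"
    if early_onset or low_dq:
        return "intermediate"
    return "mild"


def _sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=100):
    """Plain k-means with deterministic farthest-point initialization.

    Returns (centroids, assignments). Chosen here as a stand-in for
    'an unsupervised method'; the abstract does not specify one.
    """
    # Start from the first point, then repeatedly add the point that is
    # farthest from all centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(_sq_dist(p, c) for c in centroids))
        )
    assignments = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid for every point.
        assignments = [
            min(range(k), key=lambda j: _sq_dist(p, centroids[j])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return centroids, assignments


# Synthetic, made-up feature pairs (not registry data).
synthetic = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (20, 0), (20, 1), (21, 0)]
centroids, groups = kmeans(synthetic, k=3)
```

In practice, features measured on different scales (for example, age in months and a developmental quotient) would typically be standardized before clustering, and the number of groups would be chosen by a model-selection criterion rather than fixed in advance.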
Results:
Both approaches found statistical support for three distinct subgroups; however, the distinguishing features were not concordant between approaches. For example, the three subgroups were particularly discordant in the predictive value of the following features: age at onset, seizure freedom, and developmental quotient. The primary features considered in the supervised approach were age at seizure onset and developmental quotient, while the unsupervised approach considered developmental quotient, age at seizure onset, motor seizures, and seizure freedom. Both models showed significant correlation between severity likelihood and electrophysiological score. We externally validated the highest performing model in each approach using six highly recurrent variants. We tested for correlations between the electrophysiological score and the group to which an individual was predicted to belong.
Conclusions:
The discordance in the unsupervised approach prompted the hypothesis that the gain-of-function population is best characterized by three non-linear endotypes rather than a linear spectrum. These endotypes were characterized as developmental encephalopathy (DE), epileptic encephalopathy (EE), and developmental and epileptic encephalopathy (DEE) (Figure 1). This hypothesis was tested by comparing the proportion of each group reporting seizure onset prior to developmental delay and the average gap between onset of seizures and developmental delay. The results support the recharacterization of SCN8A disease as three endotypes rather than a severity spectrum, which should provide more precise diagnosis, prognosis, and clinical treatment. Finally, confirmation bias that might lead clinicians to classify patients as "developmental and epileptic encephalopathy" neglects the DE and EE phenotypes, which have distinct clinical features, prognoses, and responses to treatment.
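One way to compare the proportion of each group reporting seizure onset prior to developmental delay is a chi-square test of homogeneity on a groups-by-outcome table. The abstract does not say which test was used, so the sketch below is an assumption, and the counts in it are invented solely to show the arithmetic. The function names are hypothetical. For three groups and two outcomes the test has 2 degrees of freedom, in which case the p-value has the closed form exp(-x/2).

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a count table.

    `table` is a list of rows, one per group, each holding counts per outcome.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if all groups shared the same outcome proportions.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


def p_value_two_dof(stat):
    """Upper-tail p-value of a chi-square statistic with exactly 2 dof."""
    return math.exp(-stat / 2.0)


# Made-up counts (NOT results from the abstract): rows are three groups,
# columns are [seizures first, developmental delay first].
illustrative_counts = [[5, 25], [27, 3], [15, 15]]
stat, dof = chi_square_statistic(illustrative_counts)
p_value = p_value_two_dof(stat)
```

The second comparison described above, the average gap between onset of seizures and developmental delay, would call for a different test on continuous data and is not covered by this sketch.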
Funding: Neurocrine Biosciences, Shay Emma Hammer Research Foundation
Clinical Epilepsy