Interpretation of Functional Consequences of Genetic Variants in Epilepsies Using Protein Structures
Abstract number :
2.377
Submission category :
12. Genetics / 12A. Human Studies
Year :
2018
Submission ID :
501300
Source :
www.aesnet.org
Presentation date :
12/2/2018 4:04:48 PM
Published date :
Nov 5, 2018, 18:00 PM
Authors :
Sumaiya Iqbal, Broad Institute of MIT and Harvard; Eduardo Perez-Palma, Cologne Center for Genomics, University of Cologne; Jakob Jespersen, Massachusetts General Hospital; Patrick May, University of Luxembourg; Henrike Heyne, Massachusetts General Hospit
Rationale: Genetic factors are associated with common and rare epilepsies. For monogenic forms of epilepsy, a single genetic variant can explain the phenotype. Genetic testing is now routinely applied in epilepsy patients; however, the interpretation of test results remains challenging, and the majority of clinically identified variants are classified as ‘variants of uncertain significance’ (VUS). Computational approaches designed to distinguish benign from pathogenic variants using large-scale genetic data from reference populations are gradually becoming successful. However, current methods do not leverage the three-dimensional (3D) protein structure context in their predictions and cannot predict the functional consequences of these variants. Incorporating the 3D protein structure into a variant prediction framework may elucidate the molecular cause of a disease, triggered by the alteration of the respective amino acid due to a genetic variation, and in some cases could open a path for targeted drug development.
Methods: We developed a framework for in silico mapping of genetic variants onto experimentally solved protein structures. We then annotated the variant positions with a comprehensive set (N > 30) of protein structural features and computed 3D spatial-level enrichment statistics. In total, we characterized 492,755 and 32,924 single amino acid substitutions from the general population (gnomAD database) and from patients (ClinVar and HGMD databases), respectively, and mapped these onto >30k solved protein structures corresponding to 4,897 human genes.
Results: We identified 12 known epilepsy genes for which we could map >90% of the population and patient variants onto their 3D protein structures. These genes include 2 kinases (CDKL5, CDK13), 5 transporters (SLC2A1, CHRNB2, GABRB3, CHRNA4, HCN1), 2 oxidoreductases (ALDH7A1, PNPO), 1 enzyme modulator (CSTB), 1 phosphatase (EPM2A), and 1 membrane traffic protein (SNAP25).
Across all protein classes, the amino acid positions affected by epilepsy patient variants were less exposed to solvent than those affected by population variants (Mann-Whitney U test, p < 1.9E-15). Moreover, patient variants were enriched at ligand-binding sites (Fisher exact test, p < 1.03E-6). In contrast, some features of patient variants were specific to individual protein classes. For example, the amino acids with pathogenic variants in the ‘kinase’ and ‘transporter’ protein classes tend to form different local 3D structures. Applying spatial-level enrichment analyses of patient versus population variants along the 3D protein structure, we identified clusters of patient variants in regions that were depleted of benign population variants for the majority of the genes tested.
Conclusions: We developed a framework to characterize and identify protein sites enriched for patient variants and illustrated its utility in 12 epilepsy genes at the time of abstract submission. The framework will help in categorizing variants of uncertain significance, and will potentially guide biologists and chemists in variant selection for molecular research and clinicians in drug selection.
Funding: Not applicable
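Illustrative note: the ligand-binding-site enrichment reported in the Results is based on a Fisher exact test. The following is a minimal sketch of how such a one-sided enrichment test could be computed for a 2x2 table of patient versus population variants on and off binding sites. The function name `fisher_exact_enrichment` and all counts are hypothetical and chosen for illustration only; they are not the abstract's data, and this is not presented as the authors' implementation.

```python
from math import comb


def fisher_exact_enrichment(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher exact test p-value for the 2x2 table

                       on binding site   off binding site
        patient              a                 b
        population           c                 d

    Returns P(X >= a) under the hypergeometric null with fixed margins,
    i.e. the probability of seeing at least `a` patient variants on
    binding sites if variant origin and site type were independent.
    """
    patient_total = a + b      # row margin: all patient variants
    on_site_total = a + c      # column margin: all variants on binding sites
    n = a + b + c + d          # grand total
    denom = comb(n, patient_total)
    upper = min(patient_total, on_site_total)
    # Sum hypergeometric probabilities for tables at least as extreme as observed.
    numer = sum(
        comb(on_site_total, k) * comb(n - on_site_total, patient_total - k)
        for k in range(a, upper + 1)
    )
    return numer / denom


# Hypothetical counts (assumption, not taken from the abstract):
# 30 of 100 patient variants and 50 of 500 population variants on binding sites.
p_value = fisher_exact_enrichment(30, 70, 50, 450)
print(f"one-sided Fisher exact p = {p_value:.3g}")
```

A small p-value here would indicate that patient variants fall on ligand-binding sites more often than expected from the population background; a two-sided test or an odds-ratio estimate could be added in the same way.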