A Cloud-based Platform for Collaborative Research with Massive Neurophysiology Data: The Brain Data Science Platform (BDSP)
Abstract number :
2.078
Submission category :
3. Neurophysiology / 3G. Computational Analysis & Modeling of EEG
Year :
2022
Submission ID :
2205110
Source :
www.aesnet.org
Presentation date :
12/4/2022 12:00:00 PM
Published date :
Nov 22, 2022, 05:28 AM
Authors :
Valdery Moura Junior, MS, MBA – Massachusetts General Hospital, Harvard Medical School; Feng Li, PhD – Management – Bayes Business School, City, University of London; Aaron Struck, MD – Neurology – University of Wisconsin Madison; Edilberto Amorim, MD – University of California, San Francisco; Umakanth Katwa, MD – Sleep Medicine – Boston Children’s Hospital; Gari Clifford, DPhil – Emory University School of Medicine; Emmanuel Mignot, MD, PhD – Stanford University; Robert Thomas, MD – Beth Israel Deaconess Medical Center; Brandon Westover, MD, PhD – Neurology – Massachusetts General Hospital, Harvard Medical School
This abstract has been invited for presentation during the Neurophysiology platform session.
Rationale: Data-driven discovery and artificial intelligence (AI) modeling are increasingly recognized as promising paths to advancing medicine, particularly in subfields of neurology that make use of neurophysiological data. However, AI approaches have so far had limited impact on the fundamental science and real-world practice of clinical neurophysiological subspecialties, including sleep medicine, epilepsy, and critical care electroencephalography (EEG). Previous studies have (1) utilized small datasets, limiting generalizability; (2) focused on single AI tasks, e.g., seizure detection or sleep staging; and (3) merely attempted to replicate human pattern recognition, with only indirect correlations with key patient-centric outcomes. We aim to develop a scalable, centralized, collaborative, secure cloud platform for researchers to access a wealth of neurophysiological data, perform data analysis, run analytic tools, develop new algorithms, share scientific research results, and collaborate.
Methods: Between January 2021 and June 2022, Massachusetts General Hospital, in collaboration with multiple institutions (Boston Children’s Hospital, Stanford, Beth Israel Deaconess Medical Center, University of California San Francisco, University of Wisconsin), assembled a team of software and data engineers led by a data technology scientist and a physician-scientist to develop the McCance Brain Data Science Platform (BDSP). The team conceptualized the BDSP in 3 layers: (1) Data Ingestion: pipelines to acquire raw data from multiple internal and external sources, including clinical data (demographics, diagnosis, notes, medications), neurophysiological recordings (e.g., polysomnography [PSG], electroencephalograms [EEG]), electrocardiograms (ECG), telemetry (e.g., vital signs), brain imaging (MRI), etc. (Figure 1). (2) Data Transformation: an ecosystem and infrastructure that converts, normalizes, matches, and de-identifies data. (3) Data Analytics and Visualization: a new data sharing paradigm that enhances collaboration by external contributors (Figure 2).
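The abstract does not describe how the Data Transformation layer is implemented. Purely as an illustration of what "normalizes, matches, and de-identifies" could involve, the following minimal Python sketch replaces a medical record number with a keyed pseudonym (so recordings from the same patient can still be matched) and shifts dates by a per-patient offset (so intervals between recordings are preserved). All names here (`RawRecording`, `DeidentifiedRecording`, `SITE_KEY`, `pseudonymize`, `date_shift_days`, `deidentify`) are hypothetical and are not part of BDSP.

```python
import hashlib
import hmac
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical secret for illustration; a real site would load it from a key store.
SITE_KEY = b"example-site-key"


@dataclass(frozen=True)
class RawRecording:
    mrn: str                 # medical record number (identifying)
    recorded_on: date        # true acquisition date (identifying)
    modality: str            # e.g. "EEG", "PSG", possibly with inconsistent casing
    sampling_rate_hz: float


@dataclass(frozen=True)
class DeidentifiedRecording:
    subject_id: str          # keyed pseudonym, stable per patient
    shifted_date: date       # date moved back by a per-patient offset
    modality: str            # normalized to upper case
    sampling_rate_hz: float


def pseudonymize(mrn: str, key: bytes = SITE_KEY) -> str:
    """Map an MRN to a stable pseudonym using a keyed hash (HMAC-SHA256)."""
    digest = hmac.new(key, mrn.encode("utf-8"), hashlib.sha256).hexdigest()
    return "sub-" + digest[:12]


def date_shift_days(mrn: str, key: bytes = SITE_KEY, max_days: int = 365) -> int:
    """Derive a per-patient offset in [1, max_days] so intervals are preserved."""
    digest = hmac.new(key, b"shift:" + mrn.encode("utf-8"), hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big") % max_days + 1


def deidentify(rec: RawRecording) -> DeidentifiedRecording:
    """Normalize the modality label and strip direct identifiers."""
    return DeidentifiedRecording(
        subject_id=pseudonymize(rec.mrn),
        shifted_date=rec.recorded_on - timedelta(days=date_shift_days(rec.mrn)),
        modality=rec.modality.strip().upper(),
        sampling_rate_hz=rec.sampling_rate_hz,
    )
```

Under these assumptions, two recordings from the same patient taken a week apart receive the same `subject_id` and remain seven days apart after shifting, while the original MRN and true dates are not carried into the output record. De-identifying free-text notes or signal file headers would need additional steps not shown here.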
Results: The BDSP team is creating a cloud-based platform of unprecedented size. It currently includes 26,957 PSGs representing 15 years of data acquisition and 18,690 unique patients in a vendor-agnostic format (.hdf5). By the end of summer 2022, BDSP will include >180,000 EEG recordings, representing 20 years of EEGs from 4 major academic institutions, and publicly accessible workspaces containing code from >10 publications to allow users to replicate research results. As next steps, the BDSP team is working to adapt the de-identification pipeline for general use at external sites, launch a contest for ICU seizure detection, create user-friendly methods for external sites to contribute data, and open the platform to the public.
Conclusions: The BDSP is a cloud-based platform for collaborative research using large-scale neurophysiological data. BDSP will enable research teams to apply cutting-edge AI and data-driven discovery methods to answer previously unaddressed questions of major public health impact.
Funding: NIH (R01NS102190, R01NS102574, R01NS107291, RF1AG064312, R01AG062989, R01AG073410)