Authors:
Sweta Bhagavatula, BS – University of Southern California; Ryan Cabeen, PhD – University of Southern California; Rachael Garner, BA – University of Southern California; Pedro Andrade, MSc – University of Eastern Finland; Mette Heiskanen, MSc – University of Eastern Finland; Noora Puhakka, PhD – University of Eastern Finland; Eppu Manninen, MSc – University of Eastern Finland; Tomi Paananen, PhD – University of Eastern Finland; Olli Gröhn, PhD – University of Eastern Finland; Xavier Ekolle Ndode-Ekane, PhD – University of Eastern Finland; Riikka Immonen, PhD – University of Eastern Finland; Robert Ciszek, MSc – University of Eastern Finland; David Wright, A/Prof – Department of Neuroscience, Monash University; Pablo Casillas-Espinosa, Dr. – Department of Neuroscience, Monash University; Matthew Hudson, Dr. – Department of Neuroscience, Monash University; Nigel Jones, A/Prof – Department of Neuroscience, Monash University; Emma Braine, Ms. – Department of Neuroscience, Monash University; Glenn Yamakawa, Dr. – Department of Neuroscience, Monash University; Sandy Shultz, Dr. – Department of Neuroscience, Monash University; Idrish Ali, Dr. – Department of Neuroscience, Monash University; Juliana Silva, Dr. – Department of Neuroscience, Monash University; Rhys Brady, Dr. – Department of Neuroscience, Monash University; Neil Harris, PhD – David Geffen School of Medicine at UCLA; Gregory Smith, PhD – David Geffen School of Medicine at UCLA; Cesar Santana-Gomez, PhD – David Geffen School of Medicine at UCLA; Richard Staba, PhD – David Geffen School of Medicine at UCLA; Terence O'Brien, Prof – Department of Neuroscience, Monash University; Asla Pitkänen, MD, PhD, DSc – University of Eastern Finland; Dominique Duncan, PhD – University of Southern California
Rationale: Traumatic brain injury (TBI) typically results from an external force to the brain and can lead to several pathologies, including posttraumatic epilepsy (PTE). To better understand PTE, the Epilepsy Bioinformatics Study for Antiepileptogenic Therapy (EpiBioS4Rx) aims to identify epileptogenic biomarkers in a clinical cohort and a preclinical model. Because preclinical data are acquired at centers in the United States, Europe, and Australia, EpiBioS4Rx requires robust normalization and harmonization processes to obtain statistically significant and generalizable results. This work describes the tools and procedures used to harmonize multi-site, multimodal imaging data acquired by EpiBioS4Rx, with the hope that dissemination and public sharing can expedite other multi-site projects.
Methods: Three main harmonization processes were applied to the MRI scans across the sites: file format harmonization, naming convention harmonization, and diffusion tensor imaging (DTI) metrics harmonization. As the scans are uploaded, the format in which the data are stored varies because of heterogeneity in scanner type and output format. The Python tool ParseBruker was created to interpret the metadata file associated with Bruker images so that imaging data could be sorted according to variations in sequences (https://github.com/cabeen/parsebruker). In addition to file formats, naming conventions also differ across sites. Currently, there is no standardized naming schema for rodent data, and the community lacks publicly available converters to maintain consistent conventions. Therefore, a bash script was created to harmonize the protocol names for all images in the study: dwi, mge, mtlow, mthigh, and rare. DTI metrics, such as fractional anisotropy (FA), are also harmonized. Every diffusion-weighted image in the dataset has an FA value estimated for each voxel. To harmonize the data, the QIT VolumeHarmonize module scales the mode of the values to one and generates a histogram that represents the whole image. This process is run at the level of individual animals and across sites to ensure that local and global effects are analyzed.
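The mode-scaling step can be illustrated with a minimal sketch. This is not the QIT VolumeHarmonize implementation: the function names (`estimate_mode`, `scale_mode_to_one`), the histogram-peak estimate of the mode, the bin count, and the synthetic FA values are all assumptions made for illustration only.

```python
import random


def estimate_mode(values, bins=100, lo=0.0, hi=1.0):
    """Estimate the mode as the center of the fullest histogram bin.

    Hypothetical helper: the bin count and value range are assumptions.
    """
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        if lo <= v <= hi:
            # Clamp so that v == hi falls into the last bin.
            idx = min(int((v - lo) / width), bins - 1)
            counts[idx] += 1
    if not any(counts):
        raise ValueError("no values fall inside the histogram range")
    peak = max(range(bins), key=counts.__getitem__)
    return lo + (peak + 0.5) * width


def scale_mode_to_one(values, bins=100, lo=0.0, hi=1.0):
    """Divide every value by the estimated mode so the mode maps to about one.

    Returns the scaled values and the mode that was used as the divisor.
    """
    mode = estimate_mode(values, bins, lo, hi)
    return [v / mode for v in values], mode


if __name__ == "__main__":
    # Synthetic per-voxel FA values for two hypothetical animals whose
    # distributions peak at different values (illustrative data only).
    rng = random.Random(0)
    animals = {
        "animal_a": [min(max(rng.gauss(0.25, 0.05), 0.0), 1.0) for _ in range(5000)],
        "animal_b": [min(max(rng.gauss(0.35, 0.05), 0.0), 1.0) for _ in range(5000)],
    }
    for name, fa in animals.items():
        scaled, mode = scale_mode_to_one(fa)
        # After scaling, the histogram peak should sit near one for both animals.
        new_mode = estimate_mode(scaled, bins=100, lo=0.0, hi=4.0)
        print(name, "mode before:", round(mode, 3), "mode after:", round(new_mode, 3))
```

Applying the same scaling to each animal, or to pooled values from a site, puts the histogram peaks on a common scale, which is the sense in which local and global effects can be compared.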
Results: The standardization of file formats, naming conventions, and DTI metrics was qualitatively assessed. Histograms were generated for each individual rodent per site; the Monash University example is shown in Figure 1, where there is a clear visual decrease in variability after harmonization. For inter-site analysis, the individual scans were averaged to produce one histogram per site. In Figure 2, the sham and TBI cohorts were analyzed separately and show the same harmonization factor. This indicates that the analysis can be run at the level of individual animals.
Conclusions: Our results indicate that the harmonization processes were successful in normalizing aspects of the data post-acquisition. This allows the data to be shared and analyzed more easily by all researchers in the study. Building these data tools is important because it increases the statistical power of the study and leads to more cohesive and robust analysis of the data.
Funding: R01NS111744, U54NS100064