Validity of TMS-induced Speech Error Classification: A Consortium Interrater Reliability Study
Abstract number :
2.162
Submission category :
3. Neurophysiology / 3E. Brain Stimulation
Year :
2024
Submission ID :
1100
Source :
www.aesnet.org
Presentation date :
12/8/2024 12:00:00 AM
Published date :
Authors :
Presenting Author: Taylor Jones, BS – University of Tennessee
Fiona Baumer, MD – Stanford University
Clifford Calley, MD – Dell Children’s Hospital, Austin, TX
Hansel Greiner, MD – Cincinnati Children's Hospital Medical Center
Negar Noorizadeh, PhD – University of Tennessee Health Science Center, Le Bonheur Neuroscience Institute, Le Bonheur Children's Hospital
Marianne Kanaris, BS – University of California, San Francisco, San Francisco, CA
Jackie Varner, MS – Le Bonheur Children's Hospital
Brian Lundstrom, MD, PhD – Mayo Clinic
Mauricio Rodriguez, BS – Stanford University, Palo Alto, CA
Keith Starnes, MD – Mayo Clinic
Alexander Rotenberg, MD PhD – Boston Children's Hospital - Harvard Medical School
Phiroz E. Tarapore, MD – University of California, San Francisco, San Francisco, CA
Anuj Jayakar, MD – Nicklaus Children’s Hospital, Miami, FL
Melissa Tsuboyama, MD – Boston Children's Hospital
Shalini Narayana, PhD – University of Tennessee Health Science Center, Le Bonheur Neuroscience Institute, Le Bonheur Children's Hospital
Rationale: Language localization with transcranial magnetic stimulation (TMS) depends upon the practitioner’s ability to recognize and classify speech errors elicited by TMS during a naming task. Consensus guidelines for speech error classification in children have not been established. The lack of such guidelines is particularly problematic because TMS is being adopted by a growing number of pediatric epilepsy programs and is increasingly incorporated into epilepsy surgical evaluations. Our nationwide consortium of pediatric TMS providers thus aimed to identify current challenges in speech response interpretation by assessing the interrater reliability (IRR) for identifying and classifying TMS-induced speech errors.
Methods: Using stratified random sampling, a group of 37 patients with proportionate age, gender, intellectual ability, error type, and stimulated hemisphere representation was chosen from our 270-patient consortium database. A dataset of 306 video clips showing speech responses during TMS mapping was created. Eight reviewers independently classified each clip into one of the following categories: no error, speech arrest, performance error, semantic error, or muscle artifact. Classifications were evaluated for interrater reliability using Fleiss’ Kappa (k). Reliability was further evaluated by patient age group (5-10 years, 11-14 years, 15+ years), intellectual ability (full-scale IQ (FSIQ) below or above 70), and stimulated hemisphere (left or right).
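For readers unfamiliar with the statistic, the following is a minimal sketch of how Fleiss' Kappa can be computed from per-clip reviewer labels. It is not the consortium's analysis code: the function name `fleiss_kappa`, the input layout (one list of labels per clip, one label per reviewer), and the category strings are assumptions made for illustration.

```python
from collections import Counter

# Category names taken from the Methods text; the exact strings are illustrative.
CATEGORIES = [
    "no error",
    "speech arrest",
    "performance error",
    "semantic error",
    "muscle artifact",
]


def fleiss_kappa(ratings, categories=CATEGORIES):
    """Fleiss' Kappa for a list of clips, each a list of reviewer labels.

    Assumes every clip was labeled by the same number of reviewers
    (hypothetical input layout, not the study's data format).
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(r) != n_raters for r in ratings):
        raise ValueError("each clip needs the same number (>= 2) of labels")

    counts = [Counter(r) for r in ratings]

    # Share of all label assignments that fell into each category.
    p_j = [
        sum(c[cat] for c in counts) / (n_items * n_raters)
        for cat in categories
    ]

    # Observed agreement for each clip: proportion of agreeing reviewer pairs.
    p_i = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]

    p_bar = sum(p_i) / n_items        # mean observed agreement
    p_e = sum(p * p for p in p_j)     # agreement expected by chance
    if p_e == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_bar - p_e) / (1.0 - p_e)
```

Subgroup reliability (by age group, FSIQ, or hemisphere) would follow the same pattern, with the function applied to the subset of clips belonging to each group.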
Results: Overall, reviewers had a moderate level of agreement for the dataset (k = 0.59), with substantial agreement for delineating errors from non-errors (k = 0.68). Responses classified as speech arrest, no error, and semantic error had substantial agreement, while those classified as performance error and muscle artifact had only fair agreement (Table 1). Responses from the youngest age group had the highest IRR, and responses from the higher FSIQ group had a higher IRR than those from the lower FSIQ group. Agreement did not significantly differ with right vs. left hemisphere stimulation (Table 2). Lastly, 37% of the responses in the dataset had less than 80% concordance among reviewers.
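The descriptive quantities reported above can be illustrated with a short sketch. Several points here are assumptions rather than details given in the abstract: that "concordance" means the share of reviewers choosing the most common label for a clip, that the agreement labels follow the commonly used Landis and Koch bands, and that only "no error" counts as a non-error when collapsing categories (how muscle artifact was treated is not stated). All function names are hypothetical.

```python
from collections import Counter


def concordance(labels):
    """Share of reviewers who chose the most common label for one clip (assumed definition)."""
    return Counter(labels).most_common(1)[0][1] / len(labels)


def share_below(ratings, threshold=0.80):
    """Fraction of clips whose concordance falls below the threshold."""
    low = sum(1 for labels in ratings if concordance(labels) < threshold)
    return low / len(ratings)


def collapse_to_binary(labels, non_error="no error"):
    """Collapse detailed categories to error vs. non-error (assumed mapping)."""
    return ["non-error" if label == non_error else "error" for label in labels]


def agreement_label(kappa):
    """Map a kappa value to a Landis and Koch style descriptor (assumed banding)."""
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"
```

Under these assumptions, with eight reviewers a clip reaches 80% concordance only when at least seven agree, since six of eight is 75%. The banding also matches the wording used in the Results: 0.59 maps to "moderate" and 0.68 to "substantial".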
Conclusions: Overall, our study found that reviewers showed substantial agreement in identifying non-errors, speech arrests, and semantic errors, likely reflecting the straightforward presentation of these responses. Conversely, reviewers exhibited more variability in interpreting speech as a performance error. The breadth of the performance error category, which encompasses hesitations, stuttering, and phonological switches, may contribute to the discrepancy in reviewer interpretation. This highlights the need for additional categories and clearer defining characteristics for such errors. Future research should explore the factors behind age- and intellectual ability-related differences in IRR. By highlighting the types of speech responses and clinical characteristics associated with lower IRR, this study revealed key clinical scenarios that must be addressed by consensus guidelines to ensure standardization of pediatric TMS language mapping across the country.
Funding: Pediatric Epilepsy Research Foundation