Multimodal and Spectral Degradation Effects on Speech and Emotion Recognition in Adults

Authors

  • Chantel Ritter, MacEwan University

Abstract

Research has shown that individuals with severe hearing loss adapt considerably to the input from their cochlear implants (CIs), especially when implanted at a young age. Despite these gains, hearing restored through sensory prostheses does not match normal acoustic hearing, and its limitations are especially apparent in complex listening situations. CIs preserve important timing information but discard the fine pitch details that inform voice quality and music perception. We examined how speech and emotion recognition in CI listeners can be improved by the addition of informative multimodal (auditory and visual) cues. To simulate the hearing experience of CI listeners, we processed speech with a vocoder, which reduces fine pitch information. In the unimodal auditory condition, normal-hearing adult participants listened to sentence-length vocoded speech created with 4, 8, 16, or 32 bands, corresponding to increasing amounts of spectral (pitch) detail. In the multimodal condition, the vocoded speech was paired with videos of the talker speaking the sentence. Our results show that listeners capitalized on informative visual cues that complemented the acoustic signal, improving both speech and emotion recognition accuracy. The multimodal benefit was greatest under the most difficult listening conditions, that is, at 4 and 8 bands. Our findings also show that the addition of visual information benefited emotion recognition to a greater extent, as spectral degradation hampers the perception of the prosodic details that cue emotion in the voice. These findings can inform rehabilitative practices that incorporate informative multimodal cues to improve communication outcomes for CI listeners.
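The band-limited vocoding described above can be illustrated with a minimal noise-excited channel vocoder. This is a sketch only: the band edges (100–8000 Hz, log-spaced), the Hilbert-based envelope extraction, and all parameter values are assumptions for illustration, not the authors' exact stimulus-processing pipeline.

```python
import numpy as np

def _envelope(x):
    """Amplitude envelope as the magnitude of the analytic (Hilbert) signal."""
    n = len(x)
    spec = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spec * h))

def noise_vocode(signal, sr, n_bands, f_lo=100.0, f_hi=8000.0, seed=0):
    """Noise-excited channel vocoder: split the signal into log-spaced
    frequency bands, extract each band's amplitude envelope, and use the
    envelope to modulate band-limited noise. Fewer bands (e.g. 4) preserve
    less spectral detail than more bands (e.g. 32)."""
    rng = np.random.default_rng(seed)
    n = len(signal)
    freqs = np.fft.rfftfreq(n, 1.0 / sr)
    sig_spec = np.fft.rfft(signal)
    noise_spec = np.fft.rfft(rng.standard_normal(n))
    edges = np.geomspace(f_lo, f_hi, n_bands + 1)  # logarithmic band edges
    out = np.zeros(n)
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (freqs >= lo) & (freqs < hi)
        band = np.fft.irfft(np.where(mask, sig_spec, 0.0), n)
        carrier = np.fft.irfft(np.where(mask, noise_spec, 0.0), n)
        out += _envelope(band) * carrier  # envelope-modulated noise band
    # scale the output to the RMS level of the input
    rms_in = np.sqrt(np.mean(signal ** 2))
    rms_out = np.sqrt(np.mean(out ** 2))
    return out * (rms_in / (rms_out + 1e-12))

# Example: vocode a 1-s amplitude-modulated tone at 4 and 32 bands
sr = 16000
t = np.arange(sr) / sr
sig = np.sin(2 * np.pi * 220 * t) * (0.5 + 0.5 * np.sin(2 * np.pi * 4 * t))
coarse = noise_vocode(sig, sr, n_bands=4)
fine = noise_vocode(sig, sr, n_bands=32)
```

Because only the slowly varying envelope of each band survives and the noise carrier discards fine temporal structure, voice pitch (and with it much emotional prosody) is degraded while the overall speech rhythm remains, which is the manipulation the listening conditions rely on.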

Discipline: Psychology Honours

Faculty Mentor: Dr. Tara Vongpaisal

Published

2017-05-15