Investigating the role of familiar face and voice cues in speech processing in noise

Jeesun Kim, Sonya Karisma, Vincent Aubanel, Chris Davis

Research output: Chapter in Book / Conference Paper › Conference Paper › peer-review

Abstract

The speech of a familiar talker is better recognized in noise than that of an unfamiliar one, suggesting that listeners access talker-specific models to assist with degraded input. This study investigated whether a talker model could be accessed by presenting the face of a talker. In the experiment, participants were trained to recognize three talkers' faces and voices to ceiling level. Participants were then given a speech-in-noise recognition task consisting of four talker conditions: familiar face then familiar voice; unfamiliar face then familiar voice; familiar face then unfamiliar voice; and unfamiliar face then unfamiliar voice. A talker familiarity effect was found, i.e., speech perception was more accurate in the familiar face and familiar voice condition than in all other conditions. A familiar voice did not produce a talker familiarity effect when paired with an unfamiliar face. The familiar face and unfamiliar voice condition had the poorest performance, indicating that pairing a familiar face with an unfamiliar voice had a disruptive effect. The results suggest that listeners develop a talker model that includes details of both the voice and the face, and that accessing this model can in some circumstances be wholly determined by face cues.
Original language: English
Title of host publication: Proceedings of INTERSPEECH 2018, 2-6 September 2018, Hyderabad, India
Publisher: International Speech Communication Association
Pages: 2276-2279
Number of pages: 4
Publication status: Published - 2018
Event: INTERSPEECH (Conference)
Duration: 2 Sept 2018 → …

Publication series

ISSN (Print): 1990-9772

Keywords

  • face perception
  • noise
  • speech perception
