Introduction to the special issue on auditory-visual expressive speech and gesture in humans and machines

Jeesun Kim, Chris Davis, Gérard Bailly

Research output: Contribution to journal › Article › peer-review

Abstract

We speak to express ourselves. Sometimes words can capture what we mean; sometimes we mean more than can be said. This is where our visible gestures (those dynamic oscillations of our gaze, face, head, hands, arms and bodies) help. Not only do these co-verbal visual signals help express our intentions, attitudes and emotions, they also help us engage with our conversational partners to get our message across. Understanding how and when a message is supplemented, shaped and changed by auditory and visual signals is crucial for a science ultimately interested in the correct interpretation of transmitted meaning. This special issue highlights research articles that explore co-verbal and nonverbal signals, a key topic in speech communication, since these are crucial ingredients in the interpretation of meaning. That is, the meaning of speech is calibrated, augmented and even changed by co-verbal behaviours and gestures, including the talker's facial expression, eye contact, gaze direction, arm movements, hand gestures, body motion and orientation, posture, proximity, physical contact, and so on. Understanding expressive signals is a vital step towards developing machines that can properly decipher intention and engage as social agents. The special issue is divided into three parts: auditory-visual speech perception; characterization and perception of auditory-visual prosody; and computer-generated auditory-visual speech. Below, we introduce these papers with a brief review of relevant issues and previous studies, where needed.
Original language: English
Pages (from-to): 63-67
Number of pages: 5
Journal: Speech Communication
Volume: 98
DOIs
Publication status: Published - 2018

Keywords

  • machine learning
  • nonverbal communication
  • oral communication
  • speech and gesture
  • speech perception

