Human-like emotion recognition: multi-label learning from noisy labeled audio-visual expressive speech

Yelin Kim, Jeesun Kim

Research output: Chapter in Book / Conference Paper (peer-reviewed)

23 Citations (Scopus)

Abstract

To capture variation in categorical emotion recognition by human perceivers, we propose a multi-label learning and evaluation method that can employ the distribution of emotion labels generated by every human annotator. In contrast to the traditional accuracy-based performance measure for categorical emotion labels, our proposed learning and inference algorithms use cross entropy to directly compare human and machine emotion label distributions. Our audio-visual emotion recognition experiments demonstrate that emotion recognition can benefit from a multi-label representation that fully uses both clear and ambiguous emotion data. Further, the results demonstrate that this emotion recognition system can (i) learn the distribution of human annotators directly; (ii) capture the human-like label noise in emotion perception; and (iii) identify infrequent or uncommon emotional expressions (such as frustration) from inconsistently labeled emotion data, which were often ignored in previous emotion recognition systems.
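
The core idea of the abstract is to replace hard majority-vote targets with the full distribution of annotator labels and train against it with cross entropy. A minimal sketch of that loss in PyTorch follows; the five-way emotion label set, the 128-dimensional audio-visual feature vector, and the two-layer classifier are hypothetical stand-ins for illustration, not the authors' implementation.

import torch
import torch.nn as nn

# Hypothetical label set; the paper's actual categories may differ.
EMOTIONS = ["angry", "happy", "sad", "neutral", "frustrated"]

# Example: five annotators labeled one utterance; 3 chose "angry", 2 "frustrated".
annotator_votes = torch.tensor([3.0, 0.0, 0.0, 0.0, 2.0])
target_dist = annotator_votes / annotator_votes.sum()  # soft label [0.6, 0, 0, 0, 0.4]

# Stand-in classifier over precomputed audio-visual features (dimension assumed).
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, len(EMOTIONS)))
features = torch.randn(1, 128)  # placeholder for real audio-visual features

log_probs = torch.log_softmax(model(features), dim=-1)

# Cross entropy between the human label distribution q and the model distribution p:
# H(q, p) = -sum_k q_k * log p_k, which reduces to standard CE when q is one-hot.
loss = -(target_dist * log_probs.squeeze(0)).sum()
loss.backward()

Because the target keeps both majority and minority votes, ambiguous utterances still contribute gradient toward their minority emotions, which is how such a system can surface infrequent categories like frustration.
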
Original language: English
Title of host publication: Proceedings 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): April 15-20, 2018, Calgary, Alberta, Canada
Publisher: IEEE
Pages: 5104-5108
Number of pages: 5
ISBN (Print): 9781538646588
DOIs
Publication status: Published - 2018
Event: ICASSP (Conference)
Duration: 15 Apr 2018 → 20 Apr 2018

Publication series

ISSN (Print): 2379-190X

Keywords

  • algorithms
  • emotion recognition
