Control of speech-related facial movements of an avatar from video

Guillaume Gibert, Yvonne Leung, Catherine J. Stevens

    Research output: Contribution to journal › Article › peer-review

    3 Citations (Scopus)

    Abstract

    Several puppetry techniques have recently been proposed to transfer emotional facial expressions from a user's video to an avatar. Whereas the generation of facial expressions may not be sensitive to small tracking errors, the generation of speech-related facial movements would be severely impaired. Since incongruent facial movements can drastically influence speech perception, we proposed a more effective method to transfer speech-related facial movements from a user to an avatar. After a facial tracking phase, speech articulatory parameters (controlling the jaw and the lips) were determined from the set of landmark positions. Two additional processes calculated the articulatory parameters controlling the eyelids and the tongue from the 2D Discrete Cosine Transform (DCT) coefficients of the eye and inner-mouth images. A speech-in-noise perception experiment was conducted with 25 participants to evaluate the system. An increase in intelligibility was shown for the avatar and human auditory-visual conditions compared to the avatar and human auditory-only conditions, respectively. The results of the avatar auditory-visual presentation differed depending on the vocalic context: all consonants were better perceived in the /a/ vocalic context than in /i/ and /u/, because of the lack of depth information retrieved from the video. This method could be used to accurately animate avatars for hearing-impaired people using information and telecommunication technologies.
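
    As a rough sketch of the feature-extraction step described in the abstract, the snippet below computes low-frequency 2D DCT coefficients from a cropped eye or inner-mouth image patch, the kind of compact descriptor from which the eyelid and tongue parameters were derived. The patch size, the number of retained coefficients, and the linear mapping to an articulatory parameter are illustrative assumptions; the abstract does not specify these details.

    ```python
    import numpy as np
    from scipy.fftpack import dct

    def dct2(patch):
        """Separable 2D DCT-II of a grayscale image patch."""
        return dct(dct(patch, axis=0, norm='ortho'), axis=1, norm='ortho')

    def low_freq_dct_features(patch, k=6):
        """Keep the k x k low-frequency corner of the 2D DCT as a
        compact descriptor of the eye or inner-mouth region."""
        coeffs = dct2(patch.astype(np.float64))
        return coeffs[:k, :k].ravel()

    # Hypothetical usage: a regressor fitted offline on labelled frames maps the
    # descriptor to an articulatory parameter such as eyelid aperture. This
    # mapping is an assumption, not the paper's exact implementation.
    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        eye_patch = rng.random((32, 32))             # stand-in for a cropped eye image
        features = low_freq_dct_features(eye_patch)  # 36-dimensional feature vector
        w = rng.random(features.size)                # hypothetical regression weights
        eyelid_aperture = float(w @ features)        # one control value per video frame
        print(features.shape, eyelid_aperture)
    ```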
    Original language: English
    Pages (from-to): 135-146
    Number of pages: 12
    Journal: Speech Communication
    Volume: 55
    Issue number: 1
    DOIs:
    Publication status: Published - 2013

    Keywords

    • auditory-visual speech
    • expression
    • face tracking
    • facial animation
    • puppetry
    • talking head
