TY - CONF
T1 - Multimodal speech animation from electromagnetic articulography data
AU - Gibert, Guillaume
AU - Attina, Virginie
AU - Tiede, Mark
AU - Bundgaard-Nielsen, Rikke
AU - Kroos, Christian
AU - Kasisopa, Benjawan
AU - Vatikiotis-Bateson, Eric
AU - Best, Catherine T.
PY - 2012
Y1 - 2012
AB - Virtual humans have become part of our everyday life (movies, the internet, and computer games). Even though they are increasingly realistic, their speech capabilities are usually limited and often not coherent and/or not synchronous with the corresponding acoustic signal. We describe a method for converting a virtual human avatar (animated through key frames and interpolation) into a more naturalistic talking head. Speech capabilities were added to the avatar using real speech production data. Electromagnetic articulography (EMA) data provided lip, jaw and tongue trajectories of a speaker involved in face-to-face communication. An articulatory model driving jaw, lip and tongue movements was built. By constraining the key frame values, a corresponding high-definition tongue articulatory model was developed. The resulting avatar was able to produce visible and partly occluded facial speech movements coherent and synchronous with the acoustic signal.
KW - avatars (virtual reality)
KW - speech synthesis
UR - http://handle.uws.edu.au:8081/1959.7/521451
UR - http://www.eurasip.org/Proceedings/Eusipco/Eusipco2012/Conference/index.html
M3 - Conference Paper
SP - 2807
EP - 2811
BT - Proceedings of the 20th European Signal Processing Conference (EUSIPCO): Palace of the Parliament, August 27-31, 2012, Bucharest, Romania
PB - IEEE
T2 - European Signal Processing Conference
Y2 - 27 August 2012
ER -