Authors:
Jianfeng Xu; Yuki Nagai; Shinya Takayama and Shigeyuki Sakazawa
Affiliation:
Media and HTML5 Application Laboratory, KDDI R&D Laboratories Inc., Japan
Keyword(s):
Conversational Agents, Multimodal Synchronization, Gesture, Motion Graphs, Dynamic Programming.
Related Ontology Subjects/Areas/Topics:
Agent Models and Architectures; Agents; Artificial Intelligence; Conversational Agents; Enterprise Information Systems; Human-Computer Interaction; Intelligent User Interfaces; Soft Computing; Vision and Perception
Abstract:
Multimodal representation of conversational agents requires accurate synchronization of gesture and speech. To this end, we investigate the key issues in synchronization through a case study of precedent work, using the findings as a practical guideline for our algorithm design, and propose a two-step synchronization approach. Our case study reveals that two issues, duration and timing, play an important role when gesture is manually synchronized with speech. Treating synchronization as a motion synthesis problem rather than the behavior scheduling problem addressed by conventional methods, we first apply a motion graph technique with constraints on gesture structure for coarse synchronization, and then refine the result by shifting and scaling the motion. This approach synchronizes gesture and speech with respect to both duration and timing. We have confirmed that our system makes creating attractive content easier than manual creation of equal quality. In addition, subjective evaluation has demonstrated that the proposed approach achieves more accurate synchronization and higher motion quality than a state-of-the-art method.
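To make the two-step idea concrete, the following Python sketch shows one plausible reading of it: a dynamic-programming search over a motion graph that picks a gesture-clip path whose total duration best matches the speech duration (coarse step), followed by a uniform shift-and-scale refinement that aligns the gesture stroke with a stressed point in speech (fine step). All names and data structures here (Clip, coarse_path, refine) are hypothetical illustrations, not the authors' implementation.

```python
from dataclasses import dataclass

@dataclass
class Clip:
    name: str
    duration: float       # clip length in seconds
    stroke_offset: float  # time of the gesture stroke within the clip
    successors: tuple     # names of clips reachable in the motion graph

def coarse_path(clips, start, speech_duration, max_hops=6):
    """Step 1 (coarse): DP over (clip, hop) states, minimizing
    |path duration - speech duration| along motion-graph edges."""
    best = {}  # (clip name, hops) -> smallest duration error seen
    frontier = {(start, 0): ([start], clips[start].duration)}
    best_path = [start]
    best_err = abs(clips[start].duration - speech_duration)
    for hop in range(max_hops):
        nxt = {}
        for (name, _), (path, dur) in frontier.items():
            for succ in clips[name].successors:
                d = dur + clips[succ].duration
                err = abs(d - speech_duration)
                key = (succ, hop + 1)
                if key not in best or err < best[key]:
                    best[key] = err
                    nxt[key] = (path + [succ], d)
                    if err < best_err:
                        best_err, best_path = err, path + [succ]
        frontier = nxt
    return best_path, best_err

def refine(path, clips, speech_duration, stress_time):
    """Step 2 (fine): scale the synthesized motion to the exact speech
    duration, then shift it so the first stroke hits the stressed point."""
    total = sum(clips[c].duration for c in path)
    scale = speech_duration / total
    stroke = clips[path[0]].stroke_offset * scale
    shift = stress_time - stroke
    return scale, shift

if __name__ == "__main__":
    # Toy motion graph with made-up durations, for illustration only.
    clips = {
        "rest":  Clip("rest", 0.8, 0.4, ("beat", "point")),
        "beat":  Clip("beat", 1.2, 0.5, ("rest",)),
        "point": Clip("point", 1.5, 0.7, ("rest",)),
    }
    path, err = coarse_path(clips, "rest", speech_duration=3.4)
    scale, shift = refine(path, clips, speech_duration=3.4, stress_time=1.1)
    print(path, err, scale, shift)
```

The split mirrors the abstract's division of labor: the graph search only gets the duration approximately right, so the residual error is absorbed by the cheap shift-and-scale pass rather than by a more expensive search.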