Authors:
Xuan-Nam Cao 1,2; Quoc-Huy Trinh 1,2; Quoc-Anh Do-Nguyen 1,2; Van-Son Ho 1,2; Hoai-Thuong Dang 1,2 and Minh-Triet Tran 1,2
Affiliations:
1 Faculty of Information Technology, University of Science, Ho Chi Minh City, Vietnam
2 Vietnam National University, Ho Chi Minh City, Vietnam
Keyword(s):
Landmark Generation, Talking Head, Dual-LSTM, Acoustic Features.
Abstract:
Generating realistic talking faces from audio input is a challenging task with broad applications in fields such as film production, gaming, and virtual reality. Previous approaches, employing a two-stage process of converting audio to landmarks and then landmarks to a face, have shown promise in creating vivid videos. However, they still struggle to maintain consistency because information from previous audio frames is poorly connected to the current audio frame, leading to the generation of unnatural landmarks. To address this issue, we propose EAPC, a framework that fuses features from previous audio frames with the current audio feature and the current facial landmarks. Additionally, we introduce a Dual-LSTM module to enhance emotion control. In this way, our framework strengthens the temporal and emotional modeling of the audio input, allowing our model to capture speech dynamics and produce more coherent animations. Extensive experiments demonstrate that our method generates consistent landmarks, resulting in more realistic and synchronized faces, and achieves results competitive with state-of-the-art methods. The implementation of our method will be made publicly available upon publication.
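To illustrate the idea described in the abstract, below is a minimal, hypothetical sketch of a dual-LSTM landmark predictor: one LSTM aggregates the current and previous audio frames into a temporal context, and a second LSTM fuses that context with the current landmarks before regressing the next landmark positions. All layer sizes, module names, and the fusion scheme are assumptions for illustration only, not the authors' actual EAPC implementation.

```python
# Hypothetical sketch of a dual-LSTM audio-to-landmark predictor.
# Dimensions and fusion strategy are illustrative assumptions.
import torch
import torch.nn as nn

class DualLSTMSketch(nn.Module):
    def __init__(self, audio_dim=128, landmark_dim=136, hidden=256):
        super().__init__()
        # First LSTM: accumulates context across previous and current audio frames.
        self.audio_lstm = nn.LSTM(audio_dim, hidden, batch_first=True)
        # Second LSTM: refines the audio context jointly with the current
        # landmarks, intended here to carry emotion-related dynamics.
        self.emotion_lstm = nn.LSTM(hidden + landmark_dim, hidden, batch_first=True)
        # Linear head regresses a landmark vector (e.g., 68 points x 2 coords).
        self.head = nn.Linear(hidden, landmark_dim)

    def forward(self, audio_seq, landmark_seq):
        # audio_seq: (batch, T, audio_dim); landmark_seq: (batch, T, landmark_dim)
        ctx, _ = self.audio_lstm(audio_seq)
        fused, _ = self.emotion_lstm(torch.cat([ctx, landmark_seq], dim=-1))
        return self.head(fused)  # per-frame predicted landmarks

model = DualLSTMSketch()
audio = torch.randn(2, 10, 128)      # 2 clips, 10 frames of audio features
landmarks = torch.randn(2, 10, 136)  # matching landmark sequences
out = model(audio, landmarks)
print(out.shape)  # torch.Size([2, 10, 136])
```

Feeding the whole audio window (rather than a single frame) through the first LSTM is what lets the predictor stay consistent across frames, which is the misconnection issue the abstract identifies in prior two-stage pipelines.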