Speaking Digital Person Video Generation Methods Review Report Talking Head

Qinghua Yu

2025

Abstract

In recent years, the rapid development of deep learning has significantly advanced virtual digital human technology, with notable breakthroughs in speaking digital human video generation. Research on this technology has shown great potential and value in application scenarios such as video translation, film production, and virtual assistants. This paper systematically summarizes the main methods and research progress in voice-driven talking head video generation, and discusses the key technologies, dataset construction, and evaluation strategies. At the key technical level, Generative Adversarial Networks (GANs), Diffusion Models (DMs), and Neural Radiance Fields (NeRF) play a central role. At the same time, the size and diversity of the dataset have a decisive impact on model training, while better evaluation strategies contribute to a more objective and comprehensive measurement of the quality of the generated results. Although the technology has made significant progress, many challenges and opportunities remain. In the future, continued innovation in this field is expected to drive further technological development and bring more convenience and value to society.

Paper Citation


in Harvard Style

Yu Q. (2025). Speaking Digital Person Video Generation Methods Review Report Talking Head. In Proceedings of the 2nd International Conference on Data Science and Engineering - Volume 1: ICDSE; ISBN 978-989-758-765-8, SciTePress, pages 502-508. DOI: 10.5220/0013700200004670


in Bibtex Style

@conference{icdse25,
author={Qinghua Yu},
title={Speaking Digital Person Video Generation Methods Review Report Talking Head},
booktitle={Proceedings of the 2nd International Conference on Data Science and Engineering - Volume 1: ICDSE},
year={2025},
pages={502-508},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013700200004670},
isbn={978-989-758-765-8},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 2nd International Conference on Data Science and Engineering - Volume 1: ICDSE
TI - Speaking Digital Person Video Generation Methods Review Report Talking Head
SN - 978-989-758-765-8
AU - Yu Q.
PY - 2025
SP - 502
EP - 508
DO - 10.5220/0013700200004670
PB - SciTePress