Analysis of Speech Synthesis Technology: From Deep Learning to Airflow Modeling
Tian Xie
2025
Abstract
With the advancement of artificial intelligence, speech synthesis technology has been widely applied across multiple fields. Deep learning-based speech synthesis has gained significant attention due to its ability to automatically learn complex acoustic features, greatly improving speech fluency and naturalness. This paper reviews deep learning-based speech synthesis technology, with a particular focus on its applications in falsetto and voice transformation tasks. By exploring the principles of human speech production and the development of speech synthesis technology, this paper analyzes the advantages and limitations of current deep learning models and proposes an innovative method that integrates articulatory organ parameters with acoustic parameters. Furthermore, the paper discusses the potential of airflow simulation in physical modeling, especially its application prospects in generating personalized voices and handling voice transformation and falsetto tasks. Finally, this paper outlines future research directions, including optimizing deep learning models, integrating physical modeling techniques, and fostering interdisciplinary research, aiming to advance speech synthesis technology towards greater personalization and richer emotional expression.
Paper Citation
in Harvard Style
Xie T. (2025). Analysis of Speech Synthesis Technology: From Deep Learning to Airflow Modeling. In Proceedings of the 2nd International Conference on Data Science and Engineering - Volume 1: ICDSE; ISBN 978-989-758-765-8, SciTePress, pages 581-585. DOI: 10.5220/0013702300004670
in BibTeX Style
@conference{icdse25,
author={Tian Xie},
title={Analysis of Speech Synthesis Technology: From Deep Learning to Airflow Modeling},
booktitle={Proceedings of the 2nd International Conference on Data Science and Engineering - Volume 1: ICDSE},
year={2025},
pages={581-585},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013702300004670},
isbn={978-989-758-765-8},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 2nd International Conference on Data Science and Engineering - Volume 1: ICDSE
TI - Analysis of Speech Synthesis Technology: From Deep Learning to Airflow Modeling
SN - 978-989-758-765-8
AU - Xie T.
PY - 2025
SP - 581
EP - 585
DO - 10.5220/0013702300004670
PB - SciTePress