Analysis of Speech Synthesis Technology: From Deep Learning to Airflow Modeling

Tian Xie

doi:10.5220/0013702300004670

2025

Abstract

With the advancement of artificial intelligence, speech synthesis technology has been widely applied across many fields. Deep learning-based speech synthesis has attracted significant attention because it can automatically learn complex acoustic features, which greatly improves the fluency and naturalness of synthesized speech. This paper reviews deep learning-based speech synthesis technology, with a particular focus on its application to falsetto and voice transformation tasks. By examining the principles of human speech production and the development of speech synthesis technology, it analyzes the advantages and limitations of current deep learning models and proposes an innovative method that integrates articulatory organ parameters with acoustic parameters. The paper also discusses the potential of airflow simulation in physical modeling, especially its prospects for generating personalized voices and handling voice transformation and falsetto tasks. Finally, it outlines future research directions, including optimizing deep learning models, integrating physical modeling techniques, and fostering interdisciplinary research, with the aim of advancing speech synthesis technology toward greater personalization and richer emotional expression.

Paper Citation

in Harvard Style

Xie T. (2025). Analysis of Speech Synthesis Technology: From Deep Learning to Airflow Modeling. In Proceedings of the 2nd International Conference on Data Science and Engineering - Volume 1: ICDSE; ISBN 978-989-758-765-8, SciTePress, pages 581-585. DOI: 10.5220/0013702300004670

in Bibtex Style

@conference{icdse25,
author={Tian Xie},
title={Analysis of Speech Synthesis Technology: From Deep Learning to Airflow Modeling},
booktitle={Proceedings of the 2nd International Conference on Data Science and Engineering - Volume 1: ICDSE},
year={2025},
pages={581-585},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013702300004670},
isbn={978-989-758-765-8},
}

in EndNote Style

TY - CONF
JO - Proceedings of the 2nd International Conference on Data Science and Engineering - Volume 1: ICDSE
TI - Analysis of Speech Synthesis Technology: From Deep Learning to Airflow Modeling
SN - 978-989-758-765-8
AU - Xie T.
PY - 2025
SP - 581
EP - 585
DO - 10.5220/0013702300004670
PB - SciTePress