Paper

Authors: Ahmed Azab 1; Ahmed Zaky 2, 3; Tetsuji Ogawa 4 and Walid Gomaa 5, 1

Affiliations: 1 Computer Science and Engineering, Egypt-Japan University of Science and Technology, Alexandria, Egypt; 2 Computer Science and Information Technology Programs (CSIT), Egypt-Japan University of Science and Technology, Egypt; 3 Shoubra Faculty of Engineering, Benha University, Benha, Egypt; 4 Department of Communications and Computer Engineering, Waseda University, Tokyo, Japan; 5 Faculty of Engineering, Alexandria University, Alexandria, Egypt

Keyword(s): Natural Language Processing, Text-To-Speech, Egyptian Arabic.

Abstract: This paper presents the development and evaluation of Masry, an end-to-end system designed to synthesize Egyptian Arabic speech. The proposed approach builds on the Tacotron speech synthesis models, Tacotron1 and Tacotron2, each paired with a vocoder: Griffin-Lim for Tacotron1 and HiFi-GAN for Tacotron2. By synthesizing waveforms from mel-spectrograms, Masry offers a complete pipeline for generating natural and expressive Egyptian Arabic speech. To train and validate the system, we construct a dataset of a male speaker reading standard written pieces and news content in Egyptian Arabic. The recordings are sampled at 44100 Hz to support consistency and richness in the synthesized speech output. The system's performance was carefully evaluated with several metrics, with a particular focus on the Mean Opinion Score (MOS). The experimental results showed that Tacotron2 outperformed Tacotron1, yielding a MOS of 4.48 compared to 3.64. This underlines the system's capacity to capture and reproduce the nuances of Egyptian Arabic speech more effectively. In addition, the evaluation included word and character error rates (WER and CER), which give a quantitative measure of the accuracy of the synthesized speech.
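
The abstract names WER and CER without defining them. As a rough illustration only, both are commonly defined as the edit distance between a reference text and a transcript of the synthesized audio, divided by the reference length (in words for WER, in characters for CER). The sketch below follows that common definition; it is not taken from the paper, and the function names, the whitespace tokenization, and the choice to count spaces as characters are all assumptions made for the example.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences
    (minimum number of substitutions, insertions, and deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion from ref
                            curr[j - 1] + 1,      # insertion into ref
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words.
    Assumes whitespace tokenization (hypothetical choice for this sketch)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance / reference length.
    Spaces are counted as characters here (an assumption)."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

For example, `wer("the cat sat", "the cat sit")` counts one substituted word out of three reference words. In practice the hypothesis would be a transcript of the synthesized audio, and text normalization (for Arabic, handling of diacritics in particular) can change the scores considerably.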

CC BY-NC-ND 4.0

Paper citation in several formats:
Azab, A.; Zaky, A.; Ogawa, T. and Gomaa, W. (2023). Masry: A Text-to-Speech System for the Egyptian Arabic. In Proceedings of the 20th International Conference on Informatics in Control, Automation and Robotics - Volume 2: ICINCO; ISBN 978-989-758-670-5; ISSN 2184-2809, SciTePress, pages 219-226. DOI: 10.5220/0012244300003543

@conference{icinco23,
author={Ahmed Azab and Ahmed Zaky and Tetsuji Ogawa and Walid Gomaa},
title={Masry: A Text-to-Speech System for the Egyptian Arabic},
booktitle={Proceedings of the 20th International Conference on Informatics in Control, Automation and Robotics - Volume 2: ICINCO},
year={2023},
pages={219--226},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012244300003543},
isbn={978-989-758-670-5},
issn={2184-2809},
}

TY - CONF

JO - Proceedings of the 20th International Conference on Informatics in Control, Automation and Robotics - Volume 2: ICINCO
TI - Masry: A Text-to-Speech System for the Egyptian Arabic
SN - 978-989-758-670-5
IS - 2184-2809
AU - Azab, A.
AU - Zaky, A.
AU - Ogawa, T.
AU - Gomaa, W.
PY - 2023
SP - 219
EP - 226
DO - 10.5220/0012244300003543
PB - SciTePress