Cross-lingual Detection of Dysphonic Speech for Dutch and Hungarian Datasets

Dávid Sztahó, Miklós Tulics, Jinzi Qi, Hugo Van Hamme, Klára Vicsi

2022

Abstract

Dysphonic voices can be detected using features derived from speech samples. Work on this topic usually deals with mono-lingual experiments using a speech dataset in a single language. The present paper targets extension to a cross-lingual scenario. A Hungarian and a Dutch speech dataset are used. Automatic binary separation of normal and dysphonic speech and dysphonia severity level estimation are performed and evaluated by various metrics. Various speech features are calculated, some specific to an entire speech sample and some to a given phoneme. Feature selection and model training are done on the Hungarian dataset, and the models are evaluated on the Dutch dataset. The results show that cross-lingual detection of dysphonic speech is possible on the applied corpora with acceptable generalization ability, and that features calculated on phoneme-level parts of speech can improve the results. On the cross-lingual test set, the highest F1-scores for classification are 0.86 and 0.81 for feature sets with the vowel /E/ included and excluded, respectively, and the highest Pearson correlations for severity prediction are 0.72 and 0.65 for feature sets with the vowel /E/ included and excluded, respectively.
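
The two metrics reported in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `f1_score` and `pearson_r` and all toy values below are assumptions made for illustration, and the abstract does not state which tools were used. The sketch only shows how an F1-score (for the normal vs. dysphonic decision) and a Pearson correlation (for severity estimation) would be computed from predictions on a held-out test language.

```python
import math


def f1_score(y_true, y_pred, positive=1):
    """F1-score of the positive class (here: 1 = dysphonic, 0 = normal)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def pearson_r(x, y):
    """Pearson correlation between reference and predicted severity."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


# Hypothetical toy labels, NOT results from the paper.
true_labels = [1, 1, 1, 0, 0, 1]       # reference labels on the test language
predicted_labels = [1, 1, 0, 0, 1, 1]  # output of a model trained on the other language
toy_f1 = f1_score(true_labels, predicted_labels)

# Hypothetical toy severity scores, NOT results from the paper.
true_severity = [0.0, 1.0, 2.0, 3.0]
predicted_severity = [0.5, 1.0, 2.5, 3.0]
toy_r = pearson_r(true_severity, predicted_severity)
```

In a cross-lingual setup such as the one described, both functions would be applied only to predictions on the test language, so that the scores reflect generalization across languages rather than fit to the training data.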

Paper Citation


in Harvard Style

Sztahó D., Tulics M., Qi J., Van Hamme H. and Vicsi K. (2022). Cross-lingual Detection of Dysphonic Speech for Dutch and Hungarian Datasets. In Proceedings of the 15th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 2: BIOSIGNALS, ISBN 978-989-758-552-4, pages 215-220. DOI: 10.5220/0010890200003123


in Bibtex Style

@conference{biosignals22,
author={Dávid Sztahó and Miklós Tulics and Jinzi Qi and Hugo Van Hamme and Klára Vicsi},
title={Cross-lingual Detection of Dysphonic Speech for Dutch and Hungarian Datasets},
booktitle={Proceedings of the 15th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 2: BIOSIGNALS},
year={2022},
pages={215-220},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010890200003123},
isbn={978-989-758-552-4},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 15th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 2: BIOSIGNALS
TI - Cross-lingual Detection of Dysphonic Speech for Dutch and Hungarian Datasets
SN - 978-989-758-552-4
AU - Sztahó D.
AU - Tulics M.
AU - Qi J.
AU - Van Hamme H.
AU - Vicsi K.
PY - 2022
SP - 215
EP - 220
DO - 10.5220/0010890200003123