Mediapi-RGB: Enabling Technological Breakthroughs in French Sign Language (LSF) Research Through an Extensive Video-Text Corpus

Yanis Ouakrim, Yanis Ouakrim, Hannah Bull, Michèle Gouiffès, Denis Beautemps, Thomas Hueber, Annelies Braffort

2024

Abstract

We introduce Mediapi-RGB, a new dataset of French Sign Language (LSF) along with the first LSF-to-French machine translation model. With 86 hours of video, it the largest LSF corpora with translation. The corpus consists of original content in French Sign Language produced by deaf journalists, and has subtitles in written French aligned to the signing. The current release of Mediapi-RGB is available at the Ortolang corpus repository (https://www.ortolang.fr/workspaces/mediapi-rgb), and can be used for academic research purposes. The test and validation sets contain 13 and 7 hours of video respectively. The training set contains 66 hours of video that will be released progressively until December 2024. Additionally, the current release contains skeleton keypoints, sign temporal segmentation, spatio-temporal features and subtitles for all the videos in the train, validation and test sets, as well as a suggested vocabulary of nouns for evaluation purposes. In addition, we present the results obtained on this corpus with the first LSF-to-French translation baseline to give an overview of the possibilities offered by this corpus of unprecedented caliber for LSF. Finally, we suggest potential technological and linguistic applications for this new video-text dataset.

Download


Paper Citation


in Harvard Style

Ouakrim Y., Bull H., Gouiffès M., Beautemps D., Hueber T. and Braffort A. (2024). Mediapi-RGB: Enabling Technological Breakthroughs in French Sign Language (LSF) Research Through an Extensive Video-Text Corpus. In Proceedings of the 19th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 2: VISAPP; ISBN 978-989-758-679-8, SciTePress, pages 139-148. DOI: 10.5220/0012372600003660


in Bibtex Style

@conference{visapp24,
author={Yanis Ouakrim and Hannah Bull and Michèle Gouiffès and Denis Beautemps and Thomas Hueber and Annelies Braffort},
title={Mediapi-RGB: Enabling Technological Breakthroughs in French Sign Language (LSF) Research Through an Extensive Video-Text Corpus},
booktitle={Proceedings of the 19th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 2: VISAPP},
year={2024},
pages={139-148},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012372600003660},
isbn={978-989-758-679-8},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 19th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 2: VISAPP
TI - Mediapi-RGB: Enabling Technological Breakthroughs in French Sign Language (LSF) Research Through an Extensive Video-Text Corpus
SN - 978-989-758-679-8
AU - Ouakrim Y.
AU - Bull H.
AU - Gouiffès M.
AU - Beautemps D.
AU - Hueber T.
AU - Braffort A.
PY - 2024
SP - 139
EP - 148
DO - 10.5220/0012372600003660
PB - SciTePress