Multi-modal Label Retrieval for the Visual Arts: The Case of Iconclass

Nikolay Banar; Walter Daelemans; Mike Kestemont

doi:10.5220/0010390606220629

2021

Abstract

Iconclass is an iconographic classification system from the domain of cultural heritage which is used to annotate subjects represented in the visual arts. In this work, we investigate the feasibility of automatically assigning Iconclass codes to visual artworks using a cross-modal retrieval set-up. We explore the text and image branches of the cross-modal network. In addition, we describe a multi-modal architecture that can jointly capitalize on multiple feature sources: textual features, coming from the titles for these artworks (in multiple languages) and visual features, extracted from photographic reproductions of the artworks. We utilize Iconclass definitions in English as matching labels. We evaluate our approach on a publicly available dataset of artworks (containing English and Dutch titles). Our results demonstrate that, in isolation, textual features strongly outperform visual features, although visual features can still offer a useful complement to purely linguistic features. Moreover, we show the cross-lingual (Dutch-English) strategy to be on par with the monolingual approach (English-English), which opens important perspectives for applications of this approach beyond resource-rich languages.
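The retrieval set-up described in the abstract can be illustrated with a minimal sketch. This is not the authors' architecture: the fusion by concatenation, the cosine scoring, the function names, the toy vectors, and the placeholder label names are all assumptions made for illustration. A real system would use learned encoders and projections into a shared embedding space.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def fuse(text_vec, image_vec):
    """Hypothetical fusion: concatenate the normalized title and image features."""
    return l2_normalize(l2_normalize(text_vec) + l2_normalize(image_vec))

def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))

def rank_labels(query_vec, label_vecs):
    """Return label names sorted from most to least similar to the query."""
    scores = {name: cosine(query_vec, vec) for name, vec in label_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy data (assumed): one artwork with 2-d title features and 2-d image
# features, and three label embeddings living in the same 4-d fused space.
title_features = [1.0, 0.0]
image_features = [0.0, 1.0]
label_embeddings = {
    "label_a": [1.0, 0.0, 0.0, 1.0],  # agrees with both modalities
    "label_b": [0.0, 1.0, 1.0, 0.0],  # agrees with neither
    "label_c": [1.0, 0.0, 0.0, 0.0],  # agrees with the title only
}

ranking = rank_labels(fuse(title_features, image_features), label_embeddings)
print(ranking)
```

In this toy example the label that agrees with both the title and the image features ranks first, the one that agrees with the title only ranks second, and the one that agrees with neither ranks last.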

Paper Citation

in Harvard Style

Banar N., Daelemans W. and Kestemont M. (2021). Multi-modal Label Retrieval for the Visual Arts: The Case of Iconclass. In Proceedings of the 13th International Conference on Agents and Artificial Intelligence - Volume 1: ARTIDIGH, ISBN 978-989-758-484-8, pages 622-629. DOI: 10.5220/0010390606220629

in Bibtex Style

@conference{artidigh21,
author={Nikolay Banar and Walter Daelemans and Mike Kestemont},
title={Multi-modal Label Retrieval for the Visual Arts: The Case of Iconclass},
booktitle={Proceedings of the 13th International Conference on Agents and Artificial Intelligence - Volume 1: ARTIDIGH},
year={2021},
pages={622-629},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010390606220629},
isbn={978-989-758-484-8},
}

in EndNote Style

TY - CONF
JO - Proceedings of the 13th International Conference on Agents and Artificial Intelligence - Volume 1: ARTIDIGH
TI - Multi-modal Label Retrieval for the Visual Arts: The Case of Iconclass
SN - 978-989-758-484-8
AU - Banar N.
AU - Daelemans W.
AU - Kestemont M.
PY - 2021
SP - 622
EP - 629
DO - 10.5220/0010390606220629