Authors:
R. Del Chiaro; A. Bagdanov and A. Del Bimbo
Affiliation:
Media Integration and Communication Center, University of Florence, Italy
Keyword(s):
Cultural Heritage, Computer Vision, Instance Recognition, Image Categorization, Webly-supervised Learning.
Related Ontology Subjects/Areas/Topics:
Applications and Services; Computer Vision, Visualization and Computer Graphics; Imaging for Cultural Heritage (Modeling/Simulation, Virtual Restoration)
Abstract:
This paper describes NoisyArt, a dataset designed to support research on webly-supervised recognition of artworks. The dataset consists of more than 90,000 images in more than 3,000 webly-supervised classes, together with a subset of 200 classes with verified test images. Candidate artworks are identified using publicly available metadata repositories, and images are automatically acquired using Google Image and Flickr search. Document embeddings are also provided for short descriptions of all artworks. NoisyArt is intended for work on webly-supervised artwork instance recognition, zero-shot learning, and other approaches to visual recognition of cultural heritage objects. Baseline experimental results are given using pretrained Convolutional Neural Network (CNN) features and a shallow classifier architecture. Experiments are also performed using a variety of techniques for identifying and mitigating label noise in webly-supervised training data.