Authors: Julia Böhlke 1; Dimitri Korsch 1; Paul Bodesheim 1 and Joachim Denzler 1, 2, 3

Affiliations: 1 Computer Vision Group, Friedrich Schiller University Jena, Ernst-Abbe-Platz 2, Jena, Germany; 2 Deutsches Zentrum für Luft- und Raumfahrt e.V. (DLR), Institute for Data Science (IDW), Mälzerstraße 3, Jena, Germany; 3 Michael Stifel Center Jena for Data-Driven and Simulation Science, Ernst-Abbe-Platz 2, Jena, Germany

Keyword(s): Noisy Web Data, Label Noise Filtering, Fine-grained Categorization, Duplicate Detection.

Abstract: Despite the availability of huge annotated benchmark datasets and the potential of transfer learning, i.e., fine-tuning a pre-trained neural network to a specific task, deep learning struggles in applications where no labeled datasets of sufficient size exist. This issue affects fine-grained recognition tasks the most, since correct image data annotations are expensive and require expert knowledge. Nevertheless, the Internet offers a lot of weakly annotated images. In contrast to existing work, we suggest a new lightweight filtering strategy to exploit this source of information without supervision and with minimal additional costs. Our main contributions are specific filter operations that allow the selection of downloaded images to augment a training set. We filter test duplicates to avoid a biased evaluation of the methods, and two types of label noise: cross-domain noise, i.e., images outside any class in the dataset, and cross-class noise, a form of label-swapping noise. We evaluate our suggested filter operations in a controlled environment and demonstrate our methods’ effectiveness with two small annotated seed datasets for moth species recognition. While noisy web images consistently improve classification accuracies, our filtering methods retain a fraction of the data such that high accuracies are achieved with a significantly smaller training dataset.

CC BY-NC-ND 4.0

Paper citation in several formats:
Böhlke, J. ; Korsch, D. ; Bodesheim, P. and Denzler, J. (2021). Lightweight Filtering of Noisy Web Data: Augmenting Fine-grained Datasets with Selected Internet Images. In Proceedings of the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2021) - Volume 5: VISAPP; ISBN 978-989-758-488-6; ISSN 2184-4321, SciTePress, pages 466-477. DOI: 10.5220/0010244704660477

@conference{visapp21,
author={Julia Böhlke and Dimitri Korsch and Paul Bodesheim and Joachim Denzler},
title={Lightweight Filtering of Noisy Web Data: Augmenting Fine-grained Datasets with Selected Internet Images},
booktitle={Proceedings of the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2021) - Volume 5: VISAPP},
year={2021},
pages={466-477},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010244704660477},
isbn={978-989-758-488-6},
issn={2184-4321},
}

TY - CONF
JO - Proceedings of the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2021) - Volume 5: VISAPP
TI - Lightweight Filtering of Noisy Web Data: Augmenting Fine-grained Datasets with Selected Internet Images
SN - 978-989-758-488-6
IS - 2184-4321
AU - Böhlke, J.
AU - Korsch, D.
AU - Bodesheim, P.
AU - Denzler, J.
PY - 2021
SP - 466
EP - 477
DO - 10.5220/0010244704660477
PB - SciTePress