Sequential Spatial Transformer Networks for Salient Object Classification

David Dembinsky, Fatemeh Azimi, Federico Raue, Jörn Hees, Sebastian Palacio, Andreas Dengel

2023

Abstract

Standard classification architectures are designed and trained to achieve high performance on dedicated image classification datasets, which usually contain images with a single object located at the image center. However, their accuracy drops when this assumption is violated, e.g., if the target object is cluttered with background noise or if it is not centered. In this paper, we study salient object classification: a more realistic scenario where there are multiple object instances in the scene, and we are interested in classifying the image based on the label corresponding to the most salient object. Inspired by previous works on Reinforcement Learning and Spatial Transformer Networks, we propose a model equipped with a trainable focus mechanism, which improves classification accuracy. Our experiments on the PASCAL VOC dataset show that the method is capable of increasing the intersection-over-union of the salient object, which improves the classification accuracy by 1.82 pp overall, and 3.63 pp for smaller objects. We provide an analysis of the failure cases, discussing the influence of aspects such as dataset bias and the definition of saliency on the classification output.
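The abstract reports results in terms of intersection-over-union (IoU) without defining it here. As a minimal sketch of the standard metric, assuming axis-aligned boxes given as (x_min, y_min, x_max, y_max), IoU could be computed as follows. The function name `iou`, the `Box` alias, and the pairing of a focus window with an object box are illustrative assumptions, not taken from the paper.

```python
from typing import Tuple

# Assumed box layout for this sketch: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes (hypothetical helper)."""
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    # Two degenerate (zero-area) boxes have no meaningful overlap.
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    focus_window = (0.0, 0.0, 2.0, 2.0)  # made-up coordinates
    object_box = (1.0, 1.0, 3.0, 3.0)    # made-up coordinates
    print(iou(focus_window, object_box))
```

A value of 1.0 means the two boxes coincide and 0.0 means they do not overlap, so a higher IoU indicates that the view is more tightly aligned with the object.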


Paper Citation


in Harvard Style

Dembinsky D., Azimi F., Raue F., Hees J., Palacio S. and Dengel A. (2023). Sequential Spatial Transformer Networks for Salient Object Classification. In Proceedings of the 12th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM, ISBN 978-989-758-626-2, pages 328-335. DOI: 10.5220/0011667100003411


in Bibtex Style

@conference{icpram23,
author={David Dembinsky and Fatemeh Azimi and Federico Raue and Jörn Hees and Sebastian Palacio and Andreas Dengel},
title={Sequential Spatial Transformer Networks for Salient Object Classification},
booktitle={Proceedings of the 12th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},
year={2023},
pages={328-335},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011667100003411},
isbn={978-989-758-626-2},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 12th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM
TI - Sequential Spatial Transformer Networks for Salient Object Classification
SN - 978-989-758-626-2
AU - Dembinsky D.
AU - Azimi F.
AU - Raue F.
AU - Hees J.
AU - Palacio S.
AU - Dengel A.
PY - 2023
SP - 328
EP - 335
DO - 10.5220/0011667100003411