Learning Spatial Relations with a Standard Convolutional Neural Network

Kevin Swingler, Mandy Bath

2020

Abstract

This paper shows that a standard convolutional neural network (CNN) without recurrent connections can learn general spatial relationships between different objects in an image. A dataset was constructed by placing objects from the Fashion-MNIST dataset onto a larger canvas in various relational locations (for example, trousers left of a shirt, both above a bag). CNNs were trained on two types of task: naming the objects and their spatial relationships, and answering relational questions such as “Where is the shoe in relation to the bag?”. The models achieved above 80% accuracy on test data and also generalised to spatial combinations that had been intentionally excluded from the training data.
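The dataset construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `compose_pair`, the 84x84 canvas size, the relation names, and the choice to place the two tiles side by side are all assumptions made for illustration. The abstract states only that Fashion-MNIST objects were placed on a larger canvas in relational locations. Any 28x28 array can stand in for a Fashion-MNIST image here.

```python
import numpy as np

TILE = 28  # Fashion-MNIST images are 28x28 grayscale


def compose_pair(img_a, img_b, relation, canvas_size=84, rng=None):
    """Place img_a relative to img_b on a blank canvas.

    relation is one of 'left_of', 'right_of', 'above', 'below' and is read as
    "img_a <relation> img_b". Canvas size and relation names are hypothetical.
    """
    rng = np.random.default_rng() if rng is None else rng
    canvas = np.zeros((canvas_size, canvas_size), dtype=img_a.dtype)

    # Random offset along the relation axis; the two tiles sit side by side.
    shift = int(rng.integers(0, canvas_size - 2 * TILE + 1))
    # Random offset along the other axis, shared by both tiles.
    cross = int(rng.integers(0, canvas_size - TILE + 1))

    if relation in ("left_of", "right_of"):
        first, second = (img_a, img_b) if relation == "left_of" else (img_b, img_a)
        canvas[cross:cross + TILE, shift:shift + TILE] = first
        canvas[cross:cross + TILE, shift + TILE:shift + 2 * TILE] = second
    elif relation in ("above", "below"):
        first, second = (img_a, img_b) if relation == "above" else (img_b, img_a)
        canvas[shift:shift + TILE, cross:cross + TILE] = first
        canvas[shift + TILE:shift + 2 * TILE, cross:cross + TILE] = second
    else:
        raise ValueError(f"unknown relation: {relation}")
    return canvas
```

Holding out a combination from training, as the abstract describes, would then amount to skipping chosen (class, relation, class) triples when generating training canvases and using them only at test time.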

Paper Citation


in Harvard Style

Swingler K. and Bath M. (2020). Learning Spatial Relations with a Standard Convolutional Neural Network. In Proceedings of the 12th International Joint Conference on Computational Intelligence (IJCCI 2020) - Volume 1: NCTA; ISBN 978-989-758-475-6, SciTePress, pages 464-470. DOI: 10.5220/0010170204640470


in Bibtex Style

@conference{ncta20,
author={Kevin Swingler and Mandy Bath},
title={Learning Spatial Relations with a Standard Convolutional Neural Network},
booktitle={Proceedings of the 12th International Joint Conference on Computational Intelligence (IJCCI 2020) - Volume 1: NCTA},
year={2020},
pages={464-470},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010170204640470},
isbn={978-989-758-475-6},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 12th International Joint Conference on Computational Intelligence (IJCCI 2020) - Volume 1: NCTA
TI - Learning Spatial Relations with a Standard Convolutional Neural Network
SN - 978-989-758-475-6
AU - Swingler K.
AU - Bath M.
PY - 2020
SP - 464
EP - 470
DO - 10.5220/0010170204640470
PB - SciTePress