Few-shot Linguistic Grounding of Visual Attributes and Relations using Gaussian Kernels

Daniel Koudouna; Kasim Terzić

Research.Publish.Connect.

*Please fill out at least one Field. *Value must be an number!

Title:
ISBN:
Year:
Acronym:
Subject:

Advanced Search Proceedings Search

If you're looking for an exact phrase use quotation marks on text fields.

*Please fill out at least one Field.

Title:
Author:
Affiliation:
Subject:

Advanced Search Papers Search

If you're looking for an exact phrase use quotation marks on text fields.

*Please fill out at least one Field.

Name:
Affiliation:
Country:
Conference:
Subject:

Advanced Search Authors Search

If you're looking for an exact phrase use quotation marks on text fields.

*Please fill out at least one Field.

Name:
Country:
Subject:

Advanced Search Affiliations Search

If you're looking for an exact phrase use quotation marks on text fields.

Proceedings

Proceedings Search *Please fill out at least one Field. *Value must be an number!

Title:
ISBN:
Year:
Acronym:
Subject:

Advanced Search Proceedings Search

If you're looking for an exact phrase use quotation marks on text fields.

Papers

Papers Search *Please fill out at least one Field.

Title:
Author:
Affiliation:
Subject:

Advanced Search Papers Search

If you're looking for an exact phrase use quotation marks on text fields.

Authors

Authors Search *Please fill out at least one Field.

Name:
Affiliation:
Country:
Conference:
Subject:

Advanced Search Authors Search

If you're looking for an exact phrase use quotation marks on text fields.

Advanced Search

Paper

Few-shot Linguistic Grounding of Visual Attributes and Relations using Gaussian Kernels

Topics: Categorization and Scene Understanding; Cognitive Models for Interpretation, Integration and Control; Few Shot Learning; Machine Learning Technologies for Vision

In Proceedings of the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 5 VISAPP: VISAPP, 146-156, 2021

Authors: Daniel Koudouna and Kasim Terzić

Affiliation: School of Computer Science, University of St Andrews, U.K.

Keyword(s): Few-shot Learning, Learning Models, Attribute Learning, Relation Learning, Scene Understanding.

Abstract: Understanding complex visual scenes is one of fundamental problems in computer vision, but learning in this domain is challenging due to the inherent richness of the visual world and the vast number of possible scene configurations. Current state of the art approaches to scene understanding often employ deep networks which require large and densely annotated datasets. This goes against the seemingly intuitive learning abilities of humans and our ability to generalise from few examples to unseen situations. In this paper, we propose a unified framework for learning visual representation of words denoting attributes such as “blue” and relations such as “left of” based on Gaussian models operating in a simple, unified feature space. The strength of our model is that it only requires a small number of weak annotations and is able to generalize easily to unseen situations such as recognizing object relations in unusual configurations. We demonstrate the effectiveness of our model on the p redicate detection task. Our model is able to outperform the state of the art on this task in both the normal and zero-shot scenarios, while training on a dataset an order of magnitude smaller. (More)

CC BY-NC-ND 4.0

Guest: Register as new SciTePress user now for free.

SciTePress user: please login.

My Papers

You are not signed in, therefore limits apply to your IP address 18.216.239.46

In the current month:

Recent papers: 100 available of 100 total

2⁺ years older papers: 200 available of 200 total

Paper citation in several formats:

Koudouna, D. and Terzić, K. (2021). Few-shot Linguistic Grounding of Visual Attributes and Relations using Gaussian Kernels. In Proceedings of the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2021) - Volume 5: VISAPP; ISBN 978-989-758-488-6; ISSN 2184-4321, SciTePress, pages 146-156. DOI: 10.5220/0010261301460156

@conference{visapp21,
author={Daniel Koudouna. and Kasim Terzić.},
title={Few-shot Linguistic Grounding of Visual Attributes and Relations using Gaussian Kernels},
booktitle={Proceedings of the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2021) - Volume 5: VISAPP},
year={2021},
pages={146-156},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010261301460156},
isbn={978-989-758-488-6},
issn={2184-4321},
}

TY - CONF

JO - Proceedings of the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2021) - Volume 5: VISAPP
TI - Few-shot Linguistic Grounding of Visual Attributes and Relations using Gaussian Kernels
SN - 978-989-758-488-6
IS - 2184-4321
AU - Koudouna, D.
AU - Terzić, K.
PY - 2021
SP - 146
EP - 156
DO - 10.5220/0010261301460156
PB - SciTePress