Inferring Interpretable Semantic Cognitive Maps from Noisy Document Corpora

Yahya Emara, Tristan Weger, Ryan Rubadue, Rishabh Choudhary, Simona Doboli, Ali Minai

2024

Abstract

With the emergence of deep learning-based semantic embedding models, it has become possible to extract large-scale semantic spaces from text corpora. Semantic elements such as words, sentences, and documents can be represented as embedding vectors in these spaces, allowing their use in many applications. However, these semantic spaces are very high-dimensional, and the embedding vectors are hard for humans to interpret. In this paper, we demonstrate a method for obtaining more meaningful, lower-dimensional semantic spaces, or cognitive maps, through semantic clustering of the high-dimensional embedding vectors obtained from a real-world corpus. A key limitation of this approach is the presence of semantic noise in real-world document corpora. We show that pre-filtering the documents for semantic relevance can alleviate this problem and lead to highly interpretable cognitive maps.
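
The two steps named in the abstract, pre-filtering for semantic relevance and clustering the remaining embedding vectors, can be illustrated with a minimal sketch. This is not the authors' implementation: the cosine-similarity threshold, the use of plain k-means, the function names (`filter_relevant`, `kmeans`), and the three-dimensional toy vectors are all assumptions made for illustration. Real embeddings would come from an embedding model and have hundreds of dimensions.

```python
import math
import random

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def filter_relevant(doc_vectors, topic_vector, threshold=0.5):
    """Indices of documents whose similarity to the topic vector meets the threshold.

    The threshold value is a hypothetical choice, not one taken from the paper.
    """
    return [i for i, d in enumerate(doc_vectors) if cosine(d, topic_vector) >= threshold]

def assign(points, centroids):
    """Label each point with the index of its nearest centroid (Euclidean distance)."""
    return [
        min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
        for p in points
    ]

def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(n_iter):
        labels = assign(points, centroids)
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old centroid if a cluster ends up empty.
            updated.append(
                [sum(c) / len(members) for c in zip(*members)] if members else centroids[j]
            )
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)

# Toy "document embeddings": two on-topic groups plus one off-topic (noisy) document.
docs = [
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.0],
    [0.1, 1.0, 0.0],
    [0.2, 0.9, 0.0],
    [0.0, 0.0, 1.0],  # semantic noise: orthogonal to the topic vector
]
topic = [1.0, 1.0, 0.0]  # hypothetical vector describing the topic of interest

keep = filter_relevant(docs, topic)
centroids, labels = kmeans([docs[i] for i in keep], k=2)
```

In this toy setting the off-topic document is dropped before clustering, so the two cluster centroids summarize only the relevant documents. The centroids then play the role of the lower-dimensional, more interpretable axes that the abstract calls a cognitive map.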


Paper Citation


in Harvard Style

Emara Y., Weger T., Rubadue R., Choudhary R., Doboli S. and Minai A. (2024). Inferring Interpretable Semantic Cognitive Maps from Noisy Document Corpora. In Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART; ISBN 978-989-758-680-4, SciTePress, pages 742-749. DOI: 10.5220/0012389100003636


in Bibtex Style

@conference{icaart24,
author={Yahya Emara and Tristan Weger and Ryan Rubadue and Rishabh Choudhary and Simona Doboli and Ali Minai},
title={Inferring Interpretable Semantic Cognitive Maps from Noisy Document Corpora},
booktitle={Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART},
year={2024},
pages={742-749},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012389100003636},
isbn={978-989-758-680-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART
TI - Inferring Interpretable Semantic Cognitive Maps from Noisy Document Corpora
SN - 978-989-758-680-4
AU - Emara Y.
AU - Weger T.
AU - Rubadue R.
AU - Choudhary R.
AU - Doboli S.
AU - Minai A.
PY - 2024
SP - 742
EP - 749
DO - 10.5220/0012389100003636
PB - SciTePress