Paper

Authors: Oliver Strauß and Holger Kett

Affiliation: Fraunhofer Institute for Industrial Engineering IAO, Nobelstraße 12, 70569 Stuttgart, Germany

Keyword(s): Dataset Search, Agent-Based Retrieval, Semantic Search.

Abstract: Finding good representations for documents in the context of semantic search is a relevant problem with applications in domains like medicine, research or data search. In this paper, we propose to represent each document in a search index by a number of different contextual embeddings. We define and evaluate eight different strategies to combine embeddings of the document title, document passages and relevant user queries by means of linear combinations, averaging, and clustering. In addition, we apply an agent-based approach to search whereby each data item is modeled as an agent that tries to optimize its metadata and presentation over time by incorporating information received via the users’ interactions with the search system. We validate the document representation strategies and the agent-based approach in the context of a medical information retrieval dataset and find that a linear combination of the title embedding, the mean passage embedding and the mean over the clustered embeddings of relevant queries offers the best trade-off between search performance and index size. We further find that incorporating embeddings of relevant user queries can significantly improve the performance of representation strategies based on semantic embeddings. The agent-based system performs slightly better than the other representation strategies but comes with a larger index size.
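The best-performing strategy named in the abstract (a linear combination of the title embedding, the mean passage embedding and the mean over clustered query embeddings) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the weights `(0.4, 0.3, 0.3)`, the cluster count `k=2`, the tiny k-means routine seeded with the first `k` vectors, the reading of "mean over the clustered embeddings" as the mean of the cluster centroids, and the final L2 normalization are all assumptions made for illustration. The toy 2-dimensional vectors stand in for real contextual embeddings.

```python
import math
from typing import List, Sequence

Vector = List[float]


def mean_vector(vectors: Sequence[Sequence[float]]) -> Vector:
    """Component-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def l2_normalize(v: Sequence[float]) -> Vector:
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_centroids(vectors: Sequence[Sequence[float]], k: int,
                     iterations: int = 10) -> List[Vector]:
    """Tiny k-means; centroids are seeded with the first k vectors (assumption)."""
    k = min(k, len(vectors))
    centroids = [list(v) for v in vectors[:k]]
    for _ in range(iterations):
        clusters: List[List[Sequence[float]]] = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda i: sq_dist(v, centroids[i]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous centroid.
        centroids = [mean_vector(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids


def combine(title: Sequence[float],
            passages: Sequence[Sequence[float]],
            queries: Sequence[Sequence[float]],
            weights: Sequence[float] = (0.4, 0.3, 0.3),  # hypothetical weights
            k: int = 2) -> Vector:
    """Weighted sum of title, mean passage and mean query-centroid embeddings."""
    w_t, w_p, w_q = weights
    passage_mean = mean_vector(passages)
    query_mean = mean_vector(kmeans_centroids(queries, k))
    combined = [w_t * t + w_p * p + w_q * q
                for t, p, q in zip(title, passage_mean, query_mean)]
    return l2_normalize(combined)


# Toy embeddings; real ones would come from a contextual embedding model.
title = [1.0, 0.0]
passages = [[1.0, 0.0], [0.0, 1.0]]
queries = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
doc_vec = combine(title, passages, queries)
print(doc_vec)
```

Under these assumptions each document contributes a single vector to the index, which is in line with the trade-off between search performance and index size that the abstract describes. Storing several vectors per document, as the agent-based variant appears to do, would enlarge the index.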

CC BY-NC-ND 4.0

Paper citation in several formats:
Strauß, O. and Kett, H. (2023). Documents as Intelligent Agents: An Approach to Optimize Document Representations in Semantic Search. In Proceedings of the 19th International Conference on Web Information Systems and Technologies - WEBIST; ISBN 978-989-758-672-9; ISSN 2184-3252, SciTePress, pages 164-175. DOI: 10.5220/0012239200003584

@conference{webist23,
author={Oliver Strauß and Holger Kett},
title={Documents as Intelligent Agents: An Approach to Optimize Document Representations in Semantic Search},
booktitle={Proceedings of the 19th International Conference on Web Information Systems and Technologies - WEBIST},
year={2023},
pages={164-175},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012239200003584},
isbn={978-989-758-672-9},
issn={2184-3252},
}

TY - CONF
JO - Proceedings of the 19th International Conference on Web Information Systems and Technologies - WEBIST
TI - Documents as Intelligent Agents: An Approach to Optimize Document Representations in Semantic Search
SN - 978-989-758-672-9
IS - 2184-3252
AU - Strauß, O.
AU - Kett, H.
PY - 2023
SP - 164
EP - 175
DO - 10.5220/0012239200003584
PB - SciTePress