Entity Resolution in Large Patent Databases: An Optimization Approach

Emiel Caron, Ekaterini Ioannou

2021

Abstract

Entity resolution in databases focuses on detecting and merging entities that refer to the same real-world object. Collective resolution is among the most prominent mechanisms suggested to address this challenge, because its resolution decisions are not made independently but are based on the available relationships within the data. In this paper, we introduce a novel resolution approach that combines the essence of collective resolution with rules and transformations among entity attributes and values. We illustrate how the approach's parameters are optimized with a global optimization algorithm, simulated annealing, and explain how this optimization is performed using a small training set. The quality of the approach is verified through an extensive experimental evaluation with 40M real-world scientific entities from the Patstat database.
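To make the idea of tuning resolution parameters with simulated annealing on a small training set more concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. The matching rule (a weighted sum of two similarity scores compared against a threshold), the F1 objective, the toy training pairs, and all names such as `TRAINING_PAIRS`, `f1_score` and `simulated_annealing` are hypothetical and chosen only for illustration.

```python
import math
import random

# Hypothetical labelled training pairs (an assumption, not data from the paper):
# (name_similarity, address_similarity, refers_to_same_entity)
TRAINING_PAIRS = [
    (0.95, 0.90, True),
    (0.88, 0.40, True),
    (0.70, 0.85, True),
    (0.82, 0.75, True),
    (0.60, 0.20, False),
    (0.30, 0.90, False),
    (0.20, 0.10, False),
    (0.55, 0.50, False),
]


def f1_score(params, pairs):
    """F1 of the toy rule: match if weight*name + (1-weight)*address >= threshold."""
    weight, threshold = params
    tp = fp = fn = 0
    for name_sim, addr_sim, is_match in pairs:
        predicted = weight * name_sim + (1 - weight) * addr_sim >= threshold
        if predicted and is_match:
            tp += 1
        elif predicted and not is_match:
            fp += 1
        elif not predicted and is_match:
            fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def clamp(value):
    """Keep a parameter inside [0, 1]."""
    return min(1.0, max(0.0, value))


def simulated_annealing(pairs, steps=2000, start_temp=1.0, cooling=0.995, seed=0):
    """Search (weight, threshold) that maximizes F1 on the training pairs."""
    rng = random.Random(seed)
    current = (0.5, 0.5)
    current_score = f1_score(current, pairs)
    best, best_score = current, current_score
    temp = start_temp
    for _ in range(steps):
        # Propose a neighbour by perturbing both parameters slightly.
        candidate = (
            clamp(current[0] + rng.gauss(0, 0.1)),
            clamp(current[1] + rng.gauss(0, 0.1)),
        )
        candidate_score = f1_score(candidate, pairs)
        delta = candidate_score - current_score
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature cools.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = candidate, candidate_score
        if current_score > best_score:
            best, best_score = current, current_score
        temp *= cooling  # geometric cooling schedule
    return best, best_score


if __name__ == "__main__":
    params, score = simulated_annealing(TRAINING_PAIRS)
    print("weight=%.2f threshold=%.2f F1=%.2f" % (params[0], params[1], score))
```

In this sketch the occasional acceptance of worse candidates is what lets the search escape local optima early on, while the falling temperature makes it increasingly greedy. A real resolution pipeline would replace the toy rule and objective with its own rules, transformations and quality measure.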


Paper Citation


in Harvard Style

Caron E. and Ioannou E. (2021). Entity Resolution in Large Patent Databases: An Optimization Approach. In Proceedings of the 23rd International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 978-989-758-509-8, pages 148-156. DOI: 10.5220/0010527501480156


in Bibtex Style

@conference{iceis21,
author={Emiel Caron and Ekaterini Ioannou},
title={Entity Resolution in Large Patent Databases: An Optimization Approach},
booktitle={Proceedings of the 23rd International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2021},
pages={148-156},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010527501480156},
isbn={978-989-758-509-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 23rd International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - Entity Resolution in Large Patent Databases: An Optimization Approach
SN - 978-989-758-509-8
AU - Caron E.
AU - Ioannou E.
PY - 2021
SP - 148
EP - 156
DO - 10.5220/0010527501480156