Authors: Laura Pandolfo and Luca Pulina

Affiliation: Intelligent System DEsign and Applications (IDEA) Lab, University of Sassari, via Muroni 23A, 07100 Sassari, Italy

Keyword(s): Semantic Web, Dataset, Benchmark, Ontology, Information Extraction.

Abstract: The amount of data available on the Web has grown significantly in recent years, thus increasing the need for efficient techniques that can retrieve information from data in order to discover valuable and relevant knowledge. In the last decade, the intersection of the Information Extraction and Semantic Web areas has provided new opportunities for improving ontology-based information extraction tools. However, one of the critical aspects in the development and evaluation of this type of system is the limited availability of annotated documents, especially in domains such as the historical one. In this paper we present the current state of our work in building a large, real-world RDF dataset intended to support the development of ontology-based extraction tools. The presented dataset is the result of the efforts made within the ARKIVO project and contains about 300 thousand triples, which are the outcome of a manual annotation process carried out by domain experts. The ARKIVO dataset is freely available and can be used as a benchmark for the evaluation of systems that automatically annotate and extract entities from documents.

CC BY-NC-ND 4.0

Paper citation in several formats:
Pandolfo, L. and Pulina, L. (2021). ARKIVO Dataset: A Benchmark for Ontology-based Extraction Tools. In Proceedings of the 17th International Conference on Web Information Systems and Technologies - WEBIST; ISBN 978-989-758-536-4; ISSN 2184-3252, SciTePress, pages 341-345. DOI: 10.5220/0010677000003058

@conference{webist21,
author={Laura Pandolfo and Luca Pulina},
title={ARKIVO Dataset: A Benchmark for Ontology-based Extraction Tools},
booktitle={Proceedings of the 17th International Conference on Web Information Systems and Technologies - WEBIST},
year={2021},
pages={341-345},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010677000003058},
isbn={978-989-758-536-4},
issn={2184-3252},
}

TY - CONF
JO - Proceedings of the 17th International Conference on Web Information Systems and Technologies - WEBIST
TI - ARKIVO Dataset: A Benchmark for Ontology-based Extraction Tools
SN - 978-989-758-536-4
IS - 2184-3252
AU - Pandolfo, L.
AU - Pulina, L.
PY - 2021
SP - 341
EP - 345
DO - 10.5220/0010677000003058
PB - SciTePress