A Pipeline Supporting a Smart Access to Historical Documents based on a Rich Semantic Representation of Their Content: A Case Study on Time Expressions

Alessandro Baldo, Anna Goy, Diego Magro

2018

Abstract

This work is part of two ongoing projects whose main goal is to demonstrate how semantic technologies can support effective access to historical archives. In this paper we present a full pipeline, from raw texts up to the final user interface, aimed at creating and exploiting rich semantic representations of document content. The pipeline is structured in three modules, which handle information extraction, semantic representations, and queries. It lets external applications access, and thus re-use, the output of each module by providing a tagged text, a SPARQL endpoint, and a RESTful web service. In the paper, we describe the details of a proof-of-concept implementation of the pipeline architecture that focuses on time expressions. Moreover, we present an example application that exploits the pipeline to enable users to access historical documents by searching and browsing events and time specifications, thus demonstrating the effectiveness of access to historical texts based on a rich semantic representation of their content.

Paper Citation


in Harvard Style

Baldo A., Goy A. and Magro D. (2018). A Pipeline Supporting a Smart Access to Historical Documents based on a Rich Semantic Representation of Their Content: A Case Study on Time Expressions. In Proceedings of the 14th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST, ISBN 978-989-758-324-7, pages 199-206. DOI: 10.5220/0006929601990206


in Bibtex Style

@conference{webist18,
author={Alessandro Baldo and Anna Goy and Diego Magro},
title={A Pipeline Supporting a Smart Access to Historical Documents based on a Rich Semantic Representation of Their Content: A Case Study on Time Expressions},
booktitle={Proceedings of the 14th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST},
year={2018},
pages={199-206},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006929601990206},
isbn={978-989-758-324-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 14th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST
TI - A Pipeline Supporting a Smart Access to Historical Documents based on a Rich Semantic Representation of Their Content: A Case Study on Time Expressions
SN - 978-989-758-324-7
AU - Baldo A.
AU - Goy A.
AU - Magro D.
PY - 2018
SP - 199
EP - 206
DO - 10.5220/0006929601990206