Table Interpretation and Extraction of Semantic Relationships to Synthesize Digital Documents

Martha O. Perez-Arriaga, Trilce Estrada, Soraya Abad-Mota

Abstract

The large number of scientific publications produced today prevents researchers from analyzing them rapidly. Automated analysis methods are needed to locate relevant facts in a large volume of information. Though publishers establish standards for scientific documents, the variety of topics, layouts, and writing styles impedes the prompt analysis of publications. A single standard across scientific fields is infeasible, but common elements, namely tables and text, exist by which to analyze publications from any domain. Tables offer an additional dimension describing direct or quantitative relationships among concepts. However, extracting table information and unambiguously linking it to its corresponding text to form accurate semantic relationships are non-trivial tasks. We present a comprehensive framework to conceptually represent a document by extracting its semantic relationships and context. Given a document, our framework uses its text, together with the content and structure of its tables, to identify relevant concepts and relationships. Additionally, we use the Web and ontologies to perform disambiguation, establish a context, annotate relationships, and preserve provenance. Finally, our framework provides an augmented synthesis for each document in a domain-independent format. Our results show that by using information from tables we are able to increase the number of highly ranked semantic relationships by an order of magnitude.

Paper Citation


in Harvard Style

Perez-Arriaga M., Estrada T. and Abad-Mota S. (2017). Table Interpretation and Extraction of Semantic Relationships to Synthesize Digital Documents. In Proceedings of the 6th International Conference on Data Science, Technology and Applications - Volume 1: DATA, ISBN 978-989-758-255-4, pages 223-232. DOI: 10.5220/0006436902230232


in Bibtex Style

@conference{data17,
author={Martha O. Perez-Arriaga and Trilce Estrada and Soraya Abad-Mota},
title={Table Interpretation and Extraction of Semantic Relationships to Synthesize Digital Documents},
booktitle={Proceedings of the 6th International Conference on Data Science, Technology and Applications - Volume 1: DATA},
year={2017},
pages={223-232},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006436902230232},
isbn={978-989-758-255-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 6th International Conference on Data Science, Technology and Applications - Volume 1: DATA
TI - Table Interpretation and Extraction of Semantic Relationships to Synthesize Digital Documents
SN - 978-989-758-255-4
AU - Perez-Arriaga M.
AU - Estrada T.
AU - Abad-Mota S.
PY - 2017
SP - 223
EP - 232
DO - 10.5220/0006436902230232