Towards a Language for Representing and Managing the Semantics of Big Data

Ermelinda Oro, Massimo Ruffolo, Pietro Gentile, Giuseppe Bartone

2014

Abstract

The amount of data in the world has been exploding. Integrating, managing, and analyzing large amounts of data (i.e., Big Data) will become a key issue for businesses that want to operate and compete effectively in today’s markets. Data are useful only if they are used in a smart way. We introduce the concept of Smart Data: structured and unstructured big data from the web and the enterprise, with explicit and implicit semantics, that leverages context to understand intent in order to better drive business processes and support better-informed decision making. This paper proposes a language that represents Big Data by means of ontologies, and a system that implements an approach capable of satisfying the increasing need for efficiency and scalability in semantic data management. The proposed MANTRA Language allows for: (i) representing the semantics of data through knowledge representation constructs; (ii) acquiring data from disparate heterogeneous sources (e.g., databases, documents); (iii) integrating and managing data; (iv) reasoning over and querying Big Data. The syntax of the proposed language is partially derived from logic programming, but its semantics is completely revised. The novelty of the language is that a class can be thought of as a flexible collection of structurally heterogeneous individuals that have different properties (schema-less). The language also supports efficient querying and reasoning to reveal implicit knowledge. These features are achieved by using a triple-based data persistence model and a scalable NoSQL storage system.
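To make the schema-less class idea and the triple-based persistence model more concrete, here is a minimal Python sketch. It is not MANTRA syntax and not the authors' implementation: the `TripleStore` class, the `instance_of` and `subclass_of` predicates, and all sample data are hypothetical names chosen for illustration. The sketch shows two individuals of the same class carrying different properties, plus one simple inference (class membership through a subclass) as an example of revealing implicit knowledge.

```python
from collections import defaultdict


class TripleStore:
    """Toy in-memory triple store (hypothetical; not the MANTRA system)."""

    def __init__(self):
        self.triples = set()
        self.by_subject = defaultdict(set)  # subject -> triples about it

    def add(self, subject, predicate, obj):
        triple = (subject, predicate, obj)
        self.triples.add(triple)
        self.by_subject[subject].add(triple)

    def match(self, subject=None, predicate=None, obj=None):
        """Return triples matching the pattern; None acts as a wildcard."""
        if subject is not None:
            candidates = self.by_subject.get(subject, set())
        else:
            candidates = self.triples
        return sorted(
            t for t in candidates
            if (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )

    def subclasses(self, cls):
        """Return cls together with all of its transitive subclasses."""
        found, frontier = {cls}, [cls]
        while frontier:
            current = frontier.pop()
            for sub, _, _ in self.match(predicate="subclass_of", obj=current):
                if sub not in found:
                    found.add(sub)
                    frontier.append(sub)
        return found

    def individuals(self, cls):
        """Individuals of cls, including those asserted only in a subclass."""
        classes = self.subclasses(cls)
        return sorted({
            s for s, p, o in self.triples
            if p == "instance_of" and o in classes
        })

    def properties(self, subject):
        """Properties of one individual (assumes one value per predicate)."""
        return {
            p: o for _, p, o in self.by_subject.get(subject, set())
            if p != "instance_of"
        }


# Hypothetical sample data: no schema fixes which properties a Person has.
store = TripleStore()
store.add("Employee", "subclass_of", "Person")
store.add("alice", "instance_of", "Person")
store.add("alice", "email", "alice@example.org")
store.add("bob", "instance_of", "Employee")
store.add("bob", "phone", "555-0100")
store.add("bob", "employer", "acme")

print(store.individuals("Person"))   # bob is included via Employee
print(store.properties("alice"))
print(store.properties("bob"))
```

Because every fact is a separate triple, adding a new property to one individual requires no change to the others, which is the sense in which a class stays schema-less in this sketch. A scalable NoSQL back end would replace the in-memory set and index used here.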



Paper Citation


in Harvard Style

Oro E., Ruffolo M., Gentile P. and Bartone G. (2014). Towards a Language for Representing and Managing the Semantics of Big Data. In Proceedings of the 6th International Conference on Agents and Artificial Intelligence - Volume 1: ICAART, ISBN 978-989-758-015-4, pages 651-656. DOI: 10.5220/0004916906510656


in Bibtex Style

@conference{icaart14,
author={Ermelinda Oro and Massimo Ruffolo and Pietro Gentile and Giuseppe Bartone},
title={Towards a Language for Representing and Managing the Semantics of Big Data},
booktitle={Proceedings of the 6th International Conference on Agents and Artificial Intelligence - Volume 1: ICAART},
year={2014},
pages={651-656},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004916906510656},
isbn={978-989-758-015-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 6th International Conference on Agents and Artificial Intelligence - Volume 1: ICAART
TI - Towards a Language for Representing and Managing the Semantics of Big Data
SN - 978-989-758-015-4
AU - Oro E.
AU - Ruffolo M.
AU - Gentile P.
AU - Bartone G.
PY - 2014
SP - 651
EP - 656
DO - 10.5220/0004916906510656