Authors: Fhabiana T. Machado ; Deise Saccol ; Eduardo Piveta ; Renata Padilha and Ezequiel Ribeiro

Affiliation: Federal University of Santa Maria, Santa Maria, Brazil

Keyword(s): JSON, Schema Extraction, Information Integration.

Abstract: NoSQL (Not Only SQL) document-oriented databases stand out because of the need for scalability. This storage model promises flexibility in documents, using files and data sources in JSON (JavaScript Object Notation) format. It also allows documents within the same collection to have different fields. Such differences occur in database integration scenarios. Accessing different data sources in a unified way can be troublesome for the user, as their structures are not standardized. In this sense, this work presents a process for conceptual schema extraction in JSON datasets. Our proposal analyzes fields that represent the same information but are written differently. In the context of this work, differences in writing concern synonyms and character-level variations. To perform this analysis, techniques such as character-based and knowledge-based similarity functions, as well as stemming, are used. We therefore specify a process to extract the implicit schema present in these data sources, applying different textual equivalence techniques to field names. We applied the process in an experiment in the scientific publications domain, correctly identifying 80% of the equivalent terms. The process outputs a unified conceptual schema and the respective mappings for the equivalent terms, contributing to the solution of the schema integration problem.
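To make the idea in the abstract concrete, the following is a minimal sketch of field-name matching across JSON documents. It is not the authors' implementation: the synonym groups, the crude plural stripping that stands in for stemming, the 0.8 threshold, and all function names are assumptions made for illustration. The character-based similarity uses `difflib.SequenceMatcher` from the Python standard library, which may differ from the similarity functions used in the paper.

```python
import re
from difflib import SequenceMatcher

# Hypothetical synonym groups standing in for a knowledge-based
# similarity function (e.g. a thesaurus lookup); not taken from the paper.
SYNONYM_GROUPS = [
    frozenset({"author", "writer", "creator"}),
]


def normalize(name: str) -> str:
    """Lowercase, drop non-alphanumeric characters, strip a trailing 's'.

    The plural stripping is a crude stand-in for a real stemmer.
    """
    cleaned = re.sub(r"[^a-z0-9]", "", name.lower())
    if len(cleaned) > 3 and cleaned.endswith("s"):
        cleaned = cleaned[:-1]
    return cleaned


def char_similarity(a: str, b: str) -> float:
    """Character-based similarity in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


def are_synonyms(a: str, b: str) -> bool:
    """True if both normalized names belong to one synonym group."""
    return any({a, b} <= group for group in SYNONYM_GROUPS)


def equivalent(a: str, b: str, threshold: float = 0.8) -> bool:
    """Decide whether two field names denote the same information."""
    na, nb = normalize(a), normalize(b)
    return (
        na == nb
        or are_synonyms(na, nb)
        or char_similarity(na, nb) >= threshold
    )


def unify_fields(documents, threshold: float = 0.8):
    """Return (canonical field list, mapping original -> canonical)."""
    canonical = []
    mapping = {}
    for doc in documents:
        for field in doc:
            if field in mapping:
                continue
            # Reuse the first canonical field judged equivalent, if any.
            match = next(
                (c for c in canonical if equivalent(field, c, threshold)),
                None,
            )
            if match is None:
                canonical.append(field)
                match = field
            mapping[field] = match
    return canonical, mapping


docs = [
    {"title": "Paper A", "authors": ["X"], "publisher": "P"},
    {"Title": "Paper B", "writer": "Y", "publsher": "P"},
]
schema, field_map = unify_fields(docs)
print(schema)
print(field_map)
```

In this sketch the first field name seen becomes the canonical one, so the result depends on document order. A fuller process would likely compare all candidate pairs and choose a representative name per group.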

CC BY-NC-ND 4.0


Paper citation in several formats:
Machado, F.; Saccol, D.; Piveta, E.; Padilha, R. and Ribeiro, E. (2021). A Text Similarity-based Process for Extracting JSON Conceptual Schemas. In Proceedings of the 23rd International Conference on Enterprise Information Systems - Volume 1: ICEIS; ISBN 978-989-758-509-8; ISSN 2184-4992, SciTePress, pages 264-271. DOI: 10.5220/0010475102640271

@conference{iceis21,
author={Fhabiana T. Machado and Deise Saccol and Eduardo Piveta and Renata Padilha and Ezequiel Ribeiro},
title={A Text Similarity-based Process for Extracting JSON Conceptual Schemas},
booktitle={Proceedings of the 23rd International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2021},
pages={264-271},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010475102640271},
isbn={978-989-758-509-8},
issn={2184-4992},
}

TY - CONF

JO - Proceedings of the 23rd International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - A Text Similarity-based Process for Extracting JSON Conceptual Schemas
SN - 978-989-758-509-8
IS - 2184-4992
AU - Machado, F.
AU - Saccol, D.
AU - Piveta, E.
AU - Padilha, R.
AU - Ribeiro, E.
PY - 2021
SP - 264
EP - 271
DO - 10.5220/0010475102640271
PB - SciTePress