Authors: Ermelinda Oro, Francesco Riccetti and Massimo Ruffolo

Affiliation: University of Calabria, Italy

Keyword(s): Information extraction, Web wrapping, PDF wrapping, Spatial reasoning, Grammars, Chart parsing.

Related Ontology Subjects/Areas/Topics: Agents ; Artificial Intelligence ; Biomedical Engineering ; Biomedical Signal Processing ; Data Manipulation ; Health Engineering and Technology Applications ; Human-Computer Interaction ; Methodologies and Methods ; Neurocomputing ; Neurotechnology, Electronics and Informatics ; Pattern Recognition ; Physiological Computing Systems ; Sensor Networks ; Soft Computing ; Vision and Perception ; Web Information Systems and Technologies ; Web Intelligence

Abstract: In recent years, the relevance of accessing and acquiring information made available by Web (HTML) pages and business (PDF) documents has grown considerably. In this paper we present a textual query language, named ViQueL, whose main feature is to identify and extract relevant information from HTML and PDF documents on the basis of their visual appearance by using easy-to-write queries. The proposed language is founded on spatial grammars, i.e., context-free grammars extended with spatial constructs. Despite its considerable expressive power, the combined complexity of ViQueL is in PTIME. Moreover, experiments show that ViQueL is reasonably efficient for real-life extraction tasks.

CC BY-NC-ND 4.0

Paper citation in several formats:
Oro, E.; Riccetti, F. and Ruffolo, M. (2011). A SPATIAL QUERY LANGUAGE FOR PRESENTATION-ORIENTED DOCUMENTS. In Proceedings of the 3rd International Conference on Agents and Artificial Intelligence - Volume 2: ICAART; ISBN 978-989-8425-40-9; ISSN 2184-433X, SciTePress, pages 306-312. DOI: 10.5220/0003177603060312

@conference{icaart11,
author={Ermelinda Oro and Francesco Riccetti and Massimo Ruffolo},
title={A SPATIAL QUERY LANGUAGE FOR PRESENTATION-ORIENTED DOCUMENTS},
booktitle={Proceedings of the 3rd International Conference on Agents and Artificial Intelligence - Volume 2: ICAART},
year={2011},
pages={306-312},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003177603060312},
isbn={978-989-8425-40-9},
issn={2184-433X},
}

TY - CONF
JO - Proceedings of the 3rd International Conference on Agents and Artificial Intelligence - Volume 2: ICAART
TI - A SPATIAL QUERY LANGUAGE FOR PRESENTATION-ORIENTED DOCUMENTS
SN - 978-989-8425-40-9
IS - 2184-433X
AU - Oro, E.
AU - Riccetti, F.
AU - Ruffolo, M.
PY - 2011
SP - 306
EP - 312
DO - 10.5220/0003177603060312
PB - SciTePress