Using Deep Learning-based Object Detection to Extract Structure Information from Scanned Documents

Alice Nannini; Federico A. Galatolo; Mario G. C. A. Cimino; Gigliola Vaglini

Topics: Deep Learning; Neural Network Software and Applications

In Proceedings of the 24th International Conference on Enterprise Information Systems - Volume 1: ICEIS, 610-615, 2022

Authors: Alice Nannini; Federico A. Galatolo; Mario G. C. A. Cimino; Gigliola Vaglini

Affiliation: Department of Information Engineering, University of Pisa, Largo L. Lazzarino 1, Pisa, Italy

Keyword(s): Deep Learning, Computer Vision, Object Detection, Region Proposal, Document Layout Analysis, Information Extraction, Transfer Learning.

Abstract: The computer vision and object detection techniques developed in recent years are dominating the state of the art and are increasingly applied to document layout analysis. In this research work, an automatic method to extract meaningful information from scanned documents is proposed. The method is based on the most recent object detection techniques. Specifically, state-of-the-art deep learning techniques designed to work on images are adapted to the domain of digital documents. This research focuses on play scripts, a document type that has not been considered in the literature. For this reason, a novel dataset has been annotated, selecting the most common and useful formats from hundreds of available scripts. The main contribution of this paper is to provide a general understanding and a performance study of different implementations of object detectors applied to this domain. Deep neural networks such as Faster R-CNN and YOLO have been fine-tuned to identify text sections of interest via bounding boxes and to classify each of them into a specific pre-defined category. Several experiments have been carried out, applying different combinations of data augmentation techniques.
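
As an illustration of what the abstract means by identifying text sections via bounding boxes and classifying them, the following minimal sketch represents a detected region and compares it with an annotated one using intersection over union (IoU), a measure commonly used when studying object detector performance. It is not the authors' code: the `Detection` type, the `is_correct` helper, the 0.5 threshold, and the category name `dialogue` are assumptions made for this example, since the abstract does not list the paper's categories or evaluation settings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One predicted or annotated region: pixel box plus a category label."""
    x1: float
    y1: float
    x2: float
    y2: float
    label: str          # hypothetical category name, not taken from the paper
    score: float = 1.0  # detector confidence; 1.0 for ground-truth annotations


def iou(a: Detection, b: Detection) -> float:
    """Intersection over union of two axis-aligned boxes, in [0, 1]."""
    inter_w = max(0.0, min(a.x2, b.x2) - max(a.x1, b.x1))
    inter_h = max(0.0, min(a.y2, b.y2) - max(a.y1, b.y1))
    inter = inter_w * inter_h
    union = (a.x2 - a.x1) * (a.y2 - a.y1) + (b.x2 - b.x1) * (b.y2 - b.y1) - inter
    return inter / union if union > 0 else 0.0


def is_correct(pred: Detection, truth: Detection, threshold: float = 0.5) -> bool:
    """Assumed rule: labels must match and IoU must reach the threshold."""
    return pred.label == truth.label and iou(pred, truth) >= threshold


# Hypothetical annotated region and a slightly offset prediction for it.
truth = Detection(100, 200, 500, 260, label="dialogue")
pred = Detection(110, 205, 500, 260, label="dialogue", score=0.91)
print(iou(pred, truth), is_correct(pred, truth))
```

A detector such as Faster R-CNN or YOLO would produce many such regions per page; under these assumptions, each would be compared against the annotations in this way.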

CC BY-NC-ND 4.0

Paper citation in several formats:

Nannini, A., Galatolo, F. A., Cimino, M. G. C. A. and Vaglini, G. (2022). Using Deep Learning-based Object Detection to Extract Structure Information from Scanned Documents. In Proceedings of the 24th International Conference on Enterprise Information Systems - Volume 1: ICEIS; ISBN 978-989-758-569-2; ISSN 2184-4992, SciTePress, pages 610-615. DOI: 10.5220/0011090600003179

@conference{iceis22,
author={Alice Nannini and Federico A. Galatolo and Mario G. C. A. Cimino and Gigliola Vaglini},
title={Using Deep Learning-based Object Detection to Extract Structure Information from Scanned Documents},
booktitle={Proceedings of the 24th International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2022},
pages={610-615},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011090600003179},
isbn={978-989-758-569-2},
issn={2184-4992},
}

TY - CONF

JO - Proceedings of the 24th International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - Using Deep Learning-based Object Detection to Extract Structure Information from Scanned Documents
SN - 978-989-758-569-2
IS - 2184-4992
AU - Nannini, A.
AU - Galatolo, F.
AU - Cimino, M.
AU - Vaglini, G.
PY - 2022
SP - 610
EP - 615
DO - 10.5220/0011090600003179
PB - SciTePress