LEDAC: Optimizing the Performance of the Automatic Classification of Legal Documents through the Use of Word Embeddings

Víctor Labrador, Álvaro Peiró, Ángel Garrido, Eduardo Mena

Abstract

Nowadays, the number of legal documents processed daily is too large for the work to be done manually. One of the most relevant processes is the classification of this kind of document, not only because of the importance of the task itself, but also because it is the starting point for other important tasks such as data search or information extraction. In spite of technological advances, the classification task is still performed by specialized staff, which is expensive, time-consuming, and subject to human error. In the best case, it is possible to find systems based on statistical approaches whose benefits in terms of efficacy and efficiency are limited. Moreover, the presence of overlapping elements in legal documents, such as stamps or signatures, distorts the text and hinders these automatic tasks. In this work, we present an approach to the automatic classification of these legal documents that exploits the semantic properties of word embeddings. We have implemented our approach so that it can be adapted to different types of documents with little effort. Experiments with real data show promising results, with a large increase in productivity over systems based on other approaches.

Paper Citation


in Harvard Style

Labrador V., Peiró Á., Garrido Á. and Mena E. (2020). LEDAC: Optimizing the Performance of the Automatic Classification of Legal Documents through the Use of Word Embeddings. In Proceedings of the 22nd International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 978-989-758-423-7, pages 181-188. DOI: 10.5220/0009421001810188


in Bibtex Style

@conference{iceis20,
author={Víctor Labrador and Álvaro Peiró and Ángel Garrido and Eduardo Mena},
title={LEDAC: Optimizing the Performance of the Automatic Classification of Legal Documents through the Use of Word Embeddings},
booktitle={Proceedings of the 22nd International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2020},
pages={181-188},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0009421001810188},
isbn={978-989-758-423-7},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 22nd International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - LEDAC: Optimizing the Performance of the Automatic Classification of Legal Documents through the Use of Word Embeddings
SN - 978-989-758-423-7
AU - Labrador V.
AU - Peiró Á.
AU - Garrido Á.
AU - Mena E.
PY - 2020
SP - 181
EP - 188
DO - 10.5220/0009421001810188