Sentence Boundary Detection in German Legal Documents

Ingo Glaser, Sebastian Moser, Florian Matthes

Abstract

Sentence boundary detection on German legal texts is a task that standard NLP systems handle poorly or not at all, since they are sometimes overburdened by complex structures such as lists, paragraph structures, and citations. In this paper, we evaluate the performance of these systems and adapt methods directly to the legal domain. We created an annotated dataset of over 50,000 sentences drawn from various German legal documents, which can be used for further research within the community. Our neural network and conditional random field models perform significantly better on this data than the existing systems we tested. This paper thus contradicts the assumption that the problem of sentence segmentation is already solved.
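To illustrate why citations can overburden a general-purpose splitter, the following is a minimal sketch and not the method evaluated in the paper. It assumes a naive rule (split after sentence-final punctuation followed by whitespace) and a made-up example text; the function name `naive_split` and the example sentences are hypothetical and do not come from the paper's dataset.

```python
import re


def naive_split(text: str) -> list[str]:
    """Split after '.', '!' or '?' when followed by whitespace (naive rule)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]


# Hypothetical example: two sentences, the first containing a legal citation.
text = "Der Anspruch folgt aus § 433 Abs. 2 BGB. Die Klage ist begründet."

sentences = naive_split(text)
for s in sentences:
    print(repr(s))
```

The naive rule treats the period in the abbreviation "Abs." as a sentence end, so the citation is cut in two and the text is split into three segments instead of two. Errors of this kind are one reason domain-adapted models may be needed for legal text.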

Paper Citation


in Harvard Style

Glaser I., Moser S. and Matthes F. (2021). Sentence Boundary Detection in German Legal Documents. In Proceedings of the 13th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART, ISBN 978-989-758-484-8, pages 812-821. DOI: 10.5220/0010246308120821


in Bibtex Style

@conference{icaart21,
author={Ingo Glaser and Sebastian Moser and Florian Matthes},
title={Sentence Boundary Detection in German Legal Documents},
booktitle={Proceedings of the 13th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART},
year={2021},
pages={812-821},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010246308120821},
isbn={978-989-758-484-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 13th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART
TI - Sentence Boundary Detection in German Legal Documents
SN - 978-989-758-484-8
AU - Glaser I.
AU - Moser S.
AU - Matthes F.
PY - 2021
SP - 812
EP - 821
DO - 10.5220/0010246308120821