Evaluating Cross-lingual Semantic Annotation for Medical Forms

Ying-Chi Lin, Victor Christen, Anika Groß, Toralf Kirsten, Silvio Domingos Cardoso, Cédric Pruski, Marcos Da Silveira, Erhard Rahm

2020

Abstract

Annotating documents or datasets with concepts from biomedical ontologies has become increasingly important. Such ontology-based semantic annotations can improve interoperability and the quality of data integration in health care practice and biomedical research. Annotating non-English documents is particularly challenging, because non-English ontologies have limited coverage and annotators of a quality comparable to those for English are lacking. In this paper, we aim to annotate medical forms written in German. We present a parallel corpus in which every medical form is available in both German and English. We use three annotators to automatically generate annotations, which are then manually verified to construct an English Silver Standard Corpus (SSC). Based on the parallel corpus and the SSC, we evaluate the quality of different annotation approaches, mainly 1) direct annotation of the German corpus with German ontologies and 2) machine translation of the German corpus followed by annotation of the translated corpus with English ontologies. The results show that using German ontologies alone produces very limited results, whereas translation achieves better annotation quality and retains almost 70% of the annotations.
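
As an illustration of how annotations produced by one approach might be compared against a silver standard, the following minimal Python sketch computes precision, recall, and F-measure over sets of annotations. The abstract does not state which metrics the authors use, so the choice of metrics is an assumption; the function name `evaluate`, the representation of an annotation as an (item id, concept id) pair, and all identifiers in the toy data are hypothetical placeholders, not values from the paper.

```python
from typing import Dict, Set, Tuple

# Hypothetical representation: (form item id, ontology concept id).
Annotation = Tuple[str, str]


def evaluate(predicted: Set[Annotation], reference: Set[Annotation]) -> Dict[str, float]:
    """Compare predicted annotations with a reference (silver standard) set."""
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Toy data with made-up identifiers (not real ontology concepts).
silver_standard = {("q1", "C001"), ("q1", "C002"), ("q2", "C003"), ("q3", "C004")}
translated_then_annotated = {("q1", "C001"), ("q2", "C003"), ("q3", "C004"), ("q3", "C999")}

scores = evaluate(translated_then_annotated, silver_standard)
print(scores)
```

In this toy case three of the four reference annotations are recovered, so recall is 0.75. Recall against the reference set is one plausible way to read a statement such as "retains almost 70% of the annotations", though the paper itself should be consulted for the exact definition.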


Paper Citation


in Harvard Style

Lin Y., Christen V., Groß A., Kirsten T., Cardoso S., Pruski C., Da Silveira M. and Rahm E. (2020). Evaluating Cross-lingual Semantic Annotation for Medical Forms. In Proceedings of the 13th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2020) - Volume 5: HEALTHINF; ISBN 978-989-758-398-8, SciTePress, pages 145-155. DOI: 10.5220/0008979901450155


in Bibtex Style

@conference{healthinf20,
author={Ying-Chi Lin and Victor Christen and Anika Groß and Toralf Kirsten and Silvio Domingos Cardoso and Cédric Pruski and Marcos Da Silveira and Erhard Rahm},
title={Evaluating Cross-lingual Semantic Annotation for Medical Forms},
booktitle={Proceedings of the 13th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2020) - Volume 5: HEALTHINF},
year={2020},
pages={145-155},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0008979901450155},
isbn={978-989-758-398-8},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 13th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2020) - Volume 5: HEALTHINF
TI - Evaluating Cross-lingual Semantic Annotation for Medical Forms
SN - 978-989-758-398-8
AU - Lin Y.
AU - Christen V.
AU - Groß A.
AU - Kirsten T.
AU - Cardoso S.
AU - Pruski C.
AU - Da Silveira M.
AU - Rahm E.
PY - 2020
SP - 145
EP - 155
DO - 10.5220/0008979901450155
PB - SciTePress