
Authors: Margarita Suzdaltseva; Alexandra Shamakhova; Natalia Dobrenko; Olga Alekseeva; Jaafar Hammoud; Natalia Gusarova; Aleksandra Vatian and Anatoly Shalyto

Affiliation: ITMO University, 49 Kronverksky av., St. Petersburg, Russia

Keyword(s): Medical Information, De-identification, Multimodal Datasets, Named Entity Recognition, Electronic Healthcare Record, Rule-based Approach.

Abstract: Electronic patient records are an important source of medical information for forming multimodal datasets to train neural networks. To process data from electronic health records (EHRs) for a specified purpose, a number of requirements must be met, first of all de-identification. This paper discusses the first stage of that process: searching for named entities in medical texts, which should afterwards be replaced or encrypted. The problem is solved using the example of semi-structured EHRs in Russian, a fusional, grammatically complex language. The structure and specifics of the EMC typical for Russia are analyzed in detail. A problem-oriented comparison of approaches to the NER problem is carried out. We developed a pipeline for processing EHRs and experimentally showed the advantages of the rule-based method over specialized libraries. The achieved Recall and Precision values were 0.990 and 0.980, respectively.
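
As a rough illustration of the ideas the abstract names (rule-based search for named entities in semi-structured text, scored by Precision and Recall), a minimal Python sketch follows. It is not the authors' pipeline: the rule set `RULES`, the field labels, the sample record, and the helper names `find_entities` and `precision_recall` are all hypothetical, and the sample text is in English rather than Russian for readability.

```python
import re

# Hypothetical rules, NOT the authors' rule set: each entity label maps to a
# regular expression that exploits a field label or a fixed surface format,
# as one might find in a semi-structured record.
RULES = {
    "NAME": re.compile(r"Patient:\s*([A-Z][a-z]+(?: [A-Z][a-z]+)+)"),
    "DATE": re.compile(r"\b(\d{2}\.\d{2}\.\d{4})\b"),
    "PHONE": re.compile(r"(\+7\d{10})"),
}


def find_entities(text):
    """Return a set of (label, start, end) spans matched by the rules."""
    found = set()
    for label, pattern in RULES.items():
        for match in pattern.finditer(text):
            # Group 1 is the entity itself, without the field label.
            found.add((label, match.start(1), match.end(1)))
    return found


def precision_recall(predicted, gold):
    """Exact-match precision and recall over sets of entity spans."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Invented sample record, used only to exercise the rules above.
record = "Patient: Ivan Petrov\nDate of birth: 01.02.1960\nPhone: +79001234567"
predicted = find_entities(record)
```

Here Precision is the share of predicted spans that appear in the manually annotated (gold) set, and Recall is the share of gold spans that were found. For de-identification, Recall is the more critical of the two, since a missed entity leaves personal data in the text.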

CC BY-NC-ND 4.0

Paper citation in several formats:
Suzdaltseva, M.; Shamakhova, A.; Dobrenko, N.; Alekseeva, O.; Hammoud, J.; Gusarova, N.; Vatian, A. and Shalyto, A. (2021). De-identification of Medical Information for Forming Multimodal Datasets to Train Neural Networks. In Proceedings of the 7th International Conference on Information and Communication Technologies for Ageing Well and e-Health - ICT4AWE; ISBN 978-989-758-506-7; ISSN 2184-4984, SciTePress, pages 163-170. DOI: 10.5220/0010406000002931

@conference{ict4awe21,
author={Margarita Suzdaltseva and Alexandra Shamakhova and Natalia Dobrenko and Olga Alekseeva and Jaafar Hammoud and Natalia Gusarova and Aleksandra Vatian and Anatoly Shalyto},
title={De-identification of Medical Information for Forming Multimodal Datasets to Train Neural Networks},
booktitle={Proceedings of the 7th International Conference on Information and Communication Technologies for Ageing Well and e-Health - ICT4AWE},
year={2021},
pages={163-170},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010406000002931},
isbn={978-989-758-506-7},
issn={2184-4984},
}

TY - CONF
JO - Proceedings of the 7th International Conference on Information and Communication Technologies for Ageing Well and e-Health - ICT4AWE
TI - De-identification of Medical Information for Forming Multimodal Datasets to Train Neural Networks
SN - 978-989-758-506-7
IS - 2184-4984
AU - Suzdaltseva, M.
AU - Shamakhova, A.
AU - Dobrenko, N.
AU - Alekseeva, O.
AU - Hammoud, J.
AU - Gusarova, N.
AU - Vatian, A.
AU - Shalyto, A.
PY - 2021
SP - 163
EP - 170
DO - 10.5220/0010406000002931
PB - SciTePress