
dational component in any pipeline that aims to leverage
artificial intelligence for clinical document analysis.
Without reliable text extraction, the potential of
AI to derive insights from vast volumes of archived
handwritten data remains largely untapped. Future
work should explore the integration of domain-trained
models, active learning strategies, and multimodal
document representations to improve recognition
accuracy and usability in medical and archival settings.
Paper-Based Health Records: A Case Study on the Digitization of Handwritten Clinical Records