A Methodology to Reduce the Complexity of Validation Model Creation from Medical Specification Document

Francesco Gargiulo, Stefano Silvestri, Mariarosaria Fontanella, Mario Ciampi

Abstract

In this paper we propose a novel approach to reduce the complexity of defining and implementing a validation model for medical documents. The conformance requirements of a specification are usually contained in documents written in natural language, and they must be manually translated into a software model for validation purposes. It would therefore be very useful to extract and group the conformance rules that follow a similar pattern, reducing the manual effort this task requires. We present an innovative clustering approach that automatically determines the optimal number of groups through an iterative method based on the evaluation of internal cluster measures. We apply the method to two case studies taken from the Italian specification of the conformance rules: i) the Patient Summary (Profilo Sanitario Sintetico) and ii) the Hospital Discharge Letter (Lettera di Dimissione Ospedaliera).
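The idea of choosing the number of groups by iterating over candidate values and scoring each partition with an internal measure can be illustrated with a minimal sketch. It is not the authors' implementation: the choice of plain k-means, the farthest-first initialization, the mean silhouette as the internal measure, and all names (`kmeans`, `mean_silhouette`, `select_k`, `POINTS`) are assumptions made for illustration. The toy 2-D points stand in for whatever feature vectors would represent the conformance rules.

```python
import math

def farthest_first(points, k):
    # Deterministic initialization (an assumption of this sketch):
    # start from the first point, then repeatedly add the point that is
    # farthest from all centers chosen so far.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers

def kmeans(points, k, max_iter=100):
    # Plain Lloyd iterations; returns one cluster label per point.
    centers = farthest_first(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels

def mean_silhouette(points, labels):
    # Internal measure: average silhouette width over all points.
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # a singleton contributes a silhouette of 0
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

def select_k(points, k_min=2, k_max=6):
    # Iterate over candidate k and keep the best-scoring partition.
    best_k, best_score = None, -1.0
    for k in range(k_min, k_max + 1):
        score = mean_silhouette(points, kmeans(points, k))
        if score > best_score:
            best_k, best_score = k, score
    return best_k, best_score

# Hypothetical toy data: three tight, well-separated groups.
POINTS = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 0), (10, 1), (11, 0), (11, 1),
          (5, 9), (5, 10), (6, 9), (6, 10)]

if __name__ == "__main__":
    k, score = select_k(POINTS)
    print(f"selected k = {k}, mean silhouette = {score:.3f}")
```

Other internal measures could be substituted for the silhouette inside `select_k` without changing the loop structure, which is the part this sketch is meant to show.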

Paper Citation


in Harvard Style

Gargiulo F., Silvestri S., Fontanella M. and Ciampi M. (2017). A Methodology to Reduce the Complexity of Validation Model Creation from Medical Specification Document. In Proceedings of the 10th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 5: SmartMedDev, (BIOSTEC 2017) ISBN 978-989-758-213-4, pages 497-507. DOI: 10.5220/0006291404970507


in Bibtex Style

@conference{smartmeddev17,
author={Francesco Gargiulo and Stefano Silvestri and Mariarosaria Fontanella and Mario Ciampi},
title={A Methodology to Reduce the Complexity of Validation Model Creation from Medical Specification Document},
booktitle={Proceedings of the 10th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 5: SmartMedDev, (BIOSTEC 2017)},
year={2017},
pages={497-507},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006291404970507},
isbn={978-989-758-213-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 10th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 5: SmartMedDev, (BIOSTEC 2017)
TI - A Methodology to Reduce the Complexity of Validation Model Creation from Medical Specification Document
SN - 978-989-758-213-4
AU - Gargiulo F.
AU - Silvestri S.
AU - Fontanella M.
AU - Ciampi M.
PY - 2017
SP - 497
EP - 507
DO - 10.5220/0006291404970507