Semantic Knowledge Base Construction from Radiology Reports

Eriksson Monteiro, Pedro Sernadela, Sérgio Matos, Carlos Costa, José Luís Oliveira


The tremendous quantity of data stored daily in healthcare institutions demands the development of new methods to summarize and reuse available information in clinical practice. To leverage modern healthcare information systems, new strategies must be developed that address challenges such as the extraction of relevant information, data redundancy, and the lack of associations within the data. This article proposes a pipeline to overcome these challenges in the context of medical imaging reports by automatically extracting and linking information and summarizing natural language reports into an ontology model. Using data from the Physionet MIMIC II database, we created a semantic knowledge base with more than 6.5 million triples obtained from a collection of 16,000 radiology reports.



Paper Citation

in Harvard Style

Monteiro E., Sernadela P., Matos S., Costa C. and Oliveira J. (2016). Semantic Knowledge Base Construction from Radiology Reports. In Proceedings of the 9th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 5: HEALTHINF, (BIOSTEC 2016) ISBN 978-989-758-170-0, pages 345-352. DOI: 10.5220/0005709503450352

in Bibtex Style

author={Eriksson Monteiro and Pedro Sernadela and Sérgio Matos and Carlos Costa and José Luís Oliveira},
title={Semantic Knowledge Base Construction from Radiology Reports},
booktitle={Proceedings of the 9th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 5: HEALTHINF, (BIOSTEC 2016)},

in EndNote Style

JO - Proceedings of the 9th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 5: HEALTHINF, (BIOSTEC 2016)
TI - Semantic Knowledge Base Construction from Radiology Reports
SN - 978-989-758-170-0
AU - Monteiro E.
AU - Sernadela P.
AU - Matos S.
AU - Costa C.
AU - Oliveira J.
PY - 2016
SP - 345
EP - 352
DO - 10.5220/0005709503450352