SalamboMiner - A Biomedical Literature Mining Tool for Inferring the Genetics of Complex Diseases

Leonor Rib, Ricard Gavaldà, Jose Manuel Soria, Alfonso Buil

2011

Abstract

In the Information Era, researchers use the Web to make their knowledge readily available, and the Web has become an important tool for communication within the research community. However, the sheer volume of available information makes it difficult to find what is needed. We present SalamboMiner, a text-mining tool that helps biomedical researchers retrieve the information about the genetics of complex diseases contained in the published biomedical literature. The methodology is based on the idea of co-citation: how often two concepts are cited together indicates the significance of the relationship between them. In addition, co-citation makes it possible to infer new relationships that are not explicitly stated in the literature. Using a Bayesian network, we infer the significant relationships between concepts that are co-cited in two steps.
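
The co-citation idea can be illustrated with a minimal sketch. It is not SalamboMiner's implementation: the abstract does not give the scoring function or the structure of the Bayesian network, so the sketch substitutes a simple lift ratio (observed versus expected co-citation frequency) for the significance measure, and reads "co-cited in two steps" as "linked through a shared intermediate concept", which is an interpretation. The toy corpus, the concept names, and all function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus: each document is the set of concepts it mentions.
docs = [
    {"disease_X", "gene_A"},
    {"disease_X", "gene_A"},
    {"gene_A", "gene_B"},
    {"gene_A", "gene_B"},
    {"gene_C"},
]


def cocitation_counts(docs):
    """Count how many documents mention each concept and each concept pair."""
    single = Counter()
    pair = Counter()
    for concepts in docs:
        single.update(concepts)
        pair.update(frozenset(p) for p in combinations(sorted(concepts), 2))
    return single, pair


def lift(a, b, single, pair, n_docs):
    """Observed co-citation frequency divided by the frequency expected
    if a and b were mentioned independently (a stand-in score, assumed here)."""
    observed = pair[frozenset((a, b))] / n_docs
    expected = (single[a] / n_docs) * (single[b] / n_docs)
    return observed / expected if expected else 0.0


def indirect_candidates(source, pair):
    """Concepts never co-cited with `source` that share an intermediate
    concept with it. Returns {candidate: set of intermediates}."""
    neighbours = {}
    for p in pair:
        a, b = tuple(p)
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    direct = neighbours.get(source, set())
    result = {}
    for mid in direct:
        for target in neighbours.get(mid, set()) - direct - {source}:
            result.setdefault(target, set()).add(mid)
    return result


single, pair = cocitation_counts(docs)
print(lift("disease_X", "gene_A", single, pair, len(docs)))
print(indirect_candidates("disease_X", pair))
```

In this toy corpus, disease_X and gene_B never appear in the same document, yet both are co-cited with gene_A, so gene_B surfaces as an indirect candidate. A real system would replace the lift ratio with a proper probabilistic model and would need concept recognition over actual abstracts.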



Paper Citation


in Harvard Style

Rib L., Gavaldà R., Manuel Soria J. and Buil A. (2011). SalamboMiner - A Biomedical Literature Mining Tool for Inferring the Genetics of Complex Diseases. In Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2011), ISBN 978-989-8425-36-2, pages 143-148. DOI: 10.5220/0003143201430148


in Bibtex Style

@conference{bioinformatics11,
author={Leonor Rib and Ricard Gavaldà and Jose Manuel Soria and Alfonso Buil},
title={SalamboMiner - A Biomedical Literature Mining Tool for Inferring the Genetics of Complex Diseases},
booktitle={Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2011)},
year={2011},
pages={143-148},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003143201430148},
isbn={978-989-8425-36-2},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2011)
TI - SalamboMiner - A Biomedical Literature Mining Tool for Inferring the Genetics of Complex Diseases
SN - 978-989-8425-36-2
AU - Rib L.
AU - Gavaldà R.
AU - Manuel Soria J.
AU - Buil A.
PY - 2011
SP - 143
EP - 148
DO - 10.5220/0003143201430148