MINING NON-TAXONOMIC CONCEPT PAIRS FROM UNSTRUCTURED TEXT - A Concept Correlation Search Framework

Mei Kuan Wong, Syed Sibte Raza Abidi, Ian D. Jonsen

Abstract

An ontology consists of concepts, taxonomic relations and non-taxonomic relations. Most ontology learning tools focus on discovering concepts and taxonomic relations; very little effort has been put into discovering non-taxonomic relations. In this paper, we present a concept correlation search framework to discover non-taxonomic concept pairs from unstructured text. Our framework features (a) the extraction of correlated concepts beyond the ordinary search window of a single sentence; (b) the use of lift as the interestingness measure for association rule mining; (c) the harnessing of 2-itemset association rules from n-itemset association rules where n > 2; and (d) the identification of non-taxonomic concept pairs based on an existing domain ontology. The proposed framework has been tested with Fisheries Oceanography journals, and the results demonstrate significant improvements over the traditional association rule approach in the search for non-taxonomic concept pairs.
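
As an illustration of features (a) and (b), the following is a minimal sketch of scoring concept pairs by lift, where lift(A, B) = P(A, B) / (P(A) P(B)) and a value above 1 suggests positive correlation. It is not the authors' implementation: the function names, the sliding-window helper and the toy concepts are assumptions made for this example, and each "transaction" is assumed to be the set of concepts found in one search window.

```python
from collections import Counter
from itertools import combinations


def windows(sentences, size):
    """Hypothetical helper: merge `size` consecutive sentences into one
    transaction, so that pairs spanning sentence boundaries can co-occur."""
    last_start = max(len(sentences) - size + 1, 1)
    return [
        [concept for sentence in sentences[i:i + size] for concept in sentence]
        for i in range(last_start)
    ]


def lift_for_pairs(transactions):
    """Return {(a, b): lift} for every concept pair that co-occurs at least once.

    lift(a, b) = P(a, b) / (P(a) * P(b)), estimated from transaction counts.
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for concepts in transactions:
        unique = sorted(set(concepts))  # count each concept once per window
        item_counts.update(unique)
        pair_counts.update(combinations(unique, 2))

    lifts = {}
    for (a, b), joint in pair_counts.items():
        p_joint = joint / n
        p_a = item_counts[a] / n
        p_b = item_counts[b] / n
        lifts[(a, b)] = p_joint / (p_a * p_b)
    return lifts


# Toy data (invented for illustration): concepts found in four windows.
transactions = [
    ["cod", "temperature"],
    ["cod", "temperature"],
    ["cod", "salinity"],
    ["salinity"],
]
scores = lift_for_pairs(transactions)
```

In this toy data, the pair ("cod", "temperature") has a lift above 1 and ("cod", "salinity") has a lift below 1. Filtering the scored pairs against an existing domain ontology, as in feature (d), would be a separate step that this sketch does not cover.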

Paper Citation


in Harvard Style

Wong M. K., Abidi S. S. R. and Jonsen I. D. (2011). MINING NON-TAXONOMIC CONCEPT PAIRS FROM UNSTRUCTURED TEXT - A Concept Correlation Search Framework. In Proceedings of the 7th International Conference on Web Information Systems and Technologies - Volume 1: WTM, (WEBIST 2011) ISBN 978-989-8425-51-5, pages 707-716. DOI: 10.5220/0003482707070716


in Bibtex Style

@conference{wtm11,
author={Mei Kuan Wong and Syed Sibte Raza Abidi and Ian D. Jonsen},
title={MINING NON-TAXONOMIC CONCEPT PAIRS FROM UNSTRUCTURED TEXT - A Concept Correlation Search Framework},
booktitle={Proceedings of the 7th International Conference on Web Information Systems and Technologies - Volume 1: WTM, (WEBIST 2011)},
year={2011},
pages={707-716},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003482707070716},
isbn={978-989-8425-51-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 7th International Conference on Web Information Systems and Technologies - Volume 1: WTM, (WEBIST 2011)
TI - MINING NON-TAXONOMIC CONCEPT PAIRS FROM UNSTRUCTURED TEXT - A Concept Correlation Search Framework
SN - 978-989-8425-51-5
AU - Wong, M. K.
AU - Abidi, S. S. R.
AU - Jonsen, I. D.
PY - 2011
SP - 707
EP - 716
DO - 10.5220/0003482707070716