Improvement of n-ary Relation Extraction by Adding Lexical Semantics to Distant-Supervision Rule Learning

Hong Li, Sebastian Krause, Feiyu Xu, Andrea Moro, Hans Uszkoreit, Roberto Navigli

2015

Abstract

A new method is proposed and evaluated that improves distantly supervised learning of pattern rules for n-ary relation extraction. The method employs knowledge from a large lexical semantic repository to guide the discovery of patterns in parsed relation mentions. It extends the induced rules to semantically relevant material outside the minimal subtree containing the shortest paths connecting the relation entities, and it discards rules without any explicit semantic content. It significantly raises both recall and precision, yielding a boost of roughly 20% in F-measure over a baseline system that does not use lexical semantic information.
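
The two operations described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the dependency tree is a hand-written toy parse, the function names are hypothetical, and the set `RELEVANT` is an assumed stand-in for relation-relevant terms that the paper draws from a lexical semantic repository. The sketch computes the minimal subtree connecting the entity tokens, extends it with directly attached relevant words, and checks whether a candidate rule carries any semantic content.

```python
# Toy parse of "Ann met Bob at their wedding" (assumed, UD-like attachment).
# heads[i] is the index of token i's head; -1 marks the root.
lemmas = ["Ann", "meet", "Bob", "at", "their", "wedding"]
heads = [1, -1, 1, 5, 5, 1]
entities = [0, 2]                   # token indices of the relation arguments
RELEVANT = {"wedding", "marry"}     # hypothetical relation-relevant lemmas


def path_to_root(heads, node):
    """Return the nodes from `node` up to the root, inclusive."""
    path = [node]
    while heads[path[-1]] != -1:
        path.append(heads[path[-1]])
    return path


def minimal_subtree(heads, entities):
    """Nodes on the shortest paths connecting all entity tokens."""
    paths = [path_to_root(heads, e) for e in entities]
    common = set(paths[0]).intersection(*map(set, paths[1:]))
    # Lowest common ancestor: first node on one path that all paths share.
    lca = next(n for n in paths[0] if n in common)
    nodes = set()
    for p in paths:
        nodes.update(p[: p.index(lca) + 1])
    return nodes


def extend_with_semantics(heads, lemmas, nodes, relevant):
    """Add tokens attached directly to the subtree whose lemma is relevant."""
    extended = set(nodes)
    for i, head in enumerate(heads):
        if i not in nodes and head in nodes and lemmas[i] in relevant:
            extended.add(i)
    return extended


def has_semantic_content(lemmas, nodes, entities, relevant):
    """True if some non-entity token of the rule has a relevant lemma."""
    return any(lemmas[i] in relevant for i in nodes if i not in entities)


base = minimal_subtree(heads, entities)
extended = extend_with_semantics(heads, lemmas, base, RELEVANT)
keep_base = has_semantic_content(lemmas, base, entities, RELEVANT)
keep_extended = has_semantic_content(lemmas, extended, entities, RELEVANT)
```

In this toy case the minimal subtree covers only "Ann met Bob", which contains no relation-relevant term and would be discarded by the filter, while the extended pattern also includes "wedding" and would be kept.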



Paper Citation


in Harvard Style

Li H., Krause S., Xu F., Moro A., Uszkoreit H. and Navigli R. (2015). Improvement of n-ary Relation Extraction by Adding Lexical Semantics to Distant-Supervision Rule Learning. In Proceedings of the International Conference on Agents and Artificial Intelligence - Volume 2: ICAART, ISBN 978-989-758-074-1, pages 317-324. DOI: 10.5220/0005187303170324


in Bibtex Style

@conference{icaart15,
author={Hong Li and Sebastian Krause and Feiyu Xu and Andrea Moro and Hans Uszkoreit and Roberto Navigli},
title={Improvement of n-ary Relation Extraction by Adding Lexical Semantics to Distant-Supervision Rule Learning},
booktitle={Proceedings of the International Conference on Agents and Artificial Intelligence - Volume 2: ICAART},
year={2015},
pages={317-324},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005187303170324},
isbn={978-989-758-074-1},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Agents and Artificial Intelligence - Volume 2: ICAART
TI - Improvement of n-ary Relation Extraction by Adding Lexical Semantics to Distant-Supervision Rule Learning
SN - 978-989-758-074-1
AU - Li H.
AU - Krause S.
AU - Xu F.
AU - Moro A.
AU - Uszkoreit H.
AU - Navigli R.
PY - 2015
SP - 317
EP - 324
DO - 10.5220/0005187303170324