BOOSTING ITEM FINDABILITY: BRIDGING THE SEMANTIC GAP BETWEEN SEARCH PHRASES AND ITEM INFORMATION

Hasan Davulcu, Hung V. Nguyen, Viswanathan Ramachandran

Abstract

Most search engines perform text query and retrieval based on keyword phrases. However, publishers cannot anticipate all the ways in which users will search for the items in their documents. In fact, there is often no direct keyword match between a search phrase and the descriptions of items that are perfect “hits” for the search. We present a highly automated solution to the problem of bridging the semantic gap between item information and search phrases. Our system learns rule-based definitions for search phrases with dynamic connotations by extracting structured item information from product catalogs and applying a frequent itemset mining algorithm. We present experimental results for a realistic e-commerce domain. We also compare our rule-mining approach to vector-based relevance feedback retrieval techniques and show that our system yields definitions that are easier to validate and that perform better.
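
As a rough illustration of the frequent itemset mining step mentioned above, the following minimal sketch mines attribute combinations that recur among items judged relevant to a search phrase. It is not the authors' implementation: the abstract does not say which mining algorithm the system uses, so a simple level-wise (Apriori-style) search is assumed here, and the function name `frequent_itemsets`, the attribute strings, and the support threshold are all hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search, assumed here for illustration.

    Returns a dict mapping each itemset (frozenset) to its support, i.e. the
    fraction of transactions containing it, for supports >= min_support.
    """
    n = len(transactions)
    # Level 1 candidates: every single attribute seen in the data.
    candidates = {frozenset([a]) for t in transactions for a in t}
    result = {}
    k = 1
    while candidates:
        # Keep the candidates that meet the support threshold.
        level = {}
        for c in candidates:
            support = sum(1 for t in transactions if c <= t) / n
            if support >= min_support:
                level[c] = support
        result.update(level)
        # Join frequent k-itemsets into (k+1)-candidates, pruning any
        # candidate that has an infrequent k-subset.
        k += 1
        candidates = set()
        for a, b in combinations(list(level), 2):
            u = a | b
            if len(u) == k and all(
                frozenset(s) in level for s in combinations(u, k - 1)
            ):
                candidates.add(u)
    return result


# Hypothetical structured item information for items judged relevant to
# one search phrase (attribute names and values are made up).
relevant_items = [
    frozenset({"brand=A", "style=sneaker", "color=white"}),
    frozenset({"brand=A", "style=sneaker", "color=black"}),
    frozenset({"brand=B", "style=sneaker", "color=white"}),
    frozenset({"brand=A", "style=boot", "color=black"}),
]

itemsets = frequent_itemsets(relevant_items, min_support=0.5)
for itemset, support in sorted(itemsets.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), support)
```

Each frequent itemset, such as the pair of `brand=A` and `style=sneaker`, can be read as a candidate conjunctive rule for the phrase, which a person can inspect and validate directly.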



Paper Citation


in Harvard Style

Davulcu H., Nguyen H. V. and Ramachandran V. (2005). BOOSTING ITEM FINDABILITY: BRIDGING THE SEMANTIC GAP BETWEEN SEARCH PHRASES AND ITEM INFORMATION. In Proceedings of the Seventh International Conference on Enterprise Information Systems - Volume 4: ICEIS, ISBN 972-8865-19-8, pages 48-55. DOI: 10.5220/0002525800480055


in Bibtex Style

@conference{iceis05,
author={Hasan Davulcu and Hung V. Nguyen and Viswanathan Ramachandran},
title={BOOSTING ITEM FINDABILITY: BRIDGING THE SEMANTIC GAP BETWEEN SEARCH PHRASES AND ITEM INFORMATION},
booktitle={Proceedings of the Seventh International Conference on Enterprise Information Systems - Volume 4: ICEIS},
year={2005},
pages={48-55},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002525800480055},
isbn={972-8865-19-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Seventh International Conference on Enterprise Information Systems - Volume 4: ICEIS
TI - BOOSTING ITEM FINDABILITY: BRIDGING THE SEMANTIC GAP BETWEEN SEARCH PHRASES AND ITEM INFORMATION
SN - 972-8865-19-8
AU - Davulcu H.
AU - Nguyen H. V.
AU - Ramachandran V.
PY - 2005
SP - 48
EP - 55
DO - 10.5220/0002525800480055