FINDING THE RIGHT EXPERT - Discriminative Models for Expert Retrieval

Philipp Sorg, Philipp Cimiano

Abstract

We tackle the problem of expert retrieval in Social Question Answering (SQA) sites. In particular, given an information need in the form of a question posted on an SQA site, we consider the task of ranking potential experts according to the likelihood that they can answer the question. We propose a discriminative model (DM) that combines different sources of evidence in a single retrieval model using machine learning techniques. The input to the discriminative model comprises features derived from language models, standard probabilistic retrieval functions, and features quantifying the popularity of an expert in the category of the question. As part of this input, we propose a novel feature design that makes it possible to exploit language models as features. We evaluate our approach on a dataset extracted from Yahoo! Answers, recently used as a benchmark in the CriES Workshop, and show that it outperforms (i) standard probabilistic retrieval models, (ii) a state-of-the-art expert retrieval approach based on language models, and (iii) an established learning-to-rank model.
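
The idea of combining several evidence sources in one discriminative ranker can be illustrated with a minimal sketch. The abstract does not name the classifier or the exact feature definitions, so everything below is an assumption made for illustration: logistic regression stands in for the discriminative model, each candidate is described by three hypothetical features (a language-model score, a probabilistic retrieval score, and a category popularity score, all scaled to [0, 1]), and the training data and expert names are toy values, not taken from the paper.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, epochs=500, lr=0.5):
    """Fit logistic-regression weights by batch gradient descent on log loss.

    The returned list holds one weight per feature followed by a bias term.
    """
    n_features = len(rows[0])
    weights = [0.0] * (n_features + 1)
    for _ in range(epochs):
        grad = [0.0] * (n_features + 1)
        for x, y in zip(rows, labels):
            p = sigmoid(sum(w * v for w, v in zip(weights, x)) + weights[-1])
            err = p - y
            for i, v in enumerate(x):
                grad[i] += err * v
            grad[-1] += err  # gradient of the bias term
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights


def score(weights, x):
    """Estimated probability that a candidate can answer the question."""
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + weights[-1])


def rank_experts(weights, candidates):
    """Return candidate names sorted by descending model score."""
    return sorted(candidates, key=lambda name: score(weights, candidates[name]),
                  reverse=True)


# Hypothetical feature vectors: [lm_score, retrieval_score, popularity].
train_rows = [
    [0.9, 0.8, 0.7], [0.7, 0.9, 0.6], [0.8, 0.6, 0.9],  # answered the question
    [0.2, 0.1, 0.3], [0.3, 0.2, 0.1], [0.1, 0.3, 0.2],  # did not answer
]
train_labels = [1, 1, 1, 0, 0, 0]

weights = train_logistic(train_rows, train_labels)

# Hypothetical candidate experts for one new question.
candidates = {
    "alice": [0.85, 0.70, 0.80],
    "bob": [0.40, 0.50, 0.30],
    "carol": [0.15, 0.20, 0.10],
}
ranking = rank_experts(weights, candidates)
print(ranking)
```

The sketch only shows the general pattern of learning one scoring function over heterogeneous features and sorting candidates by it; the feature design and learning setup actually used in the paper may differ.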


Paper Citation


in Harvard Style

Sorg P. and Cimiano P. (2011). FINDING THE RIGHT EXPERT - Discriminative Models for Expert Retrieval. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011), ISBN 978-989-8425-79-9, pages 182-191. DOI: 10.5220/0003650501900199


in Bibtex Style

@conference{kdir11,
author={Philipp Sorg and Philipp Cimiano},
title={FINDING THE RIGHT EXPERT - Discriminative Models for Expert Retrieval},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011)},
year={2011},
pages={182-191},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003650501900199},
isbn={978-989-8425-79-9},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011)
TI - FINDING THE RIGHT EXPERT - Discriminative Models for Expert Retrieval
SN - 978-989-8425-79-9
AU - Sorg P.
AU - Cimiano P.
PY - 2011
SP - 182
EP - 191
DO - 10.5220/0003650501900199