Amin Mantrach, Jean-Michel Renders


The growing importance of social media and heterogeneous relational data highlights the fundamental problem of efficiently combining different sources of evidence (or modes). In this work, we consider the problem of people retrieval, where the requested information consists of persons rather than documents. The queries generally contain both textual keywords and social links, while the target collection consists of a set of documents with social metadata. Traditional approaches tackle this problem by early or late fusion where, typically, a person is represented by two sets of features: a word profile and a contact/link profile. Inspired by cross-modal similarity measures initially designed to combine image and text, we propose new ways of combining social and content aspects for retrieving people from a collection of documents with social metadata. To this end, we define a set of multimodal similarity measures between socially-labelled documents and queries, which can then be aggregated at the person level to provide a final relevance score for the general people retrieval problem. We then examine particular instances of this problem: author retrieval, recipient recommendation and alias detection. Experiments conducted on the ENRON email collection show the benefits of the proposed approach with respect to more standard fusion and aggregation methods.
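
To make the baseline concrete, the following is a minimal sketch of the standard late-fusion approach mentioned above, followed by aggregation at the person level. It is not the cross-modal measures proposed in this paper. The dictionary layout, the function names (`cosine`, `late_fusion_score`, `rank_people`), the weight `alpha`, and the choice of summing document scores per author are all assumptions made for illustration.

```python
from collections import defaultdict
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(k, 0.0) for k, w in a.items())
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def late_fusion_score(query, doc, alpha=0.5):
    """Weighted sum of textual and social similarity (hypothetical weighting)."""
    text_sim = cosine(query["words"], doc["words"])
    social_sim = cosine(query["contacts"], doc["contacts"])
    return alpha * text_sim + (1.0 - alpha) * social_sim


def rank_people(query, docs, alpha=0.5):
    """Aggregate document scores per author by summation (one possible choice)."""
    scores = defaultdict(float)
    for doc in docs:
        scores[doc["author"]] += late_fusion_score(query, doc, alpha)
    # Highest aggregated score first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data, invented for this example only.
docs = [
    {"author": "alice", "words": {"gas": 1.0, "contract": 1.0}, "contacts": {"bob": 1.0}},
    {"author": "carol", "words": {"lunch": 1.0}, "contacts": {"dave": 1.0}},
]
query = {"words": {"gas": 1.0}, "contacts": {"bob": 1.0}}
ranking = rank_people(query, docs)
```

In this sketch the two modes are scored independently and only combined at the end, which is what distinguishes late fusion from measures that let one mode influence the similarity computed in the other.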


