Tomohiro Manabe
Keishi Tajima
Kyoto University, Japan
Subtopic Mining, Hierarchical Heading Structure, Web Search, Search Result Diversification, Search Intent.
Enterprise Information Systems
Recommendation Systems
Software Agents and Internet Computing
We propose methods for generating diversified rankings of subtopics of keyword queries. Our methods are
characterized by their awareness of hierarchical heading structure in documents. The structure consists of
nested logical blocks with headings. Each heading concisely describes the topic of its corresponding block.
Therefore, hierarchical headings in documents reflect the hierarchical topics referred to in the documents.
Based on this idea, our methods score subtopic candidates by matching them against the hierarchical
headings in documents. They give higher scores to candidates that match hierarchical headings associated
with more content. To diversify the resulting rankings, each time our methods adopt the candidate with the
best score, they exclude the blocks matching that candidate and re-score all remaining blocks and candidates.
According to our evaluation results on the NTCIR data set, our methods generated significantly
better subtopic rankings than the query completion results of major commercial search engines.
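
The greedy score-and-exclude loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `Block` type, the term-containment rule in `matches`, and the use of block length as the amount of content are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    # Hypothetical representation: heading path from the document root to this block.
    headings: tuple[str, ...]
    # Assumed proxy for "amount of content", e.g. a character count.
    length: int


def matches(candidate: str, block: Block) -> bool:
    # Assumed matching rule: every term of the candidate appears somewhere
    # in the block's hierarchical headings.
    heading_terms = set(" ".join(block.headings).lower().split())
    return all(term in heading_terms for term in candidate.lower().split())


def score(candidate: str, blocks: list[Block]) -> int:
    # A candidate scores higher when its matching headings cover more content.
    return sum(b.length for b in blocks if matches(candidate, b))


def rank_subtopics(candidates: list[str], blocks: list[Block]) -> list[str]:
    remaining_candidates = list(candidates)
    remaining_blocks = list(blocks)
    ranking = []
    while remaining_candidates:
        # Re-score every remaining candidate against the remaining blocks.
        # Ties are broken by input order, since max returns the first maximum.
        best = max(remaining_candidates, key=lambda c: score(c, remaining_blocks))
        ranking.append(best)
        remaining_candidates.remove(best)
        # Diversification step: drop the blocks already covered by the adopted candidate.
        remaining_blocks = [b for b in remaining_blocks if not matches(best, b)]
    return ranking


if __name__ == "__main__":
    blocks = [
        Block(("kyoto", "kyoto hotels"), 500),
        Block(("kyoto", "kyoto hotels", "cheap hotels"), 450),
        Block(("kyoto", "kyoto temples"), 400),
    ]
    candidates = ["kyoto hotels", "kyoto cheap hotels", "kyoto temples"]
    print(rank_subtopics(candidates, blocks))
```

In this toy input, ranking by the initial scores alone would place "kyoto cheap hotels" (450) above "kyoto temples" (400). Once "kyoto hotels" is adopted and its blocks are excluded, the score of "kyoto cheap hotels" falls to 0, so "kyoto temples" moves up to second place.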