Authors:
Hung-Yu Kao 1; Chia-Sheng Liu 1; 1; Chia-Chun Shih 2 and Tse-Ming Tse-Ming 2
Affiliations:
1 National Cheng Kung University, Taiwan; 2 Innovative Digitech-Enabled Applications & Services Institute (IDEAS), Institute for Information Industry, Taiwan
Keyword(s):
Search engine, Link Analysis, PageRank, Web Graph, Hierarchical Structure, Page Quality.
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; Knowledge Discovery and Information Retrieval; Knowledge-Based Systems; Searching and Browsing; Soft Computing; Symbolic Systems; Web Information Systems and Technologies; Web Interfaces and Applications; Web Mining
Abstract:
Most search engines today use link analysis algorithms to measure the importance of web pages, and the best known of these is PageRank. However, recent studies have found that PageRank has an inherent bias against newly created pages. A ranking algorithm called DRank was previously proposed to address this issue. It exploits the clustering of PageRank values among pages in the same directory to predict the future importance of pages and to reduce the bias of search engines against new pages. In this paper, we modify the original DRank algorithm to address its main weakness: it can fail when a directory contains too few pages. In our experiments, the augmented algorithm, DRank+, predicts the importance scores of pages at the next time stage more accurately than the original DRank algorithm. DRank+ not only alleviates the bias against newly created pages but also predicts the importance of newly created pages more accurately than both Page Quality and the original DRank.
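The bias the abstract refers to can be illustrated with a minimal sketch of standard PageRank by power iteration. The toy graph, the page paths, the damping factor of 0.85, and the `directory_mean` helper are all assumptions made for illustration; in particular, the per-directory average only hints at the directory-level signal the abstract mentions and is not the DRank or DRank+ formula.

```python
from collections import defaultdict

def pagerank(links, damping=0.85, iterations=100):
    """Power-iteration PageRank over a dict {page: [outgoing links]}."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages without out-links is spread uniformly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for page, targets in links.items():
            for target in targets:
                new_rank[target] += damping * rank[page] / len(targets)
        rank = new_rank
    return rank

def directory_mean(rank):
    """Average PageRank per directory (illustrative only, not the DRank formula)."""
    groups = defaultdict(list)
    for page, score in rank.items():
        groups[page.rsplit("/", 1)[0]].append(score)
    return {d: sum(s) / len(s) for d, s in groups.items()}

# Hypothetical toy graph: "/news/new.html" was just created and has one in-link.
links = {
    "/news/a.html": ["/news/b.html", "/docs/x.html"],
    "/news/b.html": ["/news/a.html", "/news/new.html"],
    "/docs/x.html": ["/news/a.html", "/news/b.html"],
    "/news/new.html": ["/news/a.html"],
}
rank = pagerank(links)
means = directory_mean(rank)
```

In this toy graph the newly created page receives the lowest score of all pages, while the average score of its directory is higher, which is the kind of gap a directory-based predictor could try to exploit.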