A Distributed Ranking Algorithm for the iTrust Information Search and Retrieval System

Boyang Peng, L. E. Moser, P. M. Melliar-Smith, Y. T. Chuang, I. Michel Lombera

2013

Abstract

iTrust is a decentralized, distributed system for publishing, searching, and retrieving information over the Internet and the Web, designed to make it difficult to censor or filter information. In the distributed ranking algorithm for iTrust presented in this paper, a source node that publishes a document indexes the words in the document and produces a term-frequency table for it. A requesting node that issues a query and receives a response uses the URL in the response to retrieve the term-frequency table from the source node. The requesting node then applies a ranking formula to the term-frequency tables from multiple source nodes to score the documents with respect to its query. Our evaluations demonstrate that the distributed ranking algorithm ranks documents stably and that it counters scamming by malicious nodes.
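
The flow described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the ranking formula, so the code below substitutes a generic length-normalized TF-IDF score as a stand-in; it should not be read as the paper's formula. The function names, the tokenizer, and the example URLs are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def term_frequency_table(text):
    """Source-node side: index a document's words into a term -> count table."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(words)


def score_documents(query, tables):
    """Requesting-node side: score retrieved term-frequency tables for a query.

    ASSUMPTION: a generic TF-IDF-style score is used here as a placeholder;
    the paper's actual ranking formula is not given in the abstract.
    """
    query_terms = re.findall(r"[a-z0-9]+", query.lower())
    n_docs = len(tables)
    scores = {}
    for url, table in tables.items():
        doc_len = sum(table.values()) or 1  # avoid division by zero
        score = 0.0
        for term in query_terms:
            tf = table.get(term, 0) / doc_len  # length-normalized frequency
            df = sum(1 for t in tables.values() if term in t)
            idf = math.log(1 + n_docs / (1 + df))
            score += tf * idf
        scores[url] = score
    # Highest-scoring documents first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical documents published by three source nodes.
documents = {
    "http://node-a.example/doc1": "censorship resistant search and retrieval of information",
    "http://node-b.example/doc2": "a recipe for bread",
    "http://node-c.example/doc3": "search search search",
}

# Each source node builds its table; the requesting node fetches them by URL.
tables = {url: term_frequency_table(text) for url, text in documents.items()}
ranked = score_documents("search", tables)

for url, score in ranked:
    print(f"{score:.3f}  {url}")
```

In this sketch the requesting node performs all scoring locally from the tables it retrieves, which mirrors the division of work in the abstract: source nodes only index, and requesting nodes only rank.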


Paper Citation


in Harvard Style

Peng B., Moser L., Melliar-Smith P., Chuang Y. and Michel Lombera I. (2013). A Distributed Ranking Algorithm for the iTrust Information Search and Retrieval System. In Proceedings of the 9th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST, ISBN 978-989-8565-54-9, pages 199-208. DOI: 10.5220/0004355601990208


in Bibtex Style

@conference{webist13,
author={Boyang Peng and L. E. Moser and P. M. Melliar-Smith and Y. T. Chuang and I. Michel Lombera},
title={A Distributed Ranking Algorithm for the iTrust Information Search and Retrieval System},
booktitle={Proceedings of the 9th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST},
year={2013},
pages={199-208},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004355601990208},
isbn={978-989-8565-54-9},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 9th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST
TI - A Distributed Ranking Algorithm for the iTrust Information Search and Retrieval System
SN - 978-989-8565-54-9
AU - Peng B.
AU - Moser L.
AU - Melliar-Smith P.
AU - Chuang Y.
AU - Michel Lombera I.
PY - 2013
SP - 199
EP - 208
DO - 10.5220/0004355601990208