Authors: Joo Yong Lee 1 ; Sang Ho Lee 1 and Yanggon Kim 2

Affiliations: 1 School of Computing, Soongsil University, Republic of Korea ; 2 Computer and Information Sciences, Towson University, United States

ISBN: 978-989-8111-11-1

Keyword(s): Web crawler, Parallel crawler, Scalability, Web database.

Related Ontology Subjects/Areas/Topics: Cloud Computing ; Collaboration and e-Services ; Data Engineering ; e-Business ; Enterprise Information Systems ; Mobile Software and Services ; Ontologies and the Semantic Web ; Services Science ; Software Agents and Internet Computing ; Software Engineering ; Software Engineering Methods and Techniques ; Telecommunications ; Web Services ; Wireless Information Networks and Systems

Abstract: As the size of the Web grows, it becomes increasingly important to parallelize a crawling process in order to complete downloading pages in a reasonable amount of time. This paper presents the design and implementation of an effective parallel web crawler. We first present various design choices and strategies for a parallel web crawler, and describe our crawler’s architecture and implementation techniques. In particular, we investigate the URL distributor for URL balancing and the scalability of our crawler.
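The abstract does not say how the URL distributor balances URLs, so the following is only a minimal sketch of the general idea, assuming a host-hash partitioning scheme that is common in parallel crawlers and is not necessarily SCRAWLER's seed-by-seed approach. The names `assign_crawler` and `distribute` are hypothetical and do not come from the paper.

```python
import hashlib
from urllib.parse import urlsplit


def assign_crawler(url: str, num_crawlers: int) -> int:
    """Map a URL to a crawler index by hashing its host name.

    Assumption: host-based hashing, so every page of one site goes to
    the same crawler process. This is an illustrative scheme only.
    """
    host = urlsplit(url).hostname or ""
    digest = hashlib.sha256(host.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as a stable integer key.
    return int.from_bytes(digest[:8], "big") % num_crawlers


def distribute(urls, num_crawlers):
    """Group URLs into one queue per crawler process."""
    queues = [[] for _ in range(num_crawlers)]
    for url in urls:
        queues[assign_crawler(url, num_crawlers)].append(url)
    return queues


if __name__ == "__main__":
    sample = [
        "http://example.com/a",
        "http://example.com/b",
        "http://example.org/index.html",
        "http://example.net/",
    ]
    for index, queue in enumerate(distribute(sample, 3)):
        print(index, queue)
```

Hashing the host rather than the full URL keeps each site on a single crawler, which simplifies per-site politeness but can leave the load uneven when a few hosts dominate; a real distributor would need some rebalancing on top of this.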

CC BY-NC-ND 4.0

Paper citation in several formats:
Lee J. Y.; Lee S. H. and Kim Y. (2007). SCRAWLER: A SEED-BY-SEED PARALLEL WEB CRAWLER. In Proceedings of the Second International Conference on e-Business - Volume 1: ICE-B, (ICETE 2007), ISBN 978-989-8111-11-1, pages 151-156. DOI: 10.5220/0002108701510156

@conference{ice-b07,
author={Joo Yong Lee and Sang Ho Lee and Yanggon Kim},
title={SCRAWLER: A SEED-BY-SEED PARALLEL WEB CRAWLER},
booktitle={Proceedings of the Second International Conference on e-Business - Volume 1: ICE-B, (ICETE 2007)},
year={2007},
pages={151--156},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002108701510156},
isbn={978-989-8111-11-1},
}

TY - CONF

JO - Proceedings of the Second International Conference on e-Business - Volume 1: ICE-B, (ICETE 2007)
TI - SCRAWLER: A SEED-BY-SEED PARALLEL WEB CRAWLER
SN - 978-989-8111-11-1
AU - Lee, J. Y.
AU - Lee, S. H.
AU - Kim, Y.
PY - 2007
SP - 151
EP - 156
DO - 10.5220/0002108701510156
