PEER-TO-PEER NETWORK SIMULATION
Nyik San Ting, Ralph Deters
Department of Computer Science, University of Saskatchewan, 57 Campus Drive, Saskatoon, 5A9S7 Canada
Keywords: Peer-to-Peer Networks, Simulation
Abstract: Peer-to-Peer (p2p) networks are the latest addition to the already large distributed systems family. With a
strong emphasis on self-organization, decentralization and autonomy of the participating nodes, p2p-networks
tend to be more scalable, robust and adaptive than other forms of distributed systems. The much-publicized
success of p2p-networks for file-sharing and cycle-sharing has resulted in increased awareness of
and interest in p2p protocols and applications. However, p2p-networks are difficult to study due to
their size and the complex interdependencies between users, application, protocol and network. This paper
has two aims. First, it reviews existing p2p-network simulators and makes a case for our own
simulator, named 3LS (3-Level-Simulator). Second, it presents our current view that there is a need for more
realistic/complex models in p2p-network simulation, since ignoring the underlying network, topology and/or
the behaviour of applications can result in misleading simulation results.
1 INTRODUCTION
The field of P2P networks is still undergoing major
changes with new applications and protocols
emerging on a nearly monthly basis. However, due
to the difficulties in evaluating them prior to their
large-scale deployments, they are often short-lived –
disappearing as fast as they emerge – normally due
to bad performance. What works well in a controlled
lab environment, using a small number of nodes,
high bandwidth, low latency and highly cooperative
users often fails in real world deployments due to
lack of bandwidth, heterogeneity and uncooperative
users.
Testing a system’s performance prior to its
deployment is a fairly common element in the
software development of applications. Pawlikowski
(Pawlikowski, 2002) identifies two main possible
experimentation streams: experimentation with the
actual system and experimentation with a model
(physical or abstract) of the system.
P2P networks tend to be large, heterogeneous
systems with complex interactions between the
physical machines, underlying network, application
and users. Hence, testing of a “running” p2p-
network or protocol in a realistic environment is
often not feasible. However, it is possible to use a
simulation of a p2p-network to evaluate the
applications and protocols in a controlled
environment. Since a simulation requires the
creation/definition of a model that serves as an
abstraction of the real p2p-network, the issues of
model-scope and model-detail arise. What should be
included and what can be ignored? Can the physical
network be ignored? Is the topology of the p2p-
network important? And last but not least, what
workloads should be chosen? Especially with the
emergence of more complex p2p protocols and
applications that emphasize self-organization due to
network, loads or past experiences, this issue is
becoming more important. Consequently, a p2p
simulation model needs to reflect the dependencies
between the users, application, p2p-protocol, p2p-
network topology and physical network.
The remaining part of this paper is structured as
follows. Section 2 reviews existing p2p-network
simulators. Based on the shortcomings of existing
simulators, section 3 presents a novel multi-level
simulator called 3LS (3-Level-Simulator). This
simulator is evaluated in section 4 and the paper
concludes with a summary in section 5.
2 P2P NETWORK SIMULATORS
As mentioned earlier, p2p-network simulators are
one option in studying the behaviour of P2P
systems. To date only a few p2p simulators have
been implemented to aid the research on application
and protocol development.
San Ting N. and Deters R. (2004).
PEER-TO-PEER NETWORK SIMULATION.
In Proceedings of the Sixth International Conference on Enterprise Information Systems, pages 84-91
DOI: 10.5220/0002624800840091
Copyright © SciTePress
2.1 Evaluation Criteria
There are two necessary conditions for obtaining
credible results from a simulator: using a valid
simulation program and executing a valid simulation
experiment. “A simulation program is valid, if it is a
verified computer program of a valid simulation
model” (Pawlikowski, 2002). A validated simulation
model is a model that is a satisfactorily accurate
approximation of the system under study.
Table 1 contains the criteria considered important
in evaluating a P2P simulator.
Table 1: Criteria
Criteria          Definition
Usability         How easy are the use of the simulator, the preparation of
                  input data and the extraction of simulator data?
Extensibility     How easily can the simulator be extended to handle
                  modified/new models, e.g. a new p2p protocol?
Configurability   How easy is the customization of the simulation?
Interoperability  Can the simulator be used to interoperate with other
                  applications?
2.2 Serapis
Serapis (Sandberg, Serapis, 2001) is possibly one of
the earliest p2p-network simulators. It was designed
to allow the evaluation of different caching
algorithms for the FreeNet protocol. Serapis has
been extended over time to simulate the Gnutella
protocol; however, the work has halted and the
extension has not been completed (the latest update was
made in November 2001). Serapis focused on
“static network designs with different connectivity
patterns and routing algorithms” (Joseph, 2001). It
has been shown that the results of simulations on
FreeNet obtained using Serapis were inaccurate and
that the simulator fails to simulate the actual stresses
and strains of a live deployment in an accurate
manner (Joseph, FreeNet, 2001).
2.3 NeuroGrid Simulator
NeuroGrid (Joseph, 2001) (Neurogrid, 2003) is a
p2p-network simulator designed to study search
operations for FreeNet, Gnutella and the NeuroGrid
protocols. The NeuroGrid simulator is a single-
threaded discrete event simulator that can be
customized using configuration/properties files.
Among the parameters users can control are the
type of protocol to simulate, the number of searches
to simulate and the preferred user interface. The
simulation data (e.g. number of messages parsed and
the states of the simulation) can be saved into files
for later analysis. NeuroGrid assumes that the
distance between nodes is constant and ignores
latency, congestion and bandwidth issues. After a
search message is sent out, the nodes that received
the messages take turns in forwarding the messages
(due to the single-threaded design of the simulator).
Only after the current search message has been served is
it possible to launch a new message
(sequential execution of search requests).
NeuroGrid is still very much a work in progress
and efforts are made to improve the level of detail in
the network models. In its most recent form, the
simulator enables the users to specify the number of
nodes to simulate (this is also the number of nodes
to add to the current simulation after a number of
searches is done), the initial number of connections
for each node, the number of searches to be
generated, and the initial network topology (only
ring or random networks).
Since NeuroGrid is designed to simulate the
searching algorithms of different protocols, it is
necessary to let the user specify the number of
keywords used for the simulation, the number of the
documents used for the simulation, the number of
keywords per document and the number of
documents stored on each node (document and
keyword assignments are all randomized). The latest
release of NeuroGrid (version 0.1.4) has included
the simulation of resource-limited nodes and
introduced the concept of the dishonest node.
The simulator can be customized to the needs of a
user-defined p2p-application based on the above-mentioned
protocols by extending the classes
provided.
2.4 FreeNet Simulator
The FreeNet simulator (Pfeifer, 2002) was designed
to analyze different caching algorithms for the
FreeNet protocol. It uses a two-step mechanism for
event handling that allows multiple
messages to be sent at a time. In the first step, the
simulator will move the messages from a temporary
storage space to a queue of the node that will
process them in the next iteration. In the second step,
the simulator will process the messages queued at
each node (previous iteration) and put the newly
generated messages in the temporary storage space.
With this design, all the nodes act synchronously
without mixing the newly arrived messages and the
old messages. The user can modify the factors for
the simulation by manipulating an interface class
that is implemented by the other classes. This design
requires that the simulator source code be
recompiled each time after the parameters of the
simulation have been changed (!).
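The two-step mechanism described above can be sketched as follows. This is a minimal illustration and not code from the FreeNet simulator: the class names `Msg` and `TwoStepSimulator`, the ring-neighbour forwarding rule and the use of the TTL as a hop budget are all assumptions made for the example. The point it shows is that messages generated during an iteration go to temporary storage only, so no node sees them before the next iteration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical message: only a destination node index and a hop budget. */
final class Msg {
    final int dest;
    final int ttl;

    Msg(int dest, int ttl) {
        this.dest = dest;
        this.ttl = ttl;
    }
}

/** Two-step, synchronous message handling (illustrative only). */
final class TwoStepSimulator {
    private final List<Deque<Msg>> queues = new ArrayList<>(); // one queue per node
    private final List<Msg> temporary = new ArrayList<>();     // messages created in the current iteration
    int processed = 0;

    TwoStepSimulator(int nodeCount) {
        for (int i = 0; i < nodeCount; i++) {
            queues.add(new ArrayDeque<>());
        }
    }

    void inject(Msg m) {
        temporary.add(m);
    }

    /** Runs one iteration; returns false when no message was processed. */
    boolean iterate() {
        // Step 1: move messages from temporary storage to the queue of their destination node.
        for (Msg m : temporary) {
            queues.get(m.dest).add(m);
        }
        temporary.clear();

        // Step 2: process the queued messages; new messages go to temporary storage only.
        boolean worked = false;
        for (int node = 0; node < queues.size(); node++) {
            Deque<Msg> queue = queues.get(node);
            while (!queue.isEmpty()) {
                Msg m = queue.poll();
                processed++;
                worked = true;
                if (m.ttl > 0) {
                    int next = (node + 1) % queues.size(); // assumed rule: forward to the ring neighbour
                    temporary.add(new Msg(next, m.ttl - 1));
                }
            }
        }
        return worked;
    }
}
```

Because step 2 never writes into a node queue directly, the order in which nodes are visited does not affect which messages a node sees in a given iteration.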
The user can adjust the maximum number of
nodes to simulate, the TTL of the messages, the type
of nodes to be simulated (the caching algorithm), the
probability of a request event for a file at each node
and the probability of faulty information insertion by
the node. Unfortunately, the handshaking between
nodes is not implemented in the current version of
the simulator.
In order to initialise a network, the simulator is
started with three connected nodes and new nodes
are added iteratively until the user-defined number of
nodes to simulate is reached. The simulator only allows
nodes to be added, i.e. it assumes a static network.
After the network is initialised with the desired number of
nodes, the desired number of files is
inserted into the network. The simulation then
starts by initiating the request events.
Output of the simulator, such as the number of
attempted and successful file insertions
and searches, is printed to the command line.
2.5 FreePastry
FreePastry (FreePastry) is an open-source
implementation of the Pastry protocol in Java that
can be used to emulate a Pastry network. The latest
release of FreePastry (January 28, 2003) includes the
implementation of the PAST (Rowstron, 2001)
archival storage system that is based on the Pastry
protocol and an implementation of the Scribe
(Castro, 2002) group communication infrastructure.
The FreePastry parameters, such
as the number of nodes to simulate and the number
of events to generate, are set by providing the
values on the command line when starting the
simulator. The results are displayed on the command
prompt screen as the messages are being processed.
Since Pastry routing uses proximity metrics, it is
necessary to represent proximity in the
simulation. Random, Euclidean and sphere network topologies are
currently supported in FreePastry. In the Euclidean
network topology the nodes are randomly placed in
a Euclidean plane and the proximity is based on the
Euclidean distance in the plane. In the
sphere network topology, the nodes are randomly
placed on a sphere, and the proximity is based on the
Euclidean distance on the sphere. However, the
network delay for the message passing is not
simulated, as the simulator is not designed to
simulate time.
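To make the notion of a proximity metric concrete, the following sketch computes distances for nodes placed in a plane and on a sphere. It is not FreePastry code: the class `Proximity` and its methods are hypothetical, and for the sphere the sketch assumes the great-circle distance on a unit sphere, which may differ from the metric FreePastry actually uses.

```java
import java.util.Random;

/** Illustrative proximity metrics; names and conventions are assumptions. */
final class Proximity {
    /** Euclidean distance between two points in the plane. */
    static double plane(double x1, double y1, double x2, double y2) {
        return Math.hypot(x2 - x1, y2 - y1);
    }

    /** Great-circle distance on a unit sphere; angles in radians (assumed interpretation). */
    static double sphere(double lat1, double lon1, double lat2, double lon2) {
        double cos = Math.sin(lat1) * Math.sin(lat2)
                + Math.cos(lat1) * Math.cos(lat2) * Math.cos(lon2 - lon1);
        return Math.acos(Math.max(-1.0, Math.min(1.0, cos))); // clamp against rounding error
    }

    /** Places n nodes at random in the unit square and returns the pairwise distance matrix. */
    static double[][] randomPlaneMatrix(int n, long seed) {
        Random rnd = new Random(seed);
        double[] x = new double[n];
        double[] y = new double[n];
        for (int i = 0; i < n; i++) {
            x[i] = rnd.nextDouble();
            y[i] = rnd.nextDouble();
        }
        double[][] d = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                d[i][j] = plane(x[i], y[i], x[j], y[j]);
            }
        }
        return d;
    }
}
```

A routing layer could consult such a matrix to prefer nearby nodes; without simulated time, however, the distances cannot be turned into message delays.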
2.6 Summary
Current p2p-network simulators are limited in their
use, difficult to customize and generally tend to
ignore the physical network and the user behaviour.
The simulators do not support the customisation of
the initial network state (connections between the
simulated computers and the network delay) and are
limited in the level of detail and the scalability of the
supported models. Furthermore, the simulators
mostly focus on caching algorithms and
ignore the fact that other activities can also affect
the efficiency of the system.
While the NeuroGrid simulator provides good
network visualization, it does not simulate user
events, network latency or the processor delay of
the nodes. Hence the simulation does not reflect the
real-world situation, especially given its serial
search functionality. Another issue in using
FreePastry and NeuroGrid is the serial fashion in
which they execute search events. Due to the
absence of a GUI in the FreeNet and FreePastry
simulators, the modification of parameters is
cumbersome. Some of the settings are made through
the command line and some have to be encoded in
the program.
The FreeNet simulator supports synchronous
actions of nodes but fails to provide support for
modeling network latency and user
behavior. In addition, having to recompile
after every parameter change is rather crude and limits its use
significantly.
Adapting the simulator to new or modified
protocols is a question of great practical importance.
NeuroGrid and FreeNet do not simulate the network
overlay, and hence it is hard to extend the simulation
to handle new protocols that need the network
proximity information, e.g. the Pastry protocol.
On the other hand, though FreePastry is focused on
the Pastry protocol, the simulator is more decoupled
and the code can be reused and extended to
implement a different protocol e.g. Gnutella.
However, the serialized event handling and the
absence of simulation time make it difficult to
simulate the network delay and processor delay.
ICEIS 2004 - SOFTWARE AGENTS AND INTERNET COMPUTING
Table 2: Evaluating existing P2P Simulators
                                 NeuroGrid        FreeNet          FreePastry
Event-processing                 Serial           Parallel         Serial
Usability                        Very easy        Medium           Hard
Extensibility                    Medium           High             Medium
Configurability (Easiness)       Mid-High (High)  Medium (Medium)  Low (Mid-Low)
Interoperability                 Medium           Medium           High
Level of Detail                  Medium           High             High
Build-ability                    Medium           High             Very High
Simulating user behavior         No               No               No
Simulating computer hardware     No               No               No
Simulating network overlay       No               No               Yes
Simulating time (Network delay)  No (No)          Yes (No)         No (No)
It is therefore our conclusion that none of the
existing simulators are suited for realistic
simulations of different p2p-networks and that the
development of a new simulator is therefore
justified.
3 TOWARDS AN OPEN
MULTILEVEL P2P SIMULATOR
Researchers who want to simulate a p2p system
tend to avoid the development of a complex
simulator and focus on some selected areas (such as
caching schemes). While some may choose to start
an implementation from scratch, an increasing
number of researchers build their simulators on top
of existing tools, e.g. the agent platform JADE
(Bellifemine, 1999), to speed up the development.
The general problem of having only special-purpose
simulators is that the results obtained with one
simulator are difficult to validate and often
impossible to achieve with another simulator due to
the many hard-coded assumptions of every
simulator.
This section presents an architecture and
implementation of an open p2p simulator, called
3LS (3-Level-Simulator), designed to overcome the
problems of existing simulators, namely
extensibility, usability and level of detail. Since the
development of a simulator is a complex and time-consuming
activity, we have not yet completed all parts of
the simulator, and its current version supports
only Gnutella-style protocols. The simulator is
currently used to help in the development of
Comtella (Vassileva, 2002), a p2p file-sharing
application for sharing research papers.
3.1 Design Goals
The criteria used to evaluate the simulators in
section two were used as a guideline for the design
of a more generic and open simulator that will allow
users to define models for the physical network, the
physical machines, p2p-network topology, p2p
protocol, p2p application and user behavior.
3.2 Architecture
Figure 1 shows a high-level view of the 3LS
simulator. 3LS is a time-stepped simulator that uses
a central step-clock. In 3LS the models for network,
p2p protocol and user model are clearly separated.
This separation allows the simulation of various
network topologies for different protocols,
applications and user models. To achieve this
separation three levels have been defined:
Network level (bottom),
Protocol level (middle) and
User level (top).
Upon starting the simulator it is possible either to
create the models for the three described levels (fig.
2) or to choose from a library the ones best suited
for the simulation run. As the simulation is running,
the events are displayed on the command prompt
screen. After the simulation has been completed, all
simulation data is saved into a file for future
analysis.
As Pawlikowski points out, either general-
purpose languages (such as FORTRAN, Pascal, C
and Java) or simulation languages (such as GPSS,
SIMAN or SLAM II) can be used for the
“translation of the model into a computer program”
(Pawlikowski, 2002). Though simulation languages
provide most of the features needed in programming
a simulation model and the details of the simulation
models can be easily changed, a general-purpose
language was selected to provide “greater
programming flexibility”.
Since Java is the preferred language of many p2p
programmers it was chosen as the host-language for
the 3LS simulator.
Visualization of the network is done with the aid
of the tool AiSee (AbsInt). AiSee was selected for
its ease of use, simple installation, availability (runs
under various OS), functionality and performance in
rendering. When screenshots of the p2p-network are
to be visualized, files containing the information of
the graph are created by 3LS using the Graph
Description Language (GDL).
Once the file is created a user can use AiSee to
render an image of the graph (see figure 3).
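As an illustration of what such a graph description might look like, the sketch below builds a GDL text for a small overlay. It is not the 3LS exporter: the class `GdlWriter` and its method are hypothetical, and only the basic GDL elements (`graph:`, `node:`, `edge:` with `title`, `sourcename` and `targetname`) are used, as we understand the format; the AiSee documentation should be consulted for the full attribute set.

```java
import java.util.List;

/** Builds a GDL description of an overlay graph (illustrative; not the 3LS exporter). */
final class GdlWriter {
    /** Each edge is a two-element array {source, target} of node names. */
    static String toGdl(String title, List<String> nodes, List<String[]> edges) {
        StringBuilder sb = new StringBuilder();
        sb.append("graph: {\n");
        sb.append("  title: \"").append(title).append("\"\n");
        for (String node : nodes) {
            sb.append("  node: { title: \"").append(node).append("\" }\n");
        }
        for (String[] edge : edges) {
            sb.append("  edge: { sourcename: \"").append(edge[0])
              .append("\" targetname: \"").append(edge[1]).append("\" }\n");
        }
        sb.append("}\n");
        return sb.toString();
    }
}
```

The returned string can be written to a file and opened in the viewer; node names containing double quotes would need escaping, which this sketch omits.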
3.3 Network Level
Using the GUI (fig.3) or predefined scenarios, the
network level creates a two-dimensional matrix
storing the distance values between the nodes. The
network level is responsible for modeling the user-
defined aspects of a physical network e.g. varying
network load due to increased P2P communication.
The network level also creates a user-defined
number of computer nodes, each with a user-defined
number of associated worker threads that take care
of message passing and of processing the
messages at the application node. By varying the
number and the priority of the worker threads, the
processor load and simulation speed of the tool can
be adjusted. Each node in the network level
represents a computer with a user-defined hardware
specification, which is used to simulate individual
machines in a more accurate manner. Interactions
between the network level and the protocol level are
made by referencing the application nodes in
the protocol level. Each node uses four queues for
handling message objects:
Outbox,
Inbox-For-Network-Delay,
Inbox-For-Processor-Delay and
Inbox.
Figure 1: Architecture of the 3LS
Figure 2: Screenshot of 3LS Network-Model
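The data structures named above, the two-dimensional distance matrix and the four queues per computer node, could be laid out as in the following sketch. The class and field names are hypothetical and not taken from the 3LS source; the sketch also assumes that distances are symmetric and that the processor delay is expressed in simulation steps.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Matrix of distance values between computer nodes (assumed symmetric). */
final class DistanceMatrix {
    private final int[][] d;

    DistanceMatrix(int nodeCount) {
        d = new int[nodeCount][nodeCount];
    }

    void set(int a, int b, int distance) {
        d[a][b] = distance;
        d[b][a] = distance;
    }

    int get(int a, int b) {
        return d[a][b];
    }
}

/** A computer node with the four message queues named in the text. */
final class ComputerNode {
    final int id;
    final int processorDelay; // steps needed to process one message (assumed unit)
    final Deque<Object> outbox = new ArrayDeque<>();
    final Deque<Object> inboxForNetworkDelay = new ArrayDeque<>();
    final Deque<Object> inboxForProcessorDelay = new ArrayDeque<>();
    final Deque<Object> inbox = new ArrayDeque<>();

    ComputerNode(int id, int processorDelay) {
        this.id = id;
        this.processorDelay = processorDelay;
    }

    /** Number of messages currently held in any of the four queues. */
    int pending() {
        return outbox.size() + inboxForNetworkDelay.size()
                + inboxForProcessorDelay.size() + inbox.size();
    }
}
```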
To illustrate how the nodes of the different
layers interact, a simple example of sending messages
between nodes of the protocol layer will be used.
Sending a message from one application node (X) to
another one (Y) is done as follows:
i. A message object is created, time-stamped
and placed by the application node
(protocol layer) in the outbox of the
computer node X (network layer). A
worker thread responsible for checking the
outboxes will detect the message and move
it from the outbox of node X into the inbox-
for-network-delay of node Y.
ii. After checking the outboxes of all nodes
the global simulation clock is incremented
and the worker threads start checking the
message objects in the inbox-for-network-
delay using the 2-D distance matrix and the
congestion network delay data. The goal is
to simulate the delay/latency of the network
by postponing the delivery of the message. Once
the assumed network delay has
elapsed, the message is stored in the
inbox-for-processor-delay, which serves as a
means to simulate the time a node needs for
processing the message.
iii. After the simulation clock is incremented,
the worker threads will look at the
processor delay of the node Y and check
whether the message object in the inbox-
for-processor-delay can be moved into the
inbox for the application node. In case the
processor delay has not been reached, the
message object remains in the inbox-for-
processor-delay.
iv. Once the delay has passed the message
object will be sent to the appropriate
application node. After the application node
has processed the received message object, it
will perform user-defined responses. If the
response results in the creation of new
messages, they are stored in the outbox and
the process starts again with step i.
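Steps i to iv can be condensed into the following single-threaded sketch. It is a simplification under stated assumptions rather than the 3LS implementation: the worker threads are replaced by plain loops, congestion data is omitted, the network and processor delays are both checked in the same clock tick, and the names `SimMessage`, `DelayPipeline` and the extra `arrivedAt` time-stamp are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

/** Hypothetical message: sender, receiver and two time-stamps (in simulation steps). */
final class SimMessage {
    final int from;
    final int to;
    final long sentAt; // set when the message is placed in the outbox
    long arrivedAt;    // set when the network delay has elapsed (assumed extra field)

    SimMessage(int from, int to, long sentAt) {
        this.from = from;
        this.to = to;
        this.sentAt = sentAt;
    }
}

/** Moves messages through outbox, network-delay inbox, processor-delay inbox and inbox. */
final class DelayPipeline {
    private final int[][] distance;     // network delay between nodes, in steps
    private final int[] processorDelay; // processing delay per node, in steps
    private final List<Deque<SimMessage>> outbox = new ArrayList<>();
    private final List<List<SimMessage>> networkInbox = new ArrayList<>();
    private final List<List<SimMessage>> processorInbox = new ArrayList<>();
    final List<Deque<SimMessage>> inbox = new ArrayList<>();
    long clock = 0;

    DelayPipeline(int[][] distance, int[] processorDelay) {
        this.distance = distance;
        this.processorDelay = processorDelay;
        for (int i = 0; i < processorDelay.length; i++) {
            outbox.add(new ArrayDeque<>());
            networkInbox.add(new ArrayList<>());
            processorInbox.add(new ArrayList<>());
            inbox.add(new ArrayDeque<>());
        }
    }

    void send(int from, int to) {
        outbox.get(from).add(new SimMessage(from, to, clock));
    }

    void step() {
        // Step i: move every outgoing message to the receiver's inbox-for-network-delay.
        for (Deque<SimMessage> box : outbox) {
            while (!box.isEmpty()) {
                SimMessage m = box.poll();
                networkInbox.get(m.to).add(m);
            }
        }
        clock++;
        for (int node = 0; node < processorDelay.length; node++) {
            // Step ii: release messages whose network delay has elapsed.
            for (Iterator<SimMessage> it = networkInbox.get(node).iterator(); it.hasNext();) {
                SimMessage m = it.next();
                if (clock - m.sentAt >= distance[m.from][m.to]) {
                    it.remove();
                    m.arrivedAt = clock;
                    processorInbox.get(node).add(m);
                }
            }
            // Steps iii and iv: release messages whose processor delay has elapsed.
            for (Iterator<SimMessage> it = processorInbox.get(node).iterator(); it.hasNext();) {
                SimMessage m = it.next();
                if (clock - m.arrivedAt >= processorDelay[node]) {
                    it.remove();
                    inbox.get(node).add(m);
                }
            }
        }
    }
}
```

With a distance of 3 between nodes 0 and 1 and a processor delay of 2 at node 1, a message sent at clock 0 reaches the inbox of node 1 at clock 5 in this sketch.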
The simulator follows a two-step mechanism: each
unit of user time takes two step-times in the
simulation. Figure 5 shows the actions performed at
step t. With this design, the network delay and
processor delay can be simulated, and message
arrival is simulated in a more realistic manner. In
addition, this design enables several tasks (or events)
to be carried out at the same time. At any moment in time
a user can request that the simulator generates a
visualization of the current network (fig. 3). The
information encoded in the network snapshot
includes the events (messages traversed) during the
step, the status of the computer nodes (variables
such as the memory) and the network connections at
the end of the step.
3.4 Protocol Level
A special class (peer interface class) is used to
provide an interface for the worker threads in the
network level and to enable the sending of messages
from a computer node to an application node. Any
protocol implementation has to implement this
interface class. To create application nodes, the user
needs to specify the IP address and port number of
the peers created. Upon being created, the
application node/peer can use the registration class
provided by the network level to register itself with a
port of the computer node using the IP address
provided. The implementation of the message object
is important in this simulator. The message object
contains the time-stamp, reference to the message
content object, the origin’s IP address and port
number, and the destination IP address and port
number.
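The elements of the protocol level described above could take roughly the following shape. This is a sketch under assumptions, not the 3LS API: the names `P2pMessage`, `PeerInterface` and `Registry`, the single `receive` method and the use of an "ip:port" string as lookup key are all hypothetical; only the list of message fields comes from the text.

```java
import java.util.HashMap;
import java.util.Map;

/** Message object carrying the fields listed in the text; names are illustrative. */
final class P2pMessage {
    final long timeStamp;
    final Object content; // reference to the message content object
    final String originIp;
    final int originPort;
    final String destinationIp;
    final int destinationPort;

    P2pMessage(long timeStamp, Object content, String originIp, int originPort,
               String destinationIp, int destinationPort) {
        this.timeStamp = timeStamp;
        this.content = content;
        this.originIp = originIp;
        this.originPort = originPort;
        this.destinationIp = destinationIp;
        this.destinationPort = destinationPort;
    }
}

/** Hypothetical peer interface that every protocol implementation would implement. */
interface PeerInterface {
    void receive(P2pMessage message);
}

/** Hypothetical registration class: binds application nodes to IP address and port. */
final class Registry {
    private final Map<String, PeerInterface> peers = new HashMap<>();

    /** Returns false if the endpoint is already taken. */
    boolean register(String ip, int port, PeerInterface peer) {
        return peers.putIfAbsent(ip + ":" + port, peer) == null;
    }

    /** Hands a message to the peer registered at its destination; false if there is none. */
    boolean deliver(P2pMessage m) {
        PeerInterface peer = peers.get(m.destinationIp + ":" + m.destinationPort);
        if (peer == null) {
            return false;
        }
        peer.receive(m);
        return true;
    }
}
```

Keeping the interface this narrow means that a new protocol only has to supply its own message contents and its reaction to `receive`; the network level needs no knowledge of the protocol.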
3.5 User Level
A user model contains the method signatures for the
decision-making needed from the protocol level and
is linked to a specific peer instance adding tasks into
the task-scheduler of its peer node. The user model
serves as a load-generator and state-based controller
of the peer nodes allowing for a more accurate
modeling of the behavior of peers.
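A user model acting as load generator and state-based controller might be sketched as follows. The interface and class names, the two methods and the online/offline rule are assumptions made for illustration; the paper does not specify the actual method signatures.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Random;

/** Hypothetical task scheduler owned by a peer node. */
final class TaskScheduler {
    final Deque<String> tasks = new ArrayDeque<>();

    void add(String task) {
        tasks.add(task);
    }
}

/** Hypothetical user model: decisions and load that the protocol level obtains from the user. */
interface UserModel {
    /** Called once per simulation step; may add tasks to the scheduler of the peer. */
    void onStep(long step, TaskScheduler scheduler);

    /** Decision hook: is the user willing to serve the requested file? */
    boolean acceptsUpload(String fileName);
}

/** Example user: online for part of each cycle, searching with a fixed probability. */
final class CasualUser implements UserModel {
    private final Random rnd;
    private final double searchProbability;

    CasualUser(long seed, double searchProbability) {
        this.rnd = new Random(seed);
        this.searchProbability = searchProbability;
    }

    @Override
    public void onStep(long step, TaskScheduler scheduler) {
        boolean online = step % 10 < 7; // assumed state rule: online 7 of every 10 steps
        if (online && rnd.nextDouble() < searchProbability) {
            scheduler.add("SEARCH@" + step);
        }
    }

    @Override
    public boolean acceptsUpload(String fileName) {
        return true; // a fully cooperative user; other models could refuse
    }
}
```

Swapping in a different `UserModel` implementation, for example one that refuses uploads, changes the load and cooperation pattern without touching the protocol code.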
4 EVALUATION
In its current implementation, the simulator consists
of 29 classes with approximately 3599 lines of code.
There are 10 classes (899 lines of code) for the
platform level, 12 classes (508 lines of code) for the
Gnutella protocol and user task scheduling, and 7
classes (2159 lines of code) for GUI interfaces
including the main method class. Using the Gnutella
0.4 protocol a series of performance and accuracy
tests were conducted to evaluate 3LS. We used
an AMD Athlon 800 MHz processor with 524 MB of
RAM running Microsoft Windows 2000 and a Sun
SunFire3800 with 4 UltraSparc III CPUs running at
750 MHz and 8 GB of RAM running Solaris 8.2.
For each simulation with n nodes, there is a total of
(n-1)+(n-1)*(n-2) events (messages sent
from one peer to another).
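The event count can be evaluated directly. Since (n-1)+(n-1)*(n-2) = (n-1)*(1+(n-2)) = (n-1)^2, the number of events grows quadratically with the number of nodes. The helper class below is hypothetical and only restates the formula from the text.

```java
/** Evaluates the event-count formula from the text (illustrative helper). */
final class EventCount {
    /** Number of events for a simulation with n nodes: (n-1) + (n-1)*(n-2). */
    static long events(long n) {
        return (n - 1) + (n - 1) * (n - 2);
    }
}
```

For example, a network of 20 nodes yields 19 + 19*18 = 361 events, and doubling the network to 40 nodes yields 39 + 39*38 = 1521 events.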
The tests revealed that the simulation of medium-sized
p2p-networks already consumes significant
CPU resources and memory. The memory
consumption increases with the number of events
and is due to the messages (texts
describing the events that occurred in the simulation) being kept in a
hashtable until the simulation is finished, at which point the
data is saved into a file. To be able to scale up to
higher numbers of simulated peers it is necessary to
distribute the simulation over multiple
processes/hosts.
5 FUTURE WORK
Future work focuses on collecting data for the
various layers e.g. human desktop usage and
network traffic. We are currently testing the
simulator by comparing its results for a Gnutella 0.4
network (Clip2) with the “real data” obtained from
running Gnutella 0.4 clients in a controlled network.
Using Comtella (Vassileva, 2002) clients we are
able to adjust the various parameters of the
simulation and verify the simulation results. Early
results in a small network (less than 20 nodes)
indicate that the simulator works as expected but
more testing is needed.
Figure 4: Example of network view using AiSee.
6 CONCLUSION
This paper provides an overview of existing P2P
protocols, examines the simulators and proposes a
generic P2P simulation model. The simulator
enables the simulation of P2P networks with
different network topologies, user models and
applications. We hope that by starting the
development of an open Java-based p2p-network
simulator, the community of p2p developers and
researchers will be able to develop models and loads
that enable the evaluation of current and future
protocols.
7 CODE
The complete code of the 3LS simulator is available
upon request by sending an email to one of the
authors. 3LS requires JDK 1.3.1 or higher.
REFERENCES
AbsInt. AiSee homepage. http://www.aisee.com/
Bellifemine, F., Poggi, A., Rimassa, G. JADE – A FIPA-compliant
agent framework. In Proceedings of PAAM'99, London, April 1999, 97-108.
Castro, M., Druschel, P., Kermarrec, A.-M., Rowstron, A.
SCRIBE: A large-scale and decentralised application-level
multicast infrastructure. IEEE Journal on
Selected Areas in Communications (JSAC) (Special
issue on Network Support for Multicast
Communications), to appear, 2002.
Clip2. The Gnutella Protocol Specification v0.4.
http://rfcgnutella.sourceforge.net/Development/GnutellaProtocol0_4-rev1_2.pdf
FreeNet. http://freenet.sourceforge.net.
FreePastry. The FreePastry homepage.
http://www.cs.rice.edu/CS/Systems/Pastry/FreePastry/
Joseph, S.R.H. Adaptive Routing in Distributed
Decentralized Systems: NeuroGrid, Gnutella, and
Freenet. Proceedings of workshop on Infrastructure for
Agents, MAS, and Scalable MAS, at Autonomous
Agents, Montreal, Canada, 2001.
Joseph, S.R.H. Project: NeuroGrid - P2P Bookmark
Organiser: Mailing Lists. NeuroGrid Simulation
Mailing Archive, Jan 2003.
http://sourceforge.net/mailarchive/forum.php?thread_id=1593250&forum_id=8271
NeuroGrid. The NeuroGrid homepage.
http://www.neurogrid.net/
Pawlikowski, K. Simulation Modeling and Analysis with
an emphasis on applications in performance evaluation
of telecommunication networks. Course 410,
University of Canterbury, August 2002.
http://www.cosc.canterbury.ac.nz/teaching/handouts/cosc410/02.410.notes1.pdf
Pfeifer, J. Freenet Caching Algorithms Under High Load.
http://www.cs.usask.ca/classes/498/t1/898/W7/P2/freenet.pdf, 2002.
Rowstron, A., Druschel, P. PAST: A large-scale,
persistent peer-to-peer storage utility. HotOS VIII,
Schloss Elmau, Germany, May 2001.
Sandberg, O. The FreeNet-dev mailing list, March 2001.
http://www.ultraviolet.org/mail-archives/freenet-chat.2001/0354.html
Serapis. The Serapis homepage, cvs. Sourceforge.net.
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/freenet/Serapis/
Vassileva, J. Motivating participation in Peer to Peer
Communities. Proceedings of the Workshop on Emergent
Societies in the Agent World, ESAW’02, Madrid, 16-17
September, 2002.