Enhancing Gnutella Scalability Through Heterogeneity
Enhancing Gnutella Scalability Through Heterogeneity
most content is typically replicated at a fair frac- Improvements to Gnutella: Gnutella uses TTL-
tion of participating sites, and limited flooding to distribute its queries. Lv et al.
[8] investigate several alternative query distribution
the node population is highly transient methods on a range of overlay topologies: random
graphs, power-law random graphs,5 grids and a mea-
The first condition is that one often doesn’t know the sured Gnutella topology. Based on their results, they
exact file identifier, or is looking for a set of files all propose the use of multiple parallel random walks in-
matching a given attribute (e.g., all by the same artist). stead of flooding. To search for an object using a ran-
The second is that one isn’t typically searching for ex- dom walk, a node chooses a neighbor at random and
tremely rare files that are only stored at a few nodes. send the query to it. Each neighbor in turn repeats this
This would apply to the music-sharing that dominates process until the object is found. Random walks avoid
today’s file-sharing systems. The third condition seems the problem of the exponentially increasing number of
to apply to currently popular P2P systems, although it query messages that arise in flooding.
may not apply to smaller community-based systems. Cohen et al. [3] study proactive replication algo-
File-sharing is one (widely deployed) example that rithms.6 They find that the optimal replication scheme
fits these criteria, distributed search engines [4] might for random search is to replicate objects proportional
well be another. Unstructured P2P systems may be a to the square-root of their query rate; this replication
suitable choice for these applications because of their scheme can be implemented in a distributed fashion by
overall simplicity and their ability to tolerate tran- replicating objects a number of times proportional to
sient user populations and comfortably support key- the length of the search.
word searches.4 This all depends, of course, on whether These combined approaches, proactive replication
or not such systems can be made scalable, and that is plus parallel random walking, was tested via simula-
the question we address in this short paper. tion in [8] and was found to significantly improve –
Our approach to improving the scalability of these by two orders of magnitude in some cases – the per-
systems is based on recent work which shows the preva- formance of flooding and passive replication, as mea-
lence of heterogeneity in deployed P2P systems and on sured by the query resolution time (in terms of number
work improving the efficiency of search in unstructured of overlay hops), the per-node query load and the mes-
networks. After reviewing this relevant background in sage traffic generated by individual queries. In the case
Section 2, we describe, in Section 3, a simple flow con- of power-law random graphs it was observed that high
trol algorithm that takes advantage of heterogeneity and degree nodes experienced correspondingly high query
evaluate its performance in Section 4. We offer this loads. For this reason, the authors conclude that P2P
algorithm not as a polished design but as a proof-of- network construction algorithms should avoid generat-
concept that heterogeneity can be used to improve the ing very high degree nodes.
scalability of unstructured P2P systems. Similarly, one Adamic et al.[1] approach the problem from a very
should take this position paper not as a “proof” that different viewpoint. They take power-law random
Gnutella can be made scalable, but as a conjecture in 5
Power-law random graphs are graphs where the degree distri-
search of more evidence.
bution is a power-law and the nodes are connected randomly con-
sistent with the degree distribution. A node’s “degree” is its number
4
We should add that unstructured P2P systems, whether scalable of neighbors.
6
or not, will never be able match the exact-matching performance of With proactive replication, an object may be replicated at a
highly structured P2P systems. We expect that these highly struc- node even though that node has not requested the object. Passive
tured P2P systems will eventually find many uses; we just aren’t replication, where nodes hold copies only if they’ve requested the
convinced that popular-music file-sharing will be one of them. object, is the standard approach in Gnutella-like P2P systems.
graphs as a given and instead ask how to best search nodes, and thus are more likely to find the desired files.
on them. They, too, suggest the use of random walks, Our design results in a quasi-hierarchical organization
but that these random walks should be biased to seek of nodes with high-capacity nodes at the higher levels
out high-degree nodes. They show how this leads to in the hierarchy. This “hierarchy” is not, however, ex-
superior scaling behavior in the limit of large systems. plicit; instead it is achieved through a distributed, adap-
However, their work does not take into account the tive and lightweight self-organization algorithm.
query load on individual nodes; as shown by [8] the
high-degree nodes carry an extremely large share of the Flow control and Topology Adaption: In what fol-
query traffic and are likely to be overloaded. lows, we represent the capacity of node by . For
the purpose of this paper, we assume that denotes the
Heterogeneity: Saroiu et al., in [11], report on the re-
maximum number of messages node is willing/able
sults of a measurement study of Napster and Gnutella.
to process over a given time interval . A node is
Their study reveals (amongst other things) a significant
connected, at the application-level, to a set of neighbor
amount of heterogeneity in bandwidth across peers.
nodes, denoted
. For each of
a node
The authors of [11] argue that architectures for P2P
systems should take into account the characteristics of maintains the following information:
participating hosts, including their heterogeneity. How-
: the number of incoming messages from
ever, since the measurement study was the main focus node to in the last time interval . Every
of [11], they did not propose any design strategies.
node reports its total incoming rate (
"!$#$%&
' *
These two lines of work, the observed heterogene- )( ) to all its neighbors.
ity and the improvements to Gnutella-like P2P systems, ,+.-0/ 1 : the number of outgoing messages from
lead us to the following observation. Using random- node to in the last time interval
walk searches on power-law random graphs is an effi- ,+.-0/325476 1 : the maximum number of messages
cient search technique, but results in a very skewed us-
node can send to node per time interval .
:9; +.-8/<25476 1 and
At all times, +.-8/ 1
age pattern with some nodes carrying too large a share
of the load. Measurement studies reveal, however, that +.-0/325476 1 =
;
9 +.-8/ 1
> @? A
1 . 8
node capabilities (particularly in terms of bandwidth,
but presumably in terms of other resources such as When a new node, say , joins the network, it is con-
CPU, memory and disk) are also skewed. If we could nected to a random set of neighbor nodes. For each
align these two skews – that is, have the highest capac- neighbor node , initializes:
ity nodes carry the heaviest burdens – we might end up ,+.-0/325476 1
*B$C7
with a scalable P2P design. ED
and
This is what we attempt to do in the next section. ,+.-0/ 1 FED
+
redirect to ; // i.e. disconnects from
T VU W XUEY[Z#\
Our simulations in Section 4 use the algorithm from
"D We stress that these results are not intended to provide a
and
]U J^
Figure 1 with parameter values of
"DD 9
,
. We defer an exploration of the pa-
, comprehensive picture of the behavior of our algorithm
but merely to serve as a first-order proof of concept.
rameter space to later work. While our initial simula- 10
With state-keeping, a search is given a unique identifier. A
tion results appear promising, we expect to incorporate
node then remembers the neighbors to which it has already for-
extensions and modifications to the above rules as we warded queries for a given identifier, and if a query with the same
continue to study the behavior of our algorithm. For identifier arrives back at the node, it is forwarded to a different
_ `
neighbor. Using state-keeping thus reduces the likelihood that a
9
In practice, and need not be fixed values. A node can quite random walk will traverse the same path twice. State can be dis-
easily communicate more precise increase and decrease parameters carded after a short time by a node, so it does not become burden-
to its neighboring nodes. some.
Chang of avg #hops: alpha = 2.0, avg replic ratio = 1% Cumulative Hop Distribution: alpha = 2.0, avg replic ratio = 1% Degree Distribution: alpha = 2.0, avg replic ratio = 1%
70 1000
10000-node graph 10000-node random graph
50 80
100
node degree
avg #hops
40 60
30
40
10
20
20
10
last 12000 queries
0 0 1
0 5 10 15 20 25 30 35 40 45 50 1 2 4 8 16 32 64 128 256 512 1 10 100 1000 10000
#queries performed (x 12000) #hops node rank
Figure 2: Degree distribution, average query resolution time and the distribution of query resolution times for a
10,000 node simulation using a Zipf-like capacity distribution
Simulation methodology: We use, as our main eval- Capacity level Percentage of nodes
uation metric, the average query resolution time (as x 20%
10x 45%
measured in terms of the number of application-level 100x 30%
hops required to find a particular object). We measure 1000x 4.9%
this in simulations in which the object popularity, the 10000x 0.1%
rate at which queries are issued for it, follows a Zipf-
like distribution where the popularity of the ’th most Table 1: Gnutella-like node capacity distributions
popular object is proportional to
lations we used
XUEY[Z
. In these simu-
(based on the Gnutella mea- in a uniform random graph topology; for the Zipf-like
surements in [12]). Individual nodes generate queries capacity distribution we use a 10,000 node graph with
using a Poisson process with an average query rate of average degree 9.9, for the Gnutella-like distribution
1.2 queries/minute. Recall, as described above, that we capacity distribution we use a a 5,000 node graph with
use a proportional replication strategy so that objects average degree 7.5. We then assign the capacities and
are replicated proportional to their query rate. The av- objects to the nodes as described above.
erage degree of replication is 1%11 – that is, on average
Results: Figures 2 and 3 plot the results of simu-
objects are replicated on 1% of the nodes, but the more
lations using, respectively, the Zipf and Gnutella-like
popular objects are replicated more than the less popu-
capacity distribution. The first plots show how, as
lar ones. Recall also that objects are assigned randomly
to nodes, proportional to their capacity .
the topology adjusts, the average query response time
D
rapidly decreases from to in the Zipf case and
Z
To investigate the effect of heterogeneous node ca-
50 80
node degree
avg #hops
40 60
10
30
40
20
20
10
last 10000 queries
0 0 1
0 5 10 15 20 25 30 35 40 45 50 1 2 4 8 16 32 64 128 256 512 1 2 level 3 level 4 level 5
#queries performed (x 12000) #hops node resource level
Figure 3: Degree distribution, average query resolution time and the distribution of query resolution times for a
5,000 node simulation using a Gnutella-like capacity distribution
[10] ROWSTRON , A., AND D RUSCHEL , P. Storage management and caching in PAST, a large-
scale, persistent peer-to-peer storage utility. In Proceedings of the Eighteenth SOSP (2001),
6 Discussion ACM.
We started this paper with the basic question of whether [11] S AROIU , S., G UMMADI , K., AND G RIBBLE , S. A measurement study of peer-to-peer file
sharing systems. In Proceedings of Multimedia Conferencing and Networking (San Jose, Jan.
one could make unstructured P2P systems scalable. 2002).
Building on the work in [8, 3, 1, 11] we proposed a de- [12] S RIPANIDKULCHAI , K. The popularity of gnutella queries and its implications on scalability.
In O’Reilly’s [Link] (Feb. 2001).
sign that appears to achieve significant scalability. This
[13] S TOICA , I., M ORRIS , R., K ARGER , D., K AASHOEK , M. F., AND BALAKRISHNAN , H.
design is extremely preliminary, and and our evaluation Chord: A scalable peer-to-peer lookup service for internet applications. In Proceedings of
SIGCOMM 2001 (Aug. 2001).
leaves many questions unanswered. We offer it how-
ever, merely as support for the conjecture that unstruc- [14] W ITTEN , I. H., M OFFAT, A., AND B ELL , T. C. Managing Gigabytes: Compressing and
Indexing Documents and Images, second ed. Morgan Kaufmann, 1999.
tured P2P systems can be significantly improved, per-
[15] Z HAO , B., K UBIATOWICZ , J., AND J OSEPH , A. Tapestry: An infrastructure for fault-tolerant
haps to the point where their scalability is no longer a wide-area location and routing. UCB Technical Report, 2001.
barrier.
Our design also raised the more philosophical ques-
tion of how to deal with heterogeneity. Most of the
highly structured P2P designs start with the assumption
of a homogeneous node population and then alter their
designs to accommodate heterogeneity. The design we
present here actively exploits heterogeneity. One can
ask whether the highly structured designs could also be
modified to exploit heterogeneity.