Peer-to-Peer Computing: DHTs and Overlay
Networks
Distributed Systems
Ahmed Ali-Eldin
One of the best places to read about systems
[Link]
What is P2P?
● It is (used to be) every ISP's nightmare
– In 2007, estimations between 50% to 70% of
Internet traffic is P2P [1,2,3].
– Look at the nice picture from 2002 [4].
What is P2P again?
●“A distributed network architecture may be called
a Peer-to-Peer (P-to-P, P2P) network, if the
participants share a part of their own hardware
resources (processing power, storage
capacity,network link capacity, printers).”
●“These shared resources are necessary to
provide the Service and content offered by the
network (e.g. file sharing or shared workspaces
for collaboration).
“They are accessible by other peers directly,
●
without passing intermediary entities. The
Why do they use P2P?
●Go beyond services offered by client-server
systems by having symmetry in roles where a
client may also be a server.
●Unlike Grid systems, P2P systems do not arise
from the collaboration between established and
connected groups of systems and without a more
reliable set of resources to share.
– Grid was an idea that computing should be
like the power grid
What is a P2P Overlay?
Distributed systems, without any hierarchical
●
organization or centralized control.
Peers form self-organizing overlay networks that
●
are overlayed on the Internet Protocol (IP)
networks.
– They are virtual networks
Today’s lecture
● Unstructured Overlays
– Napster
– Gnutella
– BitTorrent
● Structured Overlays
Unstructured overlays
History: Napster
Figure Courtesy of Sameh El-Ansary's and Seif Haridi's course at SICS/KTH
History: Basic operations in Napster
● Join
– Connect to the Napster name server
● Leave/Fail
– Server detects you are not there any more
– Data removed from directory
● Share (Publish/Insert)
– Hi server, I have these files I want to share
● Search
– Hi server, does anyone online have these
History: The end
History: End of Napster
Napster was brought down when Metallica and
●
Dr. Dre filed a lawsuit for copyright infringement.
● Its name was sold at a bankruptcy auction
– [Link]
Reflections on Napster
● Why could it be sued?
● Can we build something that is un-sueable?
Rise of the Unstructured overlays
History:Random (Unstructured)
Overlay Networks Distributed
Directory + Distributed Storage
12
10
Column 1
6
Column 2
Column 3
0
Row 1 Row 2 Row 3 Row 4
Figure Courtesy of Sameh El-Ansary's and Seif Haridi's course at SICS/KTH
History: Gnutella
●This is still a functional network with updates
rolled out as we speak
●Peers join network by connecting to one of the
(always there) peers
– Usedto be available from a link that is now a
porn web page
History: Gnutella messages
● Group Membership
– Ping: “Hi I am here, add me to the network”
● Broadcasted to neighbors, neighbors
forward the message
● Pong: “Cool, my IP-address is x.x.x.x, etc..
– From all peers who get the Ping
History: Gnutella messages
● Search
– Query: “Does any one have 'foo.mp3'?”
● Contains a Time-to-Live (TTL)
● Floods the network until TTL
– Queryresponse: “ I do, you can get the file
through...” or no response
● “Maybe it is just the TTL, let me increase it
and try again”
History: Gnutella messages
● File Transfer
– GET and PUSH: is the actual protocol used for
downloading the files
– GET when no peer is behind a firewall
– PUSH when the file owner is behind a firewall
(or similar)
Example for Gnutella (Historical)
Where is A?
A
History: Gnutella
a peer periodically PINGs its neighbors to
●
discover other participating peers.
– And that was problematic
History: Gnutella
a peer periodically PINGs its neighbors to
●
discover other participating peers.
– And that was problematic
● Flooding
– Too much traffic
– Impossible
to break the network without taking
down everyone (or at least the always on
peers)
– Ortargeting the people who originally wrote
the software
Design question
● How to reduce flooding?
Design question
● How to reduce flooding?
Bittorrent: Content distribution with a
twist
●Large number of peers want to download a
certain file as fast as possible.
● All previous techniques discussed do not do that
– Has to wait for a peer already downloading the
file
– Then another peer can start retrieving the file
from one of the two peers
Bittorent
● Philosophy:
– Use the bandwidth of everyone in the network
to the maximum
Bittorrent: Layman's example
● One file at Peer1 of size 10 GB on a 1 Mb/s link.
● 100 peers who want that file at the same time
● A few options to distribute the data
Layman's example: Option 1
● Send it to Peer2 in 21 hours (at least)
● Then send it to Peer3 in another 21 hours.
To distribute the file to all peers it will take 2100
●
hours (actually more), i.e. roughly 3 months.
Even with all peers acting like sources once they
●
get the file, you need 6*21 hours=126 hours,
almost 5.25 days. (How did I get this number?)
Layman's example: Option 2
● Divide the file in to 100 distinct chunks
●Send to each peer out of the 100 peers one
chunk of the file
– Try
to be fair with bandwidth allocated to each
peer
●Each peer now has a small part of the file after
the first 21 hours.
– But the whole file is in the network
– Peers can then fetch from each other the parts
they are missing
In reality
“Piece length maps to the number of bytes in
●
each piece the file is split into.”
●“For the purposes of transfer, files are split into
fixed-size pieces which are all the same length
except for possibly the last one which may be
truncated.”
●“piece length is almost always a power of two,
most commonly 2^18 = 256 K (BitTorrent prior to
version 3.2 uses 2 20 = 1 M as default).”
●
Layman's example: Option 2
● That means even faster downloads
– Smaller chunks will be sent in a few seconds.
– Then
each peer in the network will have
something to share.
– Making the optimal transfer length just a little
over 21 hours.
In reality
●You almost never have 100 peers asking for the
same file at the exact same instance
Some will start the download earlier than the
●
others
●The early downloaders will, with high probability,
eat up the upload bandwidth of the file source
since they establish the connection ahead
– Andeach uploader/downloader can set the
maximum number of simultaneous
connections
Bittorent in action
Courtesy of Sameh El-Ansary
Seeders
● Machines that have a complete copy of the file
They are usually selfish and do not want to wait
●
after they get the file
●At least the first seed has to stay to serve one
complete copy of the file
In general at all times all pieces of a file must be
●
around or the process fails
Leechers
● Those who do not have a full copy of the file.
●Once they have the full file, they become
seeders.
A swarm is the set of peers
that are participating in distributing the same file
Tracker (Membership server)
●A peer that keeps track of the seeds and the
leechers
●Connected to when you download a .torrent file,
which contains information about the file, its
length, name, and hashing information, and URL
of a tracker
● It is a simple protocol layered on top of HTTP
– Downloader sends information about the file it
is downloading and the port number.
– Tracker responds with a random list of
contact information about the peers that are
Torrent File
Courtesy of Sameh El-Ansary
Handshaking and exchanging
Bitfields
●Send handshake to all
peers received from the
tracker
● Exchange bitfields
Choking algorithm: One sided
relationship
Choking algorithm: Mutual love
Choked=false Choked=false
Interested=true Interested=true
A B
Tit for Tat
Chocking algorithm: Philanthropy
Choked=false Choked=false
Interested=false Interested=true
A B
A is typically a seeder in this case
Which piece to get first
● Rarest-first
– Choose the piece that is rarest among your
peers
– If possible
● Random-first piece
– As we already know, the first piece is crucial
for someone to be useful to the community
– So, rarest-first is not applied
– Getfirst piece as fast as possible, even if you
download subpieces from different peers
BitTorrent legal uses
[Link]
●
bittorrent-youd-be-surprised/
Structured Overlays
Premise
The ID of a data item determines the machine on
which it is going to be stored
Simplest Example: Consistent Hashing
14 12 2
[Link] [Link] [Link] [Link]
12 3 7 0
- 4 machines that need to share data
- Each machine will have an id based on the hash of its IP address
- Data will also have ids using the same function (SHA, MD5sum..)
Simple Example: Build a ring
● Original structure
Simple Example: update the ring
● Policy
– A doc with id y,
would be stored at
Succ(y)
● Lookup
– ask node n which
you know about to
find the successor of
id.
Handling Joins
●You need to know the
successor
●Your predecessor
needs to know about
you
New example: Handling Joins
● A node 9 joins the ring
Handling Joins: data redistribution
●Can you think what
happens on data
leaves?
Chord DHT
● Developed at MIT
● Very popular since it is very simple to understand
In addition to the successor pointer, every node
●
has a predecessor pointer as well
Chord Join
9 can join through any
●
other node, take 0 for
example.
9 will set its
●
predecessor to nil
●0 will help 9 to find its
successor
●0 will tell 9 that its
successor should be 12
Chord Join
12
10
Column 1
6
Column 2
Column 3
0
Row 1 Row 2 Row 3 Row 4
- 12 changes its predecessor to 9
- 7 will learn that 9 is it successor when it runs stablize
-12 will tell
Can their be collisions in the peer
names?
●Can the hash function produce two peers having
the same hash ID?
– H(a) = H(b), and a ≠ b?
– Yes?
– No?
Can their be collisions in the peer
names?
●Can the hash function produce two peers having
the same hash ID?
– No, if you use a good function
● Called Provably secure hash functions
– MD5 and SHA-I are not Provably secure hash
functions
– Still
we use non provably secure hash
functions and seldom use the provably secure
ones
● Can you think why?
Can their be collisions in the peer
names?
●Can the hash function produce two peers having
the same hash ID?
– No, if you use a good function
● Called Provably secure hash functions
– MD5 and SHA-I are not Provably secure hash
functions
– Still
we use non provably secure hash
functions and seldom use the provably secure
ones
● Can you think why?
Routing: Fingers table
Keep a routing table of
●
M peers at each node
– Where N=2M
– Maximum distance
to any other node is
then log2(N), i.e., M
hops
Routing: Fingers table
Chord: Routing
In fact, if nodes are uniformly distributed, the
●
maximum is log (# of nodes), i.e. log (8) hops
between any two nodes
● The average complexity is: 1⁄2 log(#nodes)
Chord: Handling failures
● If one successor fails, total ring collapses (Why?)
– Each node maintains a successor list of length
r
– Ifa node’s immediate successor does not
respond, it uses the second entry in its
successor list
● Update the successors' list
Chord Algorithm
Chord Algorithm
Chord Algorithm
Practicalities
●If we were storing files, replication can be costly,
also access time might be high, however, it might
be needed in certain applications like distributed
files systems
●For file sharing and other application it would be
more suitable to store addresses of files rather
than files
– Stale addresses could be a problem.
– One solution is to let them values expire and
the publisher refreshes them periodically.
Would not it be nice to have
something with locality?
●A huge issue with chord is its inability to take
locality in account
Pastry
•From Microsoft Research
•Aims to achieving
logarithmic diameter with •Was used later as a basis
a logarithmic node state for the Bamboo DHT
•Targets the issue of which inspired Amazon’s
locality: Dynamo, the backbone of
–2 machines in one country, the S3 and EC2
could communicate through a infrastructure
machine in another continent
because the hash of their ids
will be far apart in the id
space.
Pastry – Mapping items onto nodes
•An item in Pastry is
stored at the node that is
numerically closest to the
id of the item
•Naturally, such a node •Can we create an
will have the longest example together?
matching prefix.
•More here:
[Link]
g/
Why did research on P2P computing
almost die?
● Major drawbacks with P2P systems [7]
– Churn and performance guarantees
– The power of cloud computing
● Skype abandoned their P2P architecture
– Decreasing cost of content delivery
– Mostof the low-hanging fruits have been
harvested
● To actually push the field you need to put so
much effort
– You need to be a superb mathematician,
Security
We have mostly talked about robustness and
●
efficiency
● P2P adds a new dimension to security
– With newer attacks and newer challenges
● Example:
– Sybil attacks
– File Poisoning
– Rational Attacks
● Plus the good old ones like DdoS, man-in-the-
Security: Sybil attacks
●Adversary introduces a large number of peers
that he controls to the P2P system
– Subverting the reputation system
– Corrupting the network
– Tricking the users
– Controlling/influencing the network
“One can have, some claim, as many electronic
personas as one has time and energy to
create.”
Security: Sybil attacks
●“without a logically centralized authority, Sybil
attacks are always possible except under extreme
and unrealistic assumptions of resource parity and
coordination among entities.” [6]
Example: Amazon's Dynamo
Huge parts of Amazon's infrastructure depend on
●
Dynamo
●It is a system that integrates a wide range of
concepts you will learn in the distributed systems
course
– Consistent hashing
– Versioning
– Vector clocks
– Quorum
– And a few more
Where does P2P live today
● Internal within companies and datacenters
– Dynamo at Amazon
● Blockchains
The next big thing and the killer of P2P computing
research
Cloud Computing
Some extra extra extra readings :)
[1] [Link]
●
[2][Link]
●
[3] [Link]
●
[4] [Link]
●
[5] [Link]
●
[6] [Link]
●
[7] Li, Baochun, Yuan Feng, and Bo Li. "Rise and fall of the peer-to-peer empire." Tsinghua Science
●
and Technology 17.1 (2012): 1-
[Link]://[Link]/viewdoc/download;jsessionid=A744210E174E5775327F6B8F844D91
E8?doi=[Link].1725&rep=rep1&type=pdf
[8] [Link]
●