Network Layer
Chapter 4: outline
4.1 introduction
4.2 virtual circuit and datagram networks
4.3 what’s inside a router
4.4 IP: Internet Protocol
  datagram format
  IPv4 addressing
  ICMP
  IPv6
4.5 routing algorithms
  link state
  distance vector
  hierarchical routing
4.6 routing in the Internet
  RIP
  OSPF
  BGP
4.7 broadcast and multicast routing
Network layer
transport segment from sending to receiving host
network layer protocols in every host, router
[Figure: end hosts run the full protocol stack (application, transport, network, data link, physical); routers along the path run only network, data link, and physical layers]
Two key network-layer functions
forwarding: move packets from router’s input to appropriate router output
routing: determine route taken by packets from source to dest.
  routing algorithms
analogy:
  routing: process of planning trip from source to dest
  forwarding: process of getting through single interchange
[Figure: routing algorithm fills the local forwarding table; value in arriving packet’s header (0111) selects one of output links 1, 2, 3]
Connection setup
3rd important function in some network
architectures:
ATM, frame relay, X.25
before datagrams flow, two end hosts and
intervening routers establish virtual connection
routers get involved
network vs transport layer connection service:
network: between two hosts (may also involve intervening
routers in case of VCs)
transport: between two processes
Network layer service models:
table columns: network architecture | service model | guarantees? (bandwidth, loss, order, timing) | congestion feedback
Connection, connection-less service
datagram network provides network-layer
connectionless service
virtual-circuit network provides network-layer
connection service
analogous to TCP/UDP connection-oriented / connectionless transport-layer services, but:
service: host-to-host
no choice: network provides one or the other
implementation: in network core
Virtual circuits
“source-to-dest path behaves much like telephone
circuit”
performance-wise
network actions along source-to-dest path
call setup, teardown for each call before data can flow
each packet carries VC identifier (not destination host
address)
every router on source-dest path maintains “state” for
each passing connection
link, router resources (bandwidth, buffers) may be
allocated to VC (dedicated resources = predictable
service)
VC implementation
a VC consists of:
1. path from source to destination
2. VC numbers, one number for each link along path
3. entries in forwarding tables in routers along path
packet belonging to VC carries VC number
(rather than dest address)
VC number can be changed on each link.
new VC number comes from forwarding table
VC forwarding table
[Figure: a VC crossing three links with VC numbers 12, 22, 32; router interface numbers 1, 2, 3]
forwarding table in northwest router:
Incoming interface | Incoming VC # | Outgoing interface | Outgoing VC #
1 | 12 | 3 | 22
2 | 63 | 1 | 18
3 | 7 | 2 | 17
1 | 97 | 3 | 87
… | … | … | …
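The table lookup can be sketched in a few lines. This is a minimal illustration, not how any particular router implements it: the names `VC_TABLE` and `forward_vc` are hypothetical, and the entries are the four rows of the northwest router's table.

```python
# Hypothetical sketch of a VC forwarding table (names are illustrative).
# Key: (incoming interface, incoming VC #). Value: (outgoing interface, outgoing VC #).
VC_TABLE = {
    (1, 12): (3, 22),
    (2, 63): (1, 18),
    (3, 7): (2, 17),
    (1, 97): (3, 87),
}

def forward_vc(in_interface, in_vc):
    """Return (out_interface, out_vc); the packet leaves carrying the new VC number."""
    try:
        return VC_TABLE[(in_interface, in_vc)]
    except KeyError:
        # No per-connection state was installed at call setup for this VC.
        raise LookupError(f"no VC state for interface {in_interface}, VC {in_vc}")

out_interface, out_vc = forward_vc(1, 12)
print(out_interface, out_vc)  # 3 22
```

The lookup key is the (interface, VC number) pair rather than a destination address, which is why each router on the path must hold state for every connection passing through it.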
Virtual circuits: signaling protocols
used to set up, maintain, tear down VC
used in ATM, frame-relay, X.25
not used in today’s Internet
[Figure: call setup between two hosts, each with a full stack (application, transport, network, data link, physical): 1. initiate call, 2. incoming call, 3. accept call, 4. call connected, 5. data flow begins, 6. receive data]
Datagram networks
no call setup at network layer
routers: no state about end-to-end connections
no network-level concept of “connection”
packets forwarded using destination host address
[Figure: two hosts, each with a full stack (application, transport, network, data link, physical): 1. send datagrams, 2. receive datagrams]
Datagram forwarding table
4 billion IP addresses, so rather than list individual destination address, list range of addresses (aggregate table entries)
[Figure: routing algorithm fills the local forwarding table; IP destination address in arriving packet’s header selects one of output links 1, 2, 3]
local forwarding table:
dest address | output link
address-range 1 | 3
address-range 2 | 2
address-range 3 | 2
address-range 4 | 1
otherwise | 3
Longest prefix matching
when looking for forwarding table entry for given destination address, use longest address prefix that matches destination address.
destination address range | link interface
… | …
otherwise | 3
examples:
DA: 11001000 00010111 00010110 10100001 which interface?
DA: 11001000 00010111 00011000 10101010 which interface?
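A short sketch can make the matching rule concrete. The prefix table did not survive in these notes, so the three prefixes below are an assumption, chosen to fit the two example addresses; only the "otherwise 3" default comes from the slide. The names `TABLE` and `lookup` are illustrative.

```python
# Assumed prefix table (illustrative; the original table is not preserved here).
TABLE = [
    ("11001000 00010111 00010", 0),      # 21-bit prefix
    ("11001000 00010111 00011000", 1),   # 24-bit prefix
    ("11001000 00010111 00011", 2),      # 21-bit prefix
]
DEFAULT_INTERFACE = 3  # the "otherwise" row

def lookup(dest_bits):
    """Return the interface of the longest prefix matching a 32-bit address string."""
    bits = dest_bits.replace(" ", "")
    best_len, best_interface = -1, DEFAULT_INTERFACE
    for prefix, interface in TABLE:
        prefix = prefix.replace(" ", "")
        if bits.startswith(prefix) and len(prefix) > best_len:
            best_len, best_interface = len(prefix), interface
    return best_interface

print(lookup("11001000 00010111 00010110 10100001"))  # 0
print(lookup("11001000 00010111 00011000 10101010"))  # 1
```

Under this assumed table the second address matches both the 24-bit and a 21-bit prefix, and the longer one wins. Real routers use specialized structures (tries, TCAMs) rather than a linear scan.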
[Figure: router architecture, showing the forwarding data plane (hardware) and the high-speed switching fabric]
Input port functions
[Figure: input port stages: line termination, link layer protocol (receive), lookup/forwarding/queueing, then switch fabric]
physical layer: bit-level reception
data link layer: e.g., Ethernet (see chapter 5)
decentralized switching:
  given datagram dest., lookup output port using forwarding table in input port memory (“match plus action”)
  goal: complete input port processing at ‘line speed’
  queuing: if datagrams arrive faster than forwarding rate into switch fabric
Switching fabrics
transfer packet from input buffer to appropriate
output buffer
switching rate: rate at which packets can be transferred from inputs to outputs
  often measured as multiple of input/output line rate
  N inputs: switching rate N times line rate desirable
three types of switching fabrics: memory, bus, crossbar
Switching via memory
first generation routers:
traditional computers with switching under direct control
of CPU
packet copied to system’s memory
speed limited by memory bandwidth (2 bus crossings per
datagram)
[Figure: input port (e.g., Ethernet) and output port (e.g., Ethernet) connected through memory over the system bus]
Switching via interconnection network
overcome bus bandwidth limitations
banyan networks, crossbar, other interconnection nets initially developed to connect processors in multiprocessor
advanced design: fragmenting datagram into fixed length cells, switch cells through the fabric.
Cisco 12000: switches 60 Gbps through the interconnection network
[Figure: crossbar]
Output ports
[Figure: output port stages: switch fabric, datagram buffer (queueing), link layer protocol (send), line termination]
Output port queueing
[Figure: switch fabric feeding output port queues]
Input port queuing
fabric slower than input ports combined -> queueing may
occur at input queues
queueing delay and loss due to input buffer overflow!
Head-of-the-Line (HOL) blocking: queued datagram at front
of queue prevents others in queue from moving forward
The Internet network layer
host, router network layer functions:
[Figure: network layer components, shown in the protocol stack above the link and physical layers]
IP datagram format
[Figure: IPv4 datagram header, laid out in rows of 32 bits]
ver: IP protocol version number
head. len: header length (bytes)
type of service: “type” of data
length: total datagram length (bytes)
16-bit identifier, flgs, fragment offset: for fragmentation/reassembly
time to live: max number remaining hops (decremented at each router)
upper layer
header checksum
32 bit source IP address
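As a rough illustration of the header fields listed above, the sketch below unpacks the fixed 20-byte IPv4 header with Python's `struct` module. The function name `parse_ipv4_header` and the sample values (addresses, identifier 777) are made up for the example; options and checksum verification are left out.

```python
import socket
import struct

def parse_ipv4_header(raw):
    """Unpack the fixed 20-byte IPv4 header (no options) into a dict of fields."""
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "version": ver_ihl >> 4,
        "header_len_bytes": (ver_ihl & 0x0F) * 4,   # field counts 32-bit words
        "type_of_service": tos,
        "total_length": total_len,
        "identifier": ident,
        "flags": flags_frag >> 13,                  # top 3 bits
        "fragment_offset": flags_frag & 0x1FFF,     # low 13 bits, in 8-byte units
        "time_to_live": ttl,
        "upper_layer_protocol": proto,
        "header_checksum": checksum,
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
    }

# Made-up sample header: version 4, 5-word header, TTL 64, protocol 6 (TCP).
# The checksum is left as 0 here, i.e. not computed.
sample = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 4000, 777, 0, 64, 6, 0,
                     socket.inet_aton("223.1.1.1"), socket.inet_aton("223.1.2.2"))
header = parse_ipv4_header(sample)
print(header["version"], header["header_len_bytes"], header["time_to_live"])  # 4 20 64
```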
IP fragmentation, reassembly
network links have MTU (max. transfer size) - largest possible link-level frame
  different link types, different MTUs
large IP datagram divided (“fragmented”) within net
  one datagram becomes several datagrams
  “reassembled” only at final destination
  IP header bits used to identify, order related fragments
[Figure: fragmentation: in: one large datagram, out: 3 smaller datagrams; reassembly at the destination]
IP fragmentation, reassembly
example:
  4000 byte datagram
  MTU = 1500 bytes
one large datagram becomes several smaller datagrams
original datagram: length=4000, ID=x, fragflag=0, offset=0
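The arithmetic behind this example can be sketched as follows. The function `fragment` is a hypothetical helper, and it assumes a 20-byte header with no options; offsets are expressed in 8-byte units, as in the IPv4 header.

```python
def fragment(total_length, mtu, header_len=20):
    """Split a datagram into (length, fragflag, offset) triples.

    Assumes a 20-byte header without options. offset counts 8-byte units,
    so every fragment except the last carries a payload that is a multiple of 8.
    """
    payload = total_length - header_len
    max_chunk = (mtu - header_len) // 8 * 8   # largest payload per fragment
    fragments, sent = [], 0
    while sent < payload:
        chunk = min(max_chunk, payload - sent)
        more = 1 if sent + chunk < payload else 0   # fragflag: more fragments follow
        fragments.append((chunk + header_len, more, sent // 8))
        sent += chunk
    return fragments

for length, fragflag, offset in fragment(4000, 1500):
    print(f"length={length} fragflag={fragflag} offset={offset}")
# length=1500 fragflag=1 offset=0
# length=1500 fragflag=1 offset=185
# length=1040 fragflag=0 offset=370
```

The 3980 bytes of payload are carried as 1480 + 1480 + 1020, so the second fragment starts at byte 1480, which is offset 185 in 8-byte units.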
IP addressing: introduction
IP address: 32-bit identifier for host, router interface
interface: connection between host/router and physical link
  routers typically have multiple interfaces
  host typically has one or two interfaces (e.g., wired Ethernet, wireless 802.11)
IP addresses associated with each interface
223.1.1.1 = 11011111 00000001 00000001 00000001
[Figure: hosts and a router, each interface labeled with its IP address]
IP addressing: introduction
Q: how are interfaces actually connected?
A: we’ll learn about that in chapter 5, 6.
[Figure: hosts and a router, each interface labeled with its IP address]
Subnets
IP address:
  subnet part - high order bits
  host part - low order bits
what’s a subnet ?
[Figure: hosts and a router, each interface labeled with its IP address]
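Subnets
recipe: to determine the subnets, detach each interface from its host or router, creating islands of isolated networks; each isolated network is called a subnet
[Figure: network split into subnets, each labeled with a /24 prefix]
Subnets
[Figure: larger network of interconnected routers and hosts, each interface labeled with its IP address]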
IP addressing: CIDR
CIDR: Classless InterDomain Routing
subnet portion of address of arbitrary length
address format: a.b.c.d/x, where x is # bits in
subnet portion of address
subnet part (first 23 bits) | host part (last 9 bits)
11001000 00010111 0001000 | 0 00000000
200.23.16.0/23
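The bit pattern above corresponds to 200.23.16.0 with a 23-bit subnet part. A quick way to explore what that prefix covers is Python's standard `ipaddress` module; the two test addresses below are arbitrary examples.

```python
import ipaddress

net = ipaddress.ip_network("200.23.16.0/23")

print(net.netmask)        # 255.255.254.0  (23 one-bits followed by 9 zero-bits)
print(net.num_addresses)  # 512            (2 ** 9 host-part combinations)

# Membership is decided by comparing only the first 23 bits.
print(ipaddress.ip_address("200.23.17.42") in net)  # True
print(ipaddress.ip_address("200.23.18.1") in net)   # False
```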
DHCP: Dynamic Host Configuration Protocol
goal: allow host to dynamically obtain its IP address from network
server when it joins network
can renew its lease on address in use
allows reuse of addresses (only hold address while
connected/“on”)
support for mobile users who want to join network (more
shortly)
DHCP overview:
host broadcasts “DHCP discover” msg [optional]
DHCP server responds with “DHCP offer” msg [optional]
host requests IP address: “DHCP request” msg
DHCP server sends address: “DHCP ack” msg
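The four-message exchange can be mimicked with a toy server. This is only a sketch under simplifying assumptions: `ToyDhcpServer` is a hypothetical class, there is no real networking or lease expiry, the address pool uses the 192.0.2.0/24 documentation range, and the "DHCP nak" reply is added for completeness (it is not mentioned in the overview above).

```python
class ToyDhcpServer:
    """Toy model of discover/offer/request/ack; not a real DHCP implementation."""

    def __init__(self, pool, lease_secs=3600):
        self.free = list(pool)      # addresses not yet offered
        self.offered = set()        # offered but not yet confirmed
        self.leases = {}            # client MAC -> leased address
        self.lease_secs = lease_secs

    def discover(self, xid):
        """Answer a 'DHCP discover' with a 'DHCP offer' carrying a candidate address."""
        addr = self.free.pop(0)
        self.offered.add(addr)
        return {"msg": "DHCP offer", "xid": xid, "yiaddr": addr,
                "lifetime": self.lease_secs}

    def request(self, xid, mac, yiaddr):
        """Answer a 'DHCP request' with a 'DHCP ack' if the address was offered."""
        if yiaddr not in self.offered:
            return {"msg": "DHCP nak", "xid": xid}
        self.offered.remove(yiaddr)
        self.leases[mac] = yiaddr
        return {"msg": "DHCP ack", "xid": xid, "yiaddr": yiaddr,
                "lifetime": self.lease_secs}

server = ToyDhcpServer(["192.0.2.10", "192.0.2.11"])
offer = server.discover(xid=654)
ack = server.request(xid=655, mac="00:16:d3:23:68:8a", yiaddr=offer["yiaddr"])
print(offer["msg"], ack["msg"], ack["yiaddr"])  # DHCP offer DHCP ack 192.0.2.10
```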
[Figure: network of three /24 subnets joined by a router, with a DHCP server attached to one of them]
DHCP client-server scenario
DHCP server: [Link]; arriving client
DHCP discover
  src: [Link], 68
  dest: [Link], 67
  yiaddr: [Link]
  transaction ID: 654
DHCP offer
  src: [Link], 67
  dest: [Link], 68
  yiaddr: [Link]
  transaction ID: 654
  lifetime: 3600 secs
DHCP request
  src: [Link], 68
  dest: [Link], 67
  yiaddr: [Link]
  transaction ID: 655
  lifetime: 3600 secs
DHCP ACK
  src: [Link], 67
  dest: [Link], 68
  yiaddr: [Link]
  transaction ID: 655
  lifetime: 3600 secs
DHCP: example
connecting laptop needs its IP address, addr of first-hop router, addr of DNS server: use DHCP
DHCP request encapsulated in UDP, encapsulated in IP, encapsulated in 802.3 Ethernet
Ethernet frame broadcast (dest: FFFFFFFFFFFF) on LAN, received at router running DHCP server
Ethernet demuxed to IP demuxed, UDP demuxed to DHCP
[Figure: laptop and router with DHCP server built into router; the DHCP message passes down and up the stack DHCP, UDP, IP, Eth, Phy]
DHCP: example
DHCP server formulates DHCP ACK containing client’s IP address, IP address of first-hop router for client, name & IP address of DNS server
encapsulation of DHCP server, frame forwarded to client, demuxing up to DHCP at client
client now knows its IP address, name and IP address of DNS server, IP address of its first-hop router
[Figure: router with DHCP server built into router replies to the laptop; the DHCP message passes down and up the stack DHCP, UDP, IP, Eth, Phy]
DHCP: Wireshark output (home LAN)

request
Message type: Boot Request (1)
Hardware type: Ethernet
Hardware address length: 6
Hops: 0
Transaction ID: 0x6b3a11b7
Seconds elapsed: 0
Bootp flags: 0x0000 (Unicast)
Client IP address: [Link] ([Link])
Your (client) IP address: [Link] ([Link])
Next server IP address: [Link] ([Link])
Relay agent IP address: [Link] ([Link])
Client MAC address: Wistron_23:68:8a (00:16:d3:23:68:8a)
Server host name not given
Boot file name not given
Magic cookie: (OK)
Option: (t=53,l=1) DHCP Message Type = DHCP Request
Option: (61) Client identifier
  Length: 7; Value: 010016D323688A;
  Hardware type: Ethernet
  Client MAC address: Wistron_23:68:8a (00:16:d3:23:68:8a)
Option: (t=50,l=4) Requested IP Address = [Link]
Option: (t=12,l=5) Host Name = "nomad"
Option: (55) Parameter Request List
  Length: 11; Value: 010F03062C2E2F1F21F92B
  1 = Subnet Mask; 15 = Domain Name
  3 = Router; 6 = Domain Name Server
  44 = NetBIOS over TCP/IP Name Server
  ……

reply
Message type: Boot Reply (2)
Hardware type: Ethernet
Hardware address length: 6
Hops: 0
Transaction ID: 0x6b3a11b7
Seconds elapsed: 0
Bootp flags: 0x0000 (Unicast)
Client IP address: [Link] ([Link])
Your (client) IP address: [Link] ([Link])
Next server IP address: [Link] ([Link])
Relay agent IP address: [Link] ([Link])
Client MAC address: Wistron_23:68:8a (00:16:d3:23:68:8a)
Server host name not given
Boot file name not given
Magic cookie: (OK)
Option: (t=53,l=1) DHCP Message Type = DHCP ACK
Option: (t=54,l=4) Server Identifier = [Link]
Option: (t=1,l=4) Subnet Mask = [Link]
Option: (t=3,l=4) Router = [Link]
Option: (6) Domain Name Server
  Length: 12; Value: 445747E2445749F244574092;
  IP Address: 68.87.71.226;
  IP Address: 68.87.73.242;
  IP Address: 68.87.64.146
Option: (t=15,l=20) Domain Name = "[Link]."
Hierarchical addressing: route aggregation
hierarchical addressing allows efficient advertisement of routing information:
[Figure: Organization 0, Organization 1, Organization 2, ..., Organization 7 (each with a /23 block) connect to Fly-By-Night-ISP, which tells the Internet: “Send me anything with addresses beginning [Link]/20”. ISPs-R-Us tells the Internet: “Send me anything with addresses beginning [Link]/16”.]
[Figure: same network after Organization 1 ([Link]/23) moves to ISPs-R-Us. Fly-By-Night-ISP still advertises: “Send me anything with addresses beginning [Link]/20”. ISPs-R-Us now advertises: “Send me anything with addresses beginning [Link]/16 or [Link]/23”.]
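The aggregation idea can be tried out with the standard `ipaddress` module. The concrete blocks below are assumptions for illustration (the addresses in the figures are not preserved in these notes): eight consecutive /23 blocks starting at 200.23.16.0, and 199.31.0.0/16 for the second ISP. `next_isp` is a hypothetical helper.

```python
import ipaddress

# Assumed blocks: eight consecutive /23 networks, one per organization.
orgs = [ipaddress.ip_network(f"200.23.{16 + 2 * i}.0/23") for i in range(8)]

# Eight adjacent /23 blocks collapse into a single /20 advertisement.
aggregate = list(ipaddress.collapse_addresses(orgs))
print(aggregate)  # [IPv4Network('200.23.16.0/20')]

# After Organization 1 moves, the second ISP also advertises the more specific /23.
routes = {
    ipaddress.ip_network("200.23.16.0/20"): "Fly-By-Night-ISP",
    ipaddress.ip_network("199.31.0.0/16"): "ISPs-R-Us",
    ipaddress.ip_network("200.23.18.0/23"): "ISPs-R-Us",
}

def next_isp(dest, routes):
    """Pick the advertisement with the longest matching prefix (assumes one exists)."""
    addr = ipaddress.ip_address(dest)
    matches = [network for network in routes if addr in network]
    return routes[max(matches, key=lambda network: network.prefixlen)]

print(next_isp("200.23.18.5", routes))  # ISPs-R-Us
print(next_isp("200.23.20.5", routes))  # Fly-By-Night-ISP
```

The /20 advertisement can stay unchanged after the move because longest prefix matching steers Organization 1's traffic to the more specific /23 route.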
IP addressing: the last word...
NAT: network address translation
motivation: local network uses just one IP address as far
as outside world is concerned:
range of addresses not needed from ISP: just one
IP address for all devices
can change addresses of devices in local network
without notifying outside world
can change ISP without changing addresses of
devices in local network
devices inside local net not explicitly addressable,
visible by outside world (a security plus)
NAT: network address translation
1: host [Link] sends datagram to [Link], 80
2: NAT router changes datagram source addr from [Link], 3345 to [Link], 5001, updates table
3: reply arrives dest. address: [Link], 5001
4: NAT router changes datagram dest addr from [Link], 5001 to [Link], 3345
NAT translation table
WAN side addr | LAN side addr
[Link], 5001 | [Link], 3345
…… | ……
[Figure: datagram source (S) and destination (D) fields at each step: 1. S: [Link], 3345, D: [Link], 80; 2. S: [Link], 5001, D: [Link], 80; 3. S: [Link], 80, D: [Link], 5001; 4. S: [Link], 80, D: [Link], 3345]
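The four steps can be sketched with a small translation table. The class `Nat` and its methods are hypothetical, and the addresses are placeholders (a private 10.0.0.1 host plus documentation-range addresses for the WAN side and the server); only the port numbers 3345, 5001, and 80 come from the steps above.

```python
import itertools

class Nat:
    """Toy NAT: maps (LAN addr, LAN port) to a port on one WAN-side address."""

    def __init__(self, wan_addr, first_port=5001):
        self.wan_addr = wan_addr
        self.ports = itertools.count(first_port)   # next unused WAN-side port
        self.out = {}    # (lan_addr, lan_port) -> wan_port
        self.back = {}   # wan_port -> (lan_addr, lan_port)

    def outbound(self, src, dst):
        """Step 2: rewrite the source of an outgoing datagram, updating the table."""
        if src not in self.out:
            port = next(self.ports)
            self.out[src] = port
            self.back[port] = src
        return (self.wan_addr, self.out[src]), dst

    def inbound(self, src, dst):
        """Step 4: rewrite the destination of a reply; None if no table entry exists."""
        lan = self.back.get(dst[1])
        return None if lan is None else (src, lan)

nat = Nat("203.0.113.7")
server = ("198.51.100.80", 80)
sent_src, sent_dst = nat.outbound(("10.0.0.1", 3345), server)
print(sent_src)                              # ('203.0.113.7', 5001)
print(nat.inbound(server, sent_src))         # (('198.51.100.80', 80), ('10.0.0.1', 3345))
print(nat.inbound(server, ("203.0.113.7", 6000)))  # None: unsolicited, no mapping
```

The last call hints at the traversal problem: a datagram arriving for a port with no table entry has nowhere to go.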
NAT traversal problem
client wants to connect to server with address [Link]
  server address [Link] local to LAN (client can’t use it as destination addr)
  only one externally visible NATed address: [Link]
solution1: statically configure NAT to forward incoming connection requests at given port to server
  e.g., ([Link], port 2500) always forwarded to [Link] port 25000
[Figure: external client, NAT router with WAN-side address [Link], and server [Link] inside the LAN]
NAT traversal problem
solution 3: relaying (used in Skype)
  NATed client establishes connection to relay
  external client connects to relay
  relay bridges packets between two connections
[Figure: 1. connection to relay initiated by NATed host, 2. connection to relay initiated by client, 3. relaying established; NATed host [Link] sits behind the NAT router]
ICMP: internet control message protocol
[Figure: three groups of 3 probes]
IPv6: motivation
initial motivation: 32-bit address space soon to be
completely allocated.
additional motivation:
header format helps speed processing/forwarding
header changes to facilitate QoS
[Figure: IPv6 datagram format, drawn 32 bits wide, header followed by data]
Other changes from IPv4
checksum: removed entirely to reduce processing
time at each hop
options: allowed, but outside of header, indicated
by “Next Header” field
ICMPv6: new version of ICMP
additional message types, e.g. “Packet Too Big”
multicast group management functions
[Figure: an IPv6 datagram and an IPv4 datagram]
Tunneling
logical view: A (IPv6), B (IPv6), IPv4 tunnel connecting IPv6 routers, E (IPv6), F (IPv6)
physical view: A (IPv6), B (IPv6), C (IPv4), D (IPv4), E (IPv6), F (IPv6)
A-to-B: IPv6
B-to-C: IPv6 inside IPv4
D-to-E: IPv6 inside IPv4
E-to-F: IPv6
Graph abstraction
[Figure: graph with nodes u, v, w, x, y, z; each link is labeled with its cost]
graph: G = (N,E)
N = set of routers = { u, v, w, x, y, z }
E = set of links = { (u,v), (u,x), (u,w), (v,x), (v,w), (x,w), (x,y), (w,y), (w,z), (y,z) }
Routing algorithm classification
Q: global or decentralized information?
global:
  all routers have complete topology, link cost info
  “link state” algorithms
decentralized:
  router knows physically-connected neighbors, link costs to neighbors
  iterative process of computation, exchange of info with neighbors
  “distance vector” algorithms
Q: static or dynamic?
static:
  routes change slowly over time
dynamic:
  routes change more quickly
    periodic update
    in response to link cost changes
A Link-State Routing Algorithm
Dijkstra’s algorithm
  net topology, link costs known to all nodes
    accomplished via “link state broadcast”
    all nodes have same info
  computes least cost paths from one node (“source”) to all other nodes
    gives forwarding table for that node
  iterative: after k iterations, know least cost path to k dest.’s
notation:
  c(x,y): link cost from node x to y; = ∞ if not direct neighbors
  D(v): current value of cost of path from source to dest. v
  p(v): predecessor node along path from source to v
  N': set of nodes whose least cost path definitively known
Dijkstra’s Algorithm
Initialization:
  N' = {u}
  for all nodes v
    if v adjacent to u
      then D(v) = c(u,v)
    else D(v) = ∞

Loop
  find w not in N' such that D(w) is a minimum
  add w to N'
  update D(v) for all v adjacent to w and not in N' :
    D(v) = min( D(v), D(w) + c(w,v) )
  /* new cost to v is either old cost to v or known
     shortest path cost to w plus cost from w to v */
until all nodes in N'
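A runnable version of the pseudocode might look like the sketch below. It swaps the "find w with minimum D(w)" scan for a priority queue (`heapq`), which is a common variation rather than the slide's exact procedure. The edge costs in `EDGES` are an assumption, read to be consistent with the worked example table in these notes.

```python
import heapq

def dijkstra(graph, source):
    """Return (D, p): least-cost distances from source and predecessor of each node."""
    D = {node: float("inf") for node in graph}
    p = {}
    D[source] = 0
    done = set()                 # N' in the pseudocode
    heap = [(0, source)]
    while heap:
        cost, w = heapq.heappop(heap)
        if w in done:            # stale heap entry for an already finalized node
            continue
        done.add(w)
        for v, link_cost in graph[w].items():
            if v not in done and cost + link_cost < D[v]:
                D[v] = cost + link_cost      # D(v) = min(D(v), D(w) + c(w,v))
                p[v] = w
                heapq.heappush(heap, (D[v], v))
    return D, p

# Assumed undirected link costs (illustrative).
EDGES = [("u", "v", 7), ("u", "w", 3), ("u", "x", 5), ("w", "v", 3), ("w", "x", 4),
         ("w", "y", 8), ("x", "y", 7), ("x", "z", 9), ("v", "y", 4), ("y", "z", 2)]
graph = {}
for a, b, cost in EDGES:
    graph.setdefault(a, {})[b] = cost
    graph.setdefault(b, {})[a] = cost

D, p = dijkstra(graph, "u")
print(D["v"], D["w"], D["x"], D["y"], D["z"])  # 6 3 5 10 12
print(p["y"], p["z"])                          # v y
```

Tracing `p` back from any destination to `u` yields the shortest-path tree, and the first hop on each path is the forwarding table entry at `u`.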
Dijkstra’s algorithm: example
Step | N' | D(v),p(v) | D(w),p(w) | D(x),p(x) | D(y),p(y) | D(z),p(z)
0 | u | 7,u | 3,u | 5,u | ∞ | ∞
1 | uw | 6,w | | 5,u | 11,w | ∞
2 | uwx | 6,w | | | 11,w | 14,x
3 | uwxv | | | | 10,v | 14,x
4 | uwxvy | | | | | 12,y
5 | uwxvyz | | | | |
notes:
  construct shortest path tree by tracing predecessor nodes
  ties can exist (can be broken arbitrarily)
[Figure: example graph with nodes u, v, w, x, y, z and link costs]
[Figure: second example graph with nodes u, v, w, x, y, z and link costs]
Dijkstra’s algorithm: example (2)
resulting shortest-path tree from u:
[Figure: shortest-path tree rooted at u, spanning v, w, x, y, z]
[Figure: four snapshots of a four-node network (A, B, C, D) with link costs such as 1, e, 1+e, 2+e. First snapshot: initially. Each later snapshot: given these costs, find new routing…. resulting in new costs]
let
  dx(y) := cost of least-cost path from x to y
then
  dx(y) = min_v { c(x,v) + dv(y) }
where the min is taken over all neighbors v of x
Bellman-Ford example
[Figure: graph with nodes u, v, w, x, y, z; each link is labeled with its cost]
clearly, dv(z) = 5, dx(z) = 3, dw(z) = 3
B-F equation says:
du(z) = min { c(u,v) + dv(z),
              c(u,x) + dx(z),
              c(u,w) + dw(z) }
      = min { 2 + 5,
              1 + 3,
              5 + 3 } = 4
node achieving minimum is next hop in shortest path, used in forwarding table
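The same computation in a few lines of Python, using only the numbers from this example (the variable names are illustrative):

```python
# Link costs from u to each neighbor, and each neighbor's least cost to z.
c_u = {"v": 2, "x": 1, "w": 5}   # c(u,v), c(u,x), c(u,w)
d_z = {"v": 5, "x": 3, "w": 3}   # dv(z), dx(z), dw(z)

# Bellman-Ford equation: minimize c(u,n) + dn(z) over neighbors n of u.
next_hop = min(c_u, key=lambda n: c_u[n] + d_z[n])
d_u_z = c_u[next_hop] + d_z[next_hop]

print(next_hop, d_u_z)  # x 4
```

The neighbor that achieves the minimum, here x, is what u would install as the next hop toward z.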
Distance vector algorithm
key idea:
from time-to-time, each node sends its own
distance vector estimate to neighbors
when x receives new DV estimate from neighbor,
it updates its own DV using B-F equation:
Dx(y) ← minv{c(x,v) + Dv(y)} for each node y ∊ N
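The update rule can be simulated for a small network. The sketch below is a simplification: all nodes exchange vectors in synchronized rounds, whereas the real algorithm is asynchronous, and link costs are fixed. The three-node topology (x-y cost 2, y-z cost 1, x-z cost 7) is the one used in the worked example in these notes; `distance_vector` is a hypothetical name.

```python
INF = float("inf")

def distance_vector(costs):
    """Synchronous distance-vector iteration; costs[x][v] is the link cost c(x,v)."""
    nodes = list(costs)
    # Initially each node knows only the cost to itself and to its direct neighbors.
    D = {x: {y: 0 if x == y else costs[x].get(y, INF) for y in nodes} for x in nodes}
    changed = True
    while changed:
        changed = False
        snapshot = {x: dict(D[x]) for x in nodes}   # vectors "sent" this round
        for x in nodes:
            for y in nodes:
                if x == y:
                    continue
                # Dx(y) <- min over neighbors v of c(x,v) + Dv(y)
                best = min(costs[x][v] + snapshot[v][y] for v in costs[x])
                if best != D[x][y]:
                    D[x][y] = best
                    changed = True
    return D

costs = {"x": {"y": 2, "z": 7}, "y": {"x": 2, "z": 1}, "z": {"x": 7, "y": 1}}
D = distance_vector(costs)
print(D["x"])  # {'x': 0, 'y': 2, 'z': 3}
print(D["z"])  # {'x': 3, 'y': 1, 'z': 0}
```

The loop stops once a full round produces no change, which mirrors the rule that a node notifies neighbors only when its own vector changes.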
[Figure: nodes x, y, z with link costs c(x,y) = 2, c(y,z) = 1, c(x,z) = 7]
Dx(y) = min{c(x,y) + Dy(y), c(x,z) + Dz(y)} = min{2+0 , 7+1} = 2
Dx(z) = min{c(x,y) + Dy(z), c(x,z) + Dz(z)} = min{2+1 , 7+0} = 3
each entry lists cost to x, y, z; the three columns show the table at successive times

node x table
from | initial | after 1st exchange | after 2nd exchange
x | 0 2 7 | 0 2 3 | 0 2 3
y | ∞ ∞ ∞ | 2 0 1 | 2 0 1
z | ∞ ∞ ∞ | 7 1 0 | 3 1 0

node y table
from | initial | after 1st exchange | after 2nd exchange
x | ∞ ∞ ∞ | 0 2 7 | 0 2 3
y | 2 0 1 | 2 0 1 | 2 0 1
z | ∞ ∞ ∞ | 7 1 0 | 3 1 0

node z table
from | initial | after 1st exchange | after 2nd exchange
x | ∞ ∞ ∞ | 0 2 7 | 0 2 3
y | ∞ ∞ ∞ | 2 0 1 | 2 0 1
z | 7 1 0 | 3 1 0 | 3 1 0
Distance vector: link cost changes
link cost changes:
  node detects local link cost change
  updates routing info, recalculates distance vector
  if DV changes, notify neighbors
[Figure: nodes x, y, z; link x-y cost changes from 4 to 1, link y-z cost 1, link x-z cost 50]
t2 : y receives z’s update, updates its distance table. y’s least costs do not change, so y does not send a message to z.
Comparison of LS and DV algorithms
message complexity
  LS: with n nodes, E links, O(nE) msgs sent
  DV: exchange between neighbors only
    convergence time varies
speed of convergence
  LS: O(n^2) algorithm requires O(nE) msgs
    may have oscillations
  DV: convergence time varies
    may be routing loops
    count-to-infinity problem
robustness: what happens if router malfunctions?
  LS:
    node can advertise incorrect link cost
    each node computes only its own table
  DV:
    DV node can advertise incorrect path cost
    each node’s table used by others
      errors propagate thru network
Hierarchical routing
our routing study thus far - idealization
all routers identical
network “flat”
… not true in practice
Hierarchical routing
aggregate routers into regions, “autonomous systems” (AS)
routers in same AS run same routing protocol
  “intra-AS” routing protocol
  routers in different AS can run different intra-AS routing protocol
gateway router:
  at “edge” of its own AS
  has link to router in another AS
Interconnected ASes
[Figure: AS1 (routers 1a, 1b, 1c, 1d), AS2 (2a, 2b, 2c), AS3 (3a, 3b, 3c); inside a router, the intra-AS and inter-AS routing algorithms both feed the forwarding table]
forwarding table configured by both intra- and inter-AS routing algorithm
  intra-AS sets entries for internal dests
  inter-AS & intra-AS sets entries for external dests
Inter-AS tasks
suppose router in AS1 receives datagram destined outside of AS1:
  router should forward packet to gateway router, but which one?
AS1 must:
1. learn which dests are reachable through AS2, which through AS3
2. propagate this reachability info to all routers in AS1
job of inter-AS routing!
[Figure: AS1 (routers 1a-1d), AS2 (2a-2c), AS3 (3a-3c), with other networks beyond AS1 and AS2]
Example: setting forwarding table in router 1d
suppose AS1 learns (via inter-AS protocol) that subnet x
reachable via AS3 (gateway 1c), but not via AS2
inter-AS protocol propagates reachability info to all internal
routers
router 1d determines from intra-AS routing info that its
interface I is on the least cost path to 1c
installs forwarding table entry (x,I)
[Figure: AS1 (routers 1a-1d), AS2 (2a-2c), AS3 (3a-3c); subnet x is reachable via AS3; other networks lie beyond AS1 and AS2]
Example: choosing among multiple ASes
now suppose AS1 learns from inter-AS protocol that subnet
x is reachable from AS3 and from AS2.
to configure forwarding table, router 1d must determine
towards which gateway it should forward packets for dest x
this is also job of inter-AS routing protocol!
hot potato routing: send packet towards closest of two
routers.
Intra-AS Routing
also known as interior gateway protocols (IGP)
most common intra-AS routing protocols:
RIP: Routing Information Protocol
OSPF: Open Shortest Path First
IGRP: Interior Gateway Routing Protocol
(Cisco proprietary)
RIP: example
[Figure: routers A, B, C, D connecting subnets w, x, y, z]
routing table in router D
destination subnet | next router | # hops to dest
w | A | 2
y | B | 2
z | B | 7
x | -- | 1
…. | …. | ....
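RIP: example
A-to-D advertisement
dest | next | hops
w | - | 1
x | - | 1
z | C | 4
…. | … | ...
[Figure: routers A, B, C, D connecting subnets w, x, y, z]
routing table in router D
destination subnet | next router | # hops to dest
w | A | 2
y | B | 2
z | A | 5   (replaces previous entry: next router B, 7 hops)
x | -- | 1
…. | …. | ....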
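How D folds A's advertisement into its table can be sketched as below. `apply_advertisement` is a hypothetical helper, not part of any RIP implementation; it applies the usual distance-vector rule (accept a shorter route, and always accept news from the current next router) and caps hop counts at 16, RIP's infinity.

```python
RIP_INFINITY = 16  # hop count treated as unreachable

def apply_advertisement(table, neighbor, advert):
    """Update table (dest -> (next_router, hops)) from a neighbor's advertised hop counts."""
    for dest, hops in advert.items():
        new_hops = min(hops + 1, RIP_INFINITY)   # one extra hop to reach the neighbor
        current = table.get(dest)
        if (current is None                      # previously unknown destination
                or new_hops < current[1]         # strictly shorter route
                or current[0] == neighbor):      # update from the current next router
            table[dest] = (neighbor, new_hops)
    return table

# Router D's table before the advertisement; None marks a directly attached subnet.
table_d = {"w": ("A", 2), "y": ("B", 2), "z": ("B", 7), "x": (None, 1)}
advert_from_a = {"w": 1, "x": 1, "z": 4}

apply_advertisement(table_d, "A", advert_from_a)
print(table_d["z"])  # ('A', 5)
print(table_d["x"])  # (None, 1)
```

Only the entry for z changes: reaching z through A costs 4 + 1 = 5 hops, which beats the old 7 hops via B.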
RIP: link failure, recovery
if no advertisement heard after 180 sec -->
neighbor/link declared dead
routes via neighbor invalidated
new advertisements sent to neighbors
neighbors in turn send out new advertisements (if tables
changed)
link failure info quickly (?) propagates to entire net
poison reverse used to prevent ping-pong loops (infinite
distance = 16 hops)
[Figure: protocol stacks of two routers: transport (UDP), network (IP) with forwarding table, link, physical]
OSPF (Open Shortest Path First)
“open”: publicly available
uses link state algorithm
LS packet dissemination
topology map at each node
route computation using Dijkstra’s algorithm
OSPF advertisement carries one entry per neighbor
advertisements flooded to entire AS
carried in OSPF messages directly over IP (rather than TCP or UDP)
IS-IS routing protocol: nearly identical to OSPF
Hierarchical OSPF
[Figure: backbone containing backbone routers and a boundary router; area border routers connect the backbone to area 1, area 2, area 3, which contain internal routers]
two-level hierarchy: local area, backbone.
  link-state advertisements only in area
  each node has detailed area topology; only knows direction (shortest path) to nets in other areas.
area border routers: “summarize” distances to nets in own area, advertise to other area border routers.
backbone routers: run OSPF routing limited to backbone.
boundary routers: connect to other AS’s.
Internet inter-AS routing: BGP
BGP (Border Gateway Protocol): the de facto
inter-domain routing protocol
“glue that holds the Internet together”
BGP provides each AS a means to:
eBGP: obtain subnet reachability information from
neighboring ASs.
iBGP: propagate reachability information to all AS-
internal routers.
determine “good” routes to other networks based on
reachability information and policy.
allows subnet to advertise its existence to rest of
Internet: “I am here”
BGP basics
BGP session: two BGP routers (“peers”) exchange BGP
messages:
advertising paths to different destination network prefixes (“path vector”
protocol)
exchanged over semi-permanent TCP connections
[Figure: AS1 (routers 1a-1d), AS2 (2a-2c), AS3 (3a-3c), with a BGP message sent on an inter-AS link and other networks beyond AS1 and AS2]
BGP basics: distributing path information
using eBGP session between 3a and 1c, AS3 sends prefix
reachability info to AS1.
1c can then use iBGP to distribute new prefix info to all routers in AS1
1b can then re-advertise new reachability info to AS2 over 1b-to-
2a eBGP session
when router learns of new prefix, it creates entry for
prefix in its forwarding table.
[Figure: AS1 (routers 1a-1d), AS2 (2a-2c), AS3 (3a-3c), marking eBGP sessions between ASes and iBGP sessions inside an AS]
BGP route selection
router may learn about more than 1 route to
destination AS, selects route based on:
1. local preference value attribute: policy decision
2. shortest AS-PATH
3. closest NEXT-HOP router: hot potato routing
4. additional criteria
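The first three criteria amount to a lexicographic comparison, which can be sketched as follows. `Route` and `select_route` are hypothetical names, the AS identifiers and costs are invented, and real BGP implementations apply more tie-breaking steps than shown here.

```python
from dataclasses import dataclass

@dataclass
class Route:
    prefix: str
    local_pref: int       # policy decision: higher is preferred
    as_path: tuple        # ASes the advertisement has passed through
    next_hop_cost: int    # intra-AS cost to reach the NEXT-HOP router

def select_route(routes):
    """Apply criteria 1-3 in order: local preference, AS-PATH length, closest NEXT-HOP."""
    return min(routes, key=lambda r: (-r.local_pref, len(r.as_path), r.next_hop_cost))

candidates = [
    Route("x", local_pref=100, as_path=("AS2", "AS7"), next_hop_cost=5),
    Route("x", local_pref=100, as_path=("AS3",), next_hop_cost=9),
    Route("x", local_pref=90, as_path=("AS4",), next_hop_cost=1),
]
best = select_route(candidates)
print(best.as_path)  # ('AS3',)
```

The route via AS4 has the closest next hop but loses on local preference, and among the two remaining routes the shorter AS-PATH wins before hot potato routing is ever consulted.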
BGP messages
BGP messages exchanged between peers over TCP
connection
BGP messages:
OPEN: opens TCP connection to peer and authenticates
sender
UPDATE: advertises new path (or withdraws old)
KEEPALIVE: keeps connection alive in absence of
UPDATES; also ACKs OPEN request
NOTIFICATION: reports errors in previous msg; also
used to close connection
BGP routing policy
[Figure: provider networks A, B, C and customer networks W, X]
A advertises path AW to B
B advertises path BAW to X
Should B advertise path BAW to C?
  No way! B gets no “revenue” for routing CBAW since neither W nor C are B’s customers
  B wants to force C to route to W via A
  B wants to route only to/from its customers!
Why different Intra-, Inter-AS routing ?
policy:
inter-AS: admin wants control over how its traffic
routed, who routes through its net.
intra-AS: single admin, so no policy decisions needed
scale:
hierarchical routing saves table size, reduced update
traffic
performance:
intra-AS: can focus on performance
inter-AS: policy may dominate over performance
Broadcast routing
deliver packets from source to all other nodes
source duplication is inefficient:
[Figure: routers R1-R4 under source duplication (all duplicates created and transmitted at the source) and under in-network duplication (duplicates created inside the network)]
In-network duplication
flooding: when node receives broadcast packet,
sends copy to all neighbors
problems: cycles & broadcast storm
controlled flooding: node only broadcasts pkt if it
hasn’t broadcast same packet before
node keeps track of packet ids already broadcast
or reverse path forwarding (RPF): only forward packet
if it arrived on shortest path between node and source
spanning tree:
no redundant packets received by any node
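Controlled flooding with packet-id tracking can be simulated on a small graph. The function `controlled_flood` and the four-node topology are made up for illustration; the sketch counts one transmission per copy sent on a link and, for simplicity, lets a node send copies to all neighbors including the one it heard the packet from.

```python
from collections import deque

def controlled_flood(neighbors, source):
    """Each node rebroadcasts a given packet at most once.

    Returns (set of nodes reached, number of link transmissions).
    """
    seen = {source}              # nodes that have already broadcast this packet id
    queue = deque([source])
    transmissions = 0
    while queue:
        node = queue.popleft()
        for nbr in neighbors[node]:
            transmissions += 1   # copy sent on link node -> nbr
            if nbr not in seen:  # first time nbr sees this packet id
                seen.add(nbr)
                queue.append(nbr)
    return seen, transmissions

# Made-up topology: a triangle A-B-C with D hanging off C.
neighbors = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "D"], "D": ["C"]}
reached, transmissions = controlled_flood(neighbors, "A")
print(sorted(reached), transmissions)  # ['A', 'B', 'C', 'D'] 8
```

The cycle no longer causes a broadcast storm, but 8 copies are sent to reach 3 other nodes; forwarding along a spanning tree would need only 3.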
Spanning tree
first construct a spanning tree
nodes then forward/make copies only along
spanning tree
[Figure: spanning tree over nodes A-G: (a) broadcast initiated at A, (b) broadcast initiated at D]
[Figure: (a) stepwise construction of spanning tree (center: E), with steps numbered 1-5, (b) constructed spanning tree]
Multicast routing: problem statement
goal: find a tree (or trees) connecting routers having local mcast group members
  tree: not all paths between routers used
  shared-tree: same tree used by all group members
  source-based: different tree from each sender to rcvrs
legend
  group member
  not group member
  router with a group member
  router without group member
Shortest path tree
mcast forwarding tree: tree of shortest path
routes from source to all receivers
Dijkstra’s algorithm
[Figure: source s and routers R1-R7; links used for forwarding are numbered 1-6]
LEGEND
  router with attached group member
  router with no attached group member
  link used for forwarding, i indicates order link added by algorithm
Reverse path forwarding: example
[Figure: source s and routers R1-R7]
LEGEND
  router with attached group member
  router with no attached group member
  datagram will be forwarded
  datagram will not be forwarded
Shared-tree: Steiner tree
Center-based trees
single delivery tree shared by all
one router identified as “center” of tree
to join:
edge router sends unicast join-msg addressed to center
router
join-msg “processed” by intermediate routers and
forwarded towards center
join-msg either hits existing tree branch for this center,
or arrives at center
path taken by join-msg becomes new branch of tree for
this router
Center-based trees: example
suppose R6 chosen as center:
DVMRP: continued…
soft state: DVMRP router periodically (1 min.)
“forgets” branches are pruned:
mcast data again flows down unpruned branch
downstream router: reprune or else continue to receive
data
routers can quickly regraft to tree
following IGMP join at leaf
odds and ends
commonly implemented in commercial routers
Tunneling
Q: how to connect “islands” of multicast routers in a
“sea” of unicast routers?
PIM: Protocol Independent Multicast
not dependent on any specific underlying unicast
routing algorithm (works with all)
two different multicast distribution scenarios :
dense:
  group members densely packed, in “close” proximity.
  bandwidth more plentiful
sparse:
  # networks with group members small wrt # interconnected networks
  group members “widely dispersed”
  bandwidth not plentiful
PIM- dense mode
flood-and-prune RPF: similar to DVMRP but…
underlying unicast protocol provides RPF info
for incoming datagram
less complicated (less efficient) downstream
flood than DVMRP reduces reliance on
underlying routing algorithm
has protocol mechanism for router to detect it
is a leaf-node router
PIM - sparse mode
sender(s):
  unicast data to RP, which distributes down RP-rooted tree
  RP can extend mcast tree upstream to source
  RP can send stop msg if no attached receivers
    “no one is listening!”
[Figure: routers R1-R7; join messages travel toward the rendezvous point; all data multicast from rendezvous point]
Chapter 4: done!
4.1 introduction
4.2 virtual circuit and datagram networks
4.3 what’s inside a router
4.4 IP: Internet Protocol
  datagram format, IPv4 addressing, ICMP, IPv6
4.5 routing algorithms
  link state, distance vector, hierarchical routing
4.6 routing in the Internet
  RIP, OSPF, BGP
4.7 broadcast and multicast routing