Interdomain Routing and The
Border Gateway Protocol (BGP)
Courtesy of Timothy G. Griffin
Intel Research,
Cambridge UK
[Link]@[Link]
How do you connect to the
Internet?
Physical connectivity is
just the beginning of the
story….
Partial View of [Link]
([Link]) Neighborhood
AS 3356 Level 3
AS 6461 AboveNet
AS 1239 Sprint
AS 702 UUNET
AS 20965 GEANT
AS 5459 LINX
AS 1213 HEAnet (Irish academic and research)
AS 786 [Link] (UKERNA): originates > 180 prefixes, including [Link]/16
AS 4373 Online Computer Library Center
AS 7 UK Defense Research Agency
Architecture of Dynamic Routing
IGP = Interior Gateway Protocol
Metric based: OSPF, IS-IS, RIP, EIGRP (Cisco)
EGP = Exterior Gateway Protocol
Policy based: BGP
[Figure: AS 1 and AS 2 each run an IGP internally; an EGP (= BGP) connects them]
The Routing Domain of BGP is the entire Internet
Technology of Distributed Routing
Link State
• Topology information is flooded within the routing domain
• Best end-to-end paths are computed locally at each router.
• Best end-to-end paths determine next-hops.
• Based on minimizing some notion of distance
• Works only if policy is shared and uniform
• Examples: OSPF, IS-IS
Vectoring
• Each router knows little about network topology
• Only best next-hops are chosen by each router for each destination network.
• Best end-to-end paths result from composition of all next-hop choices
• Does not require any notion of distance
• Does not require uniform policies at all routers
• Examples: RIP, BGP
The Gang of Four
      Link State     Vectoring
IGP   OSPF, IS-IS    RIP
EGP                  BGP
AS Numbers (ASNs)
ASNs are 16 bit values.
64512 through 65535 are “private”
• Genuity: 1
• MIT: 3
• JANET: 786
• UC San Diego: 7377
• AT&T: 7018, 6341, 5074, …
• UUNET: 701, 702, 284, 12199, …
• Sprint: 1239, 1240, 6211, 6242, …
• …
ASNs represent units of routing policy
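A minimal sketch of the private range stated above, assuming 16-bit ASNs only. The helper name is_private_asn is hypothetical and not part of any router software.

```python
# Range boundaries are taken from the slide; the function name is illustrative.
PRIVATE_ASN_FIRST = 64512
PRIVATE_ASN_LAST = 65535


def is_private_asn(asn: int) -> bool:
    """Return True if a 16-bit ASN falls in the private range."""
    if not 0 <= asn <= 0xFFFF:
        raise ValueError("not a 16-bit ASN")
    return PRIVATE_ASN_FIRST <= asn <= PRIVATE_ASN_LAST


for asn in (786, 7018, 65000):
    print(asn, "private" if is_private_asn(asn) else "public")
```

AS 65000, used in a later backup-link slide, falls inside this private range.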
BGP Routing Tables
show ip bgp
BGP table version is 111849680, local router ID is [Link]
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
. . .
*>i192.35.25.0 [Link] 50 0 16779 1 701 703 i
*>i192.35.29.0 [Link] 50 0 5727 7018 14541 i
*>i192.35.35.0 [Link] 50 0 16779 1 701 1744 i
*>i192.35.37.0 [Link] 50 0 16779 1 3561 i
*>i192.35.39.0 [Link] 50 0 16779 1 701 80 i
*>i192.35.44.0 [Link] 50 0 5727 7018 1785 i
*>i192.35.48.0 [Link] 55 0 16779 209 7843 225 225 225 225 225 i
*>i192.35.49.0 [Link] 55 0 16779 209 7843 225 225 225 225 225 i
*>i192.35.50.0 [Link] 55 0 16779 3549 714 714 714 i
*>i192.35.51.0/25 [Link] 55 0 16779 3549 14744 14744 14744 14744 14744 14744 14744 14744 i
. . .
Thanks to Geoff Huston. [Link] on July 6, 2001
• Use “whois” queries to associate an ASN with its “owner” (for example, [Link])
• 7018 = AT&T Worldnet, 701 = UUNET, 3561 = Cable & Wireless, …
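To read the Path column of the table above, it can help to split it into ASNs and the origin code. The sketch below is illustrative only: the function names are hypothetical, and the input is just the Path field copied from one table row, not a full parser for "show ip bgp" output.

```python
def parse_as_path(path_field: str):
    """Split a Path column value into (list of ASNs, origin code)."""
    tokens = path_field.split()
    # The trailing token is the origin code: i = IGP, e = EGP, ? = incomplete.
    origin = tokens[-1] if tokens and tokens[-1] in ("i", "e", "?") else None
    asns = [int(t) for t in (tokens[:-1] if origin else tokens)]
    return asns, origin


def collapse_prepending(asns):
    """Drop consecutive repeats, which is how prepending shows up in a path."""
    collapsed = []
    for asn in asns:
        if not collapsed or collapsed[-1] != asn:
            collapsed.append(asn)
    return collapsed


asns, origin = parse_as_path("16779 209 7843 225 225 225 225 225 i")
print(asns[-1])                   # rightmost ASN is the originating AS: 225
print(len(asns))                  # 8
print(collapse_prepending(asns))  # [16779, 209, 7843, 225]
```

The repeated 225 entries are the originating AS listing itself several times, which is the padding technique discussed in later slides.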
AS Graphs Can Be Fun
The subgraph showing all ASes that have more than 100 neighbors in full
graph of 11,158 nodes. July 6, 2001. Point of view: AT&T route-server
AS Graphs Do Not Show “Topology”!
BGP was designed to
throw away information!
The AS graph
may look like this. Reality may be closer to this…
How Many ASNs are there today?
15,981
Thanks to Geoff Huston. [Link] on October 24, 2003
How Many ASNs are there today?
18,217
Thanks to Geoff Huston. [Link] on October 26, 2004
How many prefixes today?
154,894
Note: the number actually depends on point of view…
Thanks to Geoff Huston. [Link] on October 24, 2003
How many prefixes today?
179,903
Note: the number actually depends on point of view…
Thanks to Geoff Huston. [Link] on October 26, 2004
BGP-4
• BGP = Border Gateway Protocol
• Is a Policy-Based routing protocol
• Is the de facto EGP of today’s global Internet
• Relatively simple protocol, but configuration is complex and the
entire world can see, and be impacted by, your mistakes.
BGP Operations (Simplified)
• Establish session on TCP port 179
• Exchange all active routes
• Exchange incremental updates: while connection is ALIVE, exchange route UPDATE messages
[Figure: BGP session between AS1 and AS2]
Four Types of BGP Messages
• Open : Establish a peering session.
• Keep Alive : Handshake at regular
intervals.
• Notification : Shuts down a peering
session.
• Update : Announcing new routes or withdrawing previously announced routes.
announcement = prefix + attribute values
Attributes are Used to Select Best
Routes
Given multiple routes to the same prefix, a BGP speaker must pick at most one best route.
(Note: it could reject them all!)
[Figure: four competing routes to [Link]/24, each saying “pick me!”]
ASPATH Attribute
Prefix originated: [Link]/16, by AS 6341 (AT&T Research)
AS Path carried by the announcement as it arrives at each AS:
• AS 7018 (AT&T): AS Path = 6341
• AS 1239 (Sprint): AS Path = 7018 6341
• AS 3549 (Global Crossing): AS Path = 7018 6341
• AS 1755 (Ebone): AS Path = 1239 7018 6341
• AS 1129 (Global Access): AS Path = 1755 1239 7018 6341
• AS 12654 (RIPE NCC RIS project): AS Path = 1129 1755 1239 7018 6341, and AS Path = 3549 7018 6341
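The way the AS Path grows along the chain in this slide can be sketched as follows. This is a simplified illustration, not router code: the function names are hypothetical, and it models only two behaviours, prepending the sender's ASN on each announcement and rejecting a path that already contains the receiver's ASN.

```python
def announce(as_path, sender_asn):
    """Path as seen by the neighbor: the sender prepends its own ASN."""
    return [sender_asn] + as_path


def accepts(as_path, receiver_asn):
    """Loop check: reject a route whose path already contains our ASN."""
    return receiver_asn not in as_path


# Chain taken from the slide: 6341 -> 7018 -> 1239 -> 1755 -> 1129 -> 12654
chain = [6341, 7018, 1239, 1755, 1129, 12654]
path = []  # empty at the originating AS
for sender, receiver in zip(chain, chain[1:]):
    path = announce(path, sender)
    assert accepts(path, receiver)
    print(f"AS {receiver} sees AS Path = {' '.join(map(str, path))}")
```

The last line printed is "AS 12654 sees AS Path = 1129 1755 1239 7018 6341", matching the longer of the two paths that reach the RIPE NCC RIS project in the slide.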
Policy-Based vs. Distance-Based Routing?
Minimizing “hop count” can violate commercial relationships that constrain inter-domain routing.
[Figure: Host 1, Host 2, Cust1, Cust2, Cust3, ISP1, ISP2 and ISP3, with one path marked YES and another marked NO]
Why not minimize “AS hop count”?
[Figure: National ISP1 and National ISP2; Regional ISP1, Regional ISP2 and Regional ISP3; Cust1, Cust2 and Cust3; one path marked YES and another marked NO]
Shortest path routing is not compatible with commercial relations
Customers and Providers
provider
provider customer IP traffic
customer
Customer pays provider for access to the Internet
The “Peering” Relationship
• Peers provide transit between their respective customers
• Peers do not provide transit between peers
• Peers (often) do not exchange $$$
[Figure: peer-peer and provider-customer links, with some traffic paths marked allowed and others marked NOT allowed]
Peering Provides Shortcuts
Peering also allows connectivity between the customers of “Tier 1” providers.
[Figure legend: peer-peer and provider-customer links]
Peering Wars
Peer
• Reduces upstream transit costs
• Can increase end-to-end performance
• May be the only way to connect your customers to some part of the Internet (“Tier 1”)
Don’t Peer
• You would rather have customers
• Peers are usually your competition
• Peering relationships may require periodic renegotiation
Peering struggles are by far the most
contentious issues in the ISP world!
Peering agreements are often confidential.
Implementing Customer/Provider
and Peer/Peer relationships
Two parts:
• Enforce transit relationships
– Outbound route filtering
• Enforce order of route
preference
– provider < peer < customer
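The two parts above can be sketched as an export filter plus a preference table. This is a minimal sketch under stated assumptions: the export rule is the conventional reading of the customer/provider and peering slides (customer routes go to everyone, peer and provider routes go only to customers), and the numeric local-preference values are invented for illustration. Only the ordering provider < peer < customer comes from the slide.

```python
from enum import Enum
from typing import Optional


class Rel(Enum):
    CUSTOMER = "customer"
    PEER = "peer"
    PROVIDER = "provider"


# Assumed values; only the ordering provider < peer < customer is from the slide.
LOCAL_PREF = {Rel.PROVIDER: 80, Rel.PEER: 90, Rel.CUSTOMER: 100}


def should_export(learned_from: Optional[Rel], send_to: Rel) -> bool:
    """Outbound filter. learned_from=None means the ISP's own route."""
    if learned_from is None:
        return True  # own routes are announced to every neighbor
    # Customer routes go to everyone; peer and provider routes go only
    # to customers, so no transit is given between peers or providers.
    return learned_from is Rel.CUSTOMER or send_to is Rel.CUSTOMER


for src in (Rel.PROVIDER, Rel.PEER, Rel.CUSTOMER, None):
    name = src.value if src else "ISP"
    allowed = [dst.value for dst in Rel if should_export(src, dst)]
    print(f"{name} route -> exported to {allowed}")
```

Under these assumptions a route learned from a peer is never re-announced to another peer or to a provider.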
Import Routes
[Figure: provider routes, peer routes, customer routes and ISP routes arriving from provider, from peer and from customer]
Export Routes
[Figure: provider routes, peer routes, customer routes and ISP routes sent to provider, to peer and to customer; filters block some of them]
The Border Gateway Protocol (BGP)
BGP = RFC 1771
+ “optional” extensions
RFC 1997 (communities)
RFC 2439 (damping)
RFC 2796 (reflection)
RFC 3065 (confederation)
…
+ routing policy configuration
languages (vendor-specific)
+ Current Best Practices in
management of Interdomain Routing
BGP was not DESIGNED. It EVOLVED.
BGP Route Processing
Receive BGP Updates
→ Apply Import Policies (apply policy = filter routes & tweak attributes)
→ Best Route Selection (based on attribute values)
→ Best Route Table
→ Apply Export Policies (apply policy = filter routes & tweak attributes)
→ Transmit BGP Updates
Install forwarding entries for best routes in the IP Forwarding Table.
Policy is open-ended programming, constrained only by vendor configuration language.
Shorter Doesn’t Always Mean Shorter
Mr. BGP says that path 4 1 is better than path 3 2 1. Duh!
In fairness: could you do this “right” and still scale?
Exporting internal state would dramatically increase global instability and amount of routing state.
[Figure: AS 1, AS 2, AS 3 and AS 4]
Routing Example 1
Routing Example 2
Tweak Tweak Tweak (TE)
• For inbound traffic
– Filter outbound routes
– Tweak attributes on outbound routes in the hope of influencing your neighbor’s best route selection
• For outbound traffic
– Filter inbound routes
– Tweak attributes on inbound routes to influence best route selection
In general, an AS has more control over outbound traffic
[Figure: inbound traffic follows outbound routes; outbound traffic follows inbound routes]
Implementing Backup Links with Local Preference
(Outbound Traffic)
[Figure: AS 65000 connects to AS 1 over a primary link and a backup link]
• Primary link: Set Local Pref = 100 for all routes from AS 1
• Backup link: Set Local Pref = 50 for all routes from AS 1
Forces outbound traffic to take primary link, unless link is down.
Multihomed Backups
(Outbound Traffic)
[Figure: AS 2 connects to provider AS 1 over the primary link and to provider AS 3 over the backup link]
• Primary link: Set Local Pref = 100 for all routes from AS 1
• Backup link: Set Local Pref = 50 for all routes from AS 3
Forces outbound traffic to take primary link, unless link is down.
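The effect of the two local preference settings can be sketched as below. It is a toy model, assuming each route is a plain dictionary and that a failed link simply removes its routes from consideration; the field names are hypothetical.

```python
def best_route(routes):
    """Pick the usable route with the highest local preference, or None."""
    usable = [r for r in routes if r["link_up"]]
    return max(usable, key=lambda r: r["local_pref"], default=None)


# Values from the slide: 100 on the primary link, 50 on the backup link.
routes = [
    {"via": "primary (AS 1)", "local_pref": 100, "link_up": True},
    {"via": "backup (AS 3)", "local_pref": 50, "link_up": True},
]
print(best_route(routes)["via"])  # primary (AS 1)

routes[0]["link_up"] = False      # primary link fails
print(best_route(routes)["via"])  # backup (AS 3)
```

Because local preference is set on routes the AS itself imports, this technique controls outbound traffic only.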
Shedding Inbound Traffic with
ASPATH Prepending
[Figure: customer AS 2 announces [Link]/24 to provider AS 1 over a primary link and a backup link]
• Primary: [Link]/24, ASPATH = 2
• Backup: [Link]/24, ASPATH = 2 2 2
Prepending will (usually) force inbound traffic from AS 1 to take primary link.
Yes, this is a Glorious Hack …
… But Padding Does Not Always Work
[Figure: customer AS 2 announces [Link]/24 to provider AS 1 over the primary link and to provider AS 3 over the backup link]
• Primary: [Link]/24, ASPATH = 2
• Backup: [Link]/24, ASPATH = 2 2 2 2 2 2 2 2 2 2 2 2 2 2
AS 3 will send traffic on “backup” link because it prefers customer routes and local preference is considered before ASPATH length!
Padding in this way is often used as a form of load balancing.
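Why padding fails here can be sketched with a two-step comparison. This is a simplification: a real BGP decision process has more steps, and only the two named in the slide are modeled. It also assumes that AS 3 hears the alternative route through AS 1 from a peer and assigns it a local preference of 90, with 100 for customer routes; those values match the defaults quoted for AS 3 in the next slide.

```python
def select_best(routes):
    """Rank by highest local preference first, then by shortest AS path."""
    return min(
        routes,
        key=lambda r: (-r["local_pref"], len(r["as_path"])),
        default=None,
    )


# AS 3's view. Local preference values are assumptions, see the lead-in.
direct_from_customer = {
    "via": "backup link from AS 2",
    "local_pref": 100,        # customer route
    "as_path": [2] * 14,      # heavily padded announcement
}
via_as_1 = {
    "via": "AS 1",
    "local_pref": 90,         # assumed peer route
    "as_path": [1, 2],
}

print(select_best([direct_from_customer, via_as_1])["via"])
# backup link from AS 2
```

The padded path is 14 hops long against 2, yet it still wins, because path length is only consulted when local preferences tie.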
COMMUNITY Attribute to the Rescue!
[Figure: customer AS 2 announces [Link]/24 to provider AS 1 over the primary link and to provider AS 3 over the backup link]
AS 3: normal customer local pref is 100, peer local pref is 90
• Primary: [Link]/24, ASPATH = 2
• Backup: [Link]/24, ASPATH = 2, COMMUNITY = 3:70
Customer import policy at AS 3:
• If 3:90 in COMMUNITY then set local preference to 90
• If 3:80 in COMMUNITY then set local preference to 80
• If 3:70 in COMMUNITY then set local preference to 70
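AS 3's customer import policy can be sketched as a lookup from community tag to local preference. The function and constant names are hypothetical. The rules are applied one after another in the order listed on the slide, so a route carrying several of these tags ends up with the last matching value; the slide does not say what should happen in that case, so treat that detail as an assumption.

```python
CUSTOMER_LOCAL_PREF = 100  # AS 3's normal customer value, per the slide
PEER_LOCAL_PREF = 90       # AS 3's normal peer value, per the slide

# Rules in slide order: community tag -> local preference to set.
COMMUNITY_RULES = [("3:90", 90), ("3:80", 80), ("3:70", 70)]


def import_local_pref(communities):
    """Apply AS 3's customer import policy to a set of community tags."""
    pref = CUSTOMER_LOCAL_PREF
    for tag, value in COMMUNITY_RULES:
        if tag in communities:
            pref = value
    return pref


backup_pref = import_local_pref({"3:70"})
print(backup_pref)                    # 70
print(backup_pref < PEER_LOCAL_PREF)  # True
```

With the tag 3:70 the route over the backup link drops below AS 3's peer preference of 90, so AS 3 would now prefer a route learned from a peer over the direct customer route on the backup link.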
What the heck is going on?
• There is no guarantee that a BGP
configuration has a unique routing solution.
– When multiple solutions exist, the (unpredictable) order of updates will determine which one wins.
• There is no guarantee that a BGP
configuration has any solution!
– And checking configurations is NP-complete [GW1999]
• Complex policies (weights, communities
setting preferences, and so on) increase
chances of routing anomalies.
– … yet this is the current trend!
Larry Speaks
Is this any
way to run an
Internet?
[Link]