0% found this document useful (0 votes)
11 views9 pages

TCP Congestion Control Overview

TCP

Uploaded by

Praveen Rai
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
11 views9 pages

TCP Congestion Control Overview

TCP

Uploaded by

Praveen Rai
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

Window-based Mechanism: 1

Sender Receiver
1

▪A simple protocol for reliable ACK1

TCP Congestion Control Tutorial communication is stop-and-wait


▪ Stop-and-wait send a data segment in a
2

packet and then wait for an ACK2

acknowledgment (ACK) packet.


• Path Capacity and Windows ▪ If the ACK does not arrive before the
• Bandwidth-Delay Product (BDP) retransmission timeout (RTO), the data
• Slow-Start, Congestion Avoidance segment is retransmitted.
• TCP flavors: Tahoe, Reno, NewReno ▪Stop-and-wait, delays sending a new
• Fast Retransmit/Fast Recovery segment until an ACK is received.
• Throughput formula ▪This implies a window of 1 segment in
• The TCP SACK option and Scoreboard flight.
▪ This wastes capacity

Window-based Mechanism: TCP An example with cwnd=8

Congestion window (cwnd)


▪All TCP data segments are
numbered. 1 2 3 4 5 6 7 8 9 10

▪A sliding window allows only a


set of “in flight” segments. ▪In this case, up to cwnd (8) segments can be sent
▪ after 8, the sender has to pause and wait to receive an ACK.
▪The congestion window
(cwnd) is the key method to Sender Receiver ▪When the ACK for segment 1 is received,
▪ the cwnd slides right, this allows the transmission of segment 9
control transmission. 1

▪A cwnd >1 allows multiple 3


ACK1 After ACK for packet 1
ACK2
unacknowledged segments (3 ACK3
1 2 3 4 5 6 7 8 9 10
in the example on the right).
Culmulative ACKs Congestion
▪Sending an ACK for every segment results in many packets!
cwnd

1 2 3 4 5 6 7 8 9 10 11 12 13

▪TCP acknowledgments are therefor cumulative:


▪ An ACK for segment 3, also acknowledges receipt of segments 1 and 2
▪ I.e. cwnd slides 3 positions, allowing three new segments to be transmitted ▪A network (path) has limits to how fast it can send, i.e. the capacity
▪Trying to send more results in overload, i.e. congestion
When the ACK for 3 is received:

▪ 1 2 3 4 5 6 7 8 9 10 11 12 13
▪Congestion needs to be avoided:
▪ In road networks cars are conserved, never lost or duplicated.
▪ Road speeds adapt to the load, roads choke-up
▪ TCP is actually byte-oriented
▪ ACKs report byte sequence numbers ▪ In the Internet, egress links operate at a sending rate
▪ TCP can acknowledge a part of a segments transmitted ▪ Packets arriving on a router link need to be buffered until they can sent
▪ Buffers can fill, and then packets can be dropped at a router.
▪ To keep things simple we will continue to discuss in terms of segments
▪ The path sender is responsible for detecting loss and resending lost packets.

Sharing the Link Capacity on a Path Bottlenecks and ACKs


rate after bottleneck link
“bottleneck” link

A A A A A A
C C C ACK ACK ACK ACK ACK C C C
K K K K K K

ACK rate reflects


bottleneck link rate
▪ A single flow is typically unable to use all the link capacity
▪ Flows sending packets at different times combine to use capacity
▪ There is no congestion when packets are forwarded promptly Each ACK received indicates a packet has “left the path”
▪ (i.e. buffered only for a small fraction of the path RTT)
The ACK arrival rate indicates capacity of the “bottleneck” link
N.B. True even when ACKs return on a different path!

Overflow of the bottleneck buffer Avoiding Congestion Collapse
seg4 ▪ In the 1980’s Internet Congestion Collapse was a real problem
seg
3
seg2
seg5 seg
4 1
seg

TCP bottleneck TCP


Sender ACK Receiver
ACK ACK

When more packets are received than sent


▪ A TCP Sender sends segments (packets) on a path queue builds
▪ When a router reaches capacity of the bottleneck interface, every extra received Congestion when queue exceeds buffer
segment is buffered before being forwarded.
packets discarded
▪ If router arrival rate continues to > egress rate, the queue in the buffer grows
▪ If the buffer becomes full, packets are discarded (lost). Known as a “drop-tail” router

▪ Congestion results when packets are buffered for long times


▪ Three insights:
▪ Lost packets are not acknowledged by the TCP Receiver 1. Senders need to avoid adding to congested paths (congestion control)
▪ At the TCP Sender, any loss is detected by tracking received ACKs 2. Senders need to significantly reduce under severe load (backoff)
▪ The TCP Sender needs ti reduce its rate to avoid more congestion and loss 3. Network control traffic needs to be prioritised and not dropped

Congestion Control Slow Start


A TCP Sender uses congestion control
▪ Insight: to control it sending rate, a sender needs to creating congestion This controls use of capacity by dynamically increasing or decreasing the a window:
the congestion window, cwnd, according to detected congestion.
Sender Receiver
▪ The key problem: How large fast should the sender transmit?
▪ Ideally, the number of outstanding segments should equal the path bandwidth delay product (BDP) ▪TCP increases cwnd :
▪ TCP starts from an initial cwnd RTT 1
▪ However, the available capacity of an Internet path cannot be known! ▪ It determines the available capacity

▪ Senders there need to implement congestion control! RTT 2


▪For every ACK received, the cwnd is
increased by one segment
▪ This doubles the cwnd each RTT!
BDP ▪ This results in exponential increase. RTT 3

cwnd
seg 1 seg 2 seg 3 seg 4 seg 5 seg 6 seg 7 2
2
time
Transmission ACK for 1 is
1

starts received 1
8

RTT
Slow Start and CA phases Congestion Avoidance (CA)
▪ When the cwnd > ssthresh Sender Receiver

▪ We introduce the slow start threshold variable, ssthresh ▪ cwnd increased by one every time an ACK
for a cwnd of segment is received
▪ ssthresh is initialised to a large value at the start of a connection ▪ Linearly increase of cwnd by one segment every
RTT 4 packets were
▪ Congestion control uses 2 phases: transmitted in
this RTT
▪ Slow Start: cwnd < ssthresh, the sender exponentially increases cwnd.

• Once congestion is detected: I try to transmit


• cwnd/2* is saved in ssthresh (a sender knows last RTT cwnd <= capacity) 5 packets

▪ ssthresh is uses as an estimate of the capacity for the next slow start increase
▪ Interpretation:
▪ If transmission of N segments were successful…
▪ Congestion Avoidance: cwnd>= ssthresh ▪ Then try to transmit N+1 segments to see
• The sender slowly increases cwnd to use any extra capacity whether the path has capacity to send more next
RTT.
• Together this is known as Additive-Increase Multiplicative Decrease

* Complication: Senders should use Flight_Size, rather than cwnd, to adjust SSthresh after a loss

Additive-Increase Multiplicative Decrease (AIMD) Recovery of Lost Segments

Massive
Packet losses
▪ Segments are sent as long as there is space in the cwnd
32 ▪ As previous ACKs received, the sequence of sent segments to slide right
28
▪ ... until a lost segment becomes the first in the window.
Initial
One packet
Slow-Start Congestion
24
Avoidance
loss ? ? ?
cwnd

20
21 22 23 24 25 26 27 28 29 30 31 32 33
16
ssthresh
Congestion
Avoidance
12
second
Slow-Start
▪ E.g., If segment 24 the last ACK will be for segment 23
8
▪ A sender does not know what happened to any later segments
4

RTTs
TCP Tahoe (1988) Fast Retransmit Recovery
▪ The sender records the cwnd before loss in the Slow Start Threshold (ssthresh)
▪ It resets the cwnd to one and starts growing cwnd
▪ In TCP Tahoe, Loss recovery relies on a timer to detec los
▪ All unacknowledged segments are resent (it does not know better)
▪ This is inefficient
▪ When cwnd=ssthresh, it starts linear growth (until congestion is detected) ▪ TCP has to be idle until the retransmission timer expires
▪ TCP has to retransmit any correctly received as well as lost segments

▪ A better solution uses ACKs to “self-clock” the sender:


32
One packet
loss
28
Initial
Slow-
Congestion
Avoidance
Congestion
Avoidance
Congestion
Avoidance
▪ When the receiver receives any out of sequence segment, it ACKs the
24
Start
20
Slow-
Start
last received in-order segment
▪ When the sender sees 3 duplicated ACKs (dupack), the next in-order
ssthresh segment to be ACK’ed is declared lost and Fast Retransmitted
16

8
12 ▪ TCP does not need to await for a (long) RTO!!
4

reset reset reset

The congestion control adapts the rate (aka adapts ssthresh) each time it
detects congestion.

Example of Fast Retransmit Fast Recovery (TCP Reno)

▪What do DUPACKs mean for a TCP sender?


51 52 53 54 55 56 57 58 59 60 61 62 63
▪ Three DUPACKs indicate loss, which is a congestion signal: cwnd reduced
▪ DUPACKs also indicate a segment has left the network another segment can be
Packet 54 is lost … but not the others in the window sent (a principle of packet conservation)

▪The sender can “inflate” cwnd, allowing it to send a more segments


51 52 53 54 55 56 57 58 59 60 61 62 63 during Fast Recovery

▪When the sender discovers correct reception of all segments that were
outstanding at the time when the loss, the sender resumes transmission
The receiver sends DUPACKs for 53 using normal congestion control rules (using a corrected cwnd)
for each of the later packets
Three DUPACKs trigger Fast Retransmit
TCP retransmits 54 before the RTO expiration
Example of Fast Recovery TCP Reno: using cwnd

51 52 53 54 55 56 57 58 59 60 61 62 63 ▪TCP Reno avoids slow-start at each transmission cycle

After retransmitting 54, sender shrinks cwnd by half…

51 52 53 54 55 56 57 58 59 60 61 62 63 32
One packet
loss
28
Initial Congestion Congestion Congestion
3 DUPACKs are received, indicating loss and I can immediately increase cwnd by 3… 24
Slow-
Start
Avoidance Avoidance Avoidance
Slow-
20 Start
51 52 53 54 55 56 57 58 59 60 61 62 63
ssthresh
16

12
8
When other DUPACKs are received, sender inflates cwnd to transmit new segments 4

51 52 53 54 55 56 57 58 59 60 61 62 63

Number of Segments Sent Inverted-square-root Formula

▪The previous diagram allows for a simple TCP ▪Throughput in packet/second is:
throughput calculation 3/2 n(n+1) √3/2
Tput (pps) = ~
n RTT √p RTT
▪In “n” RTTs number of segments sent is:
▪ SegSent = n + (n + 1) + … + 2n = 3/2 * n (n+1) ~ 3/2 * n2
▪ Throughput is inversely proportional to:
▪Assuming one segment is lost and retransmitted, the ▪ The RTT
fraction of segments lost is: ▪ The square root of loss-ratio
▪ p = 1/SegSent = (2/3) * 1/(n (n+1)) ~ 2/(3n2)
▪ n = 2/(3p)= √(2/(3p))
▪ The segment size can be used to calculate throughput in b/s
TCP New-Reno (RFC 2582) Further Improvements: Selective ACKs

▪TCP Reno was widely implemented, but it two problems: ▪ TCP New-Reno suffers from the performance limitation of TCP
▪ Too many lost segments in the same window of data substantially Reno when multiple segments are lost in the same window of data
increase the duration of the recovery phase
▪ What do we do when there is a complex pattern of loss?
• With N losses N RTTs are needed to recover all losses
▪ Multiple cwnd reductions can occur for the same loss episode
x x x x x x
▪TCP New-Reno partially fixed this problem in two ways:
▪ Introduced a timeout on the Fast Retransmit/ Fast Recovery Multiple losses detected in the same cwnd of segments
▪ No recovery unless all packet loss was counted.
▪ The TCP SACK option allows a receiver to specify the set of
segments that have been successfully received

▪ A sender can then retransmit only segments that have not been
acknowledged

Improvements to AIMD Conclusion


• Senders start with a conservative rate, and increase their rate
• A sending rate > Path capacity results in delay and loss!
▪ CC methods to increase cwnd will overshoot the bottleneck capacity.
• A sender detects loss and retransmits the lost segments
▪ Fills buffers before the smallest capacity (bottleneck) link on the path
▪ Continued overshoot will eventually result in loss • Tahoe was a simple method using a timer, but no longer used
▪ Triggering recovery and reduction of cwnd (a new ssthresh).
• Instead, Reno retransmits lost packets based on ACKs:
▪ Can we detect buffering before loss, and reduce this overshoot???
• FR/FR uses DupACKs to improve efficiency
Yes, newer methods can: • The TCP SACK Option helps further
• A sender detects congestion and adjusts its rate after congestion
1. Measure increases in end-to-end delay and react to this!
2. Ask routers to explicitly mark packets to tell receivers they have • Congestion Control uses a two-phase AIMD control function:
started to buffer packets. • Slow-Start and Congestion Avoidance
3. Ensure routers don’t buffer huge numbers of packets – using
active queue management to drop or mark early. • There are further improvements (Cubic, BBR, PRR, TLP, etc)

• Key take away: an adaptive control loop enables efficient transmission


SACK Details
▪ Implemented using two TCP options:
▪ The SACK-Permitted option sent at the start of a connection
▪ The SACK Blocks sent with ACKs when SACK is permitted

▪ The SACK-Permitted TCP option (2 bytes) is sent in a TCP SYN


▪ This indicates SACK can be used once a connection established

TCP Header
Spare Slides Source Port Address Destination Port Address
Header Sequence Number
Length Cumulative Acknowledgement Number
6 1 Window Size
Checksum Urgent Pointer
SYN
Kind = 4 Length = 2

SACK-Permitted

SACK Blocks SACK Blocks

The sender uses SACK options to indicate contiguous and isolated


Source Port Address Destination Port Address blocks of segments that were successfully received
Sequence Number
TCP packet Sender Receiver Receiver Buffer (held waiting
Cumulative Acknowledgement Number header SEQ 100, 200 B for other segments)
HLEN Window Size 100-300
ACK 300
Checksum Urgent Pointer

Kind = 1 Kind = 1 Kind = 5 Length ? SEQ 3


00, 20
0 byte
Left Edge of First Block SEQ
s 100-300 500-700
500, 2
00 by
tes
Right Edge of First Block SEQ
Set of SACK 700,
200 b
ytes
blocks 00
500-7
ACK
300,S
ACK
Left Edge of last Block 0-900 100-300 500-900
CK 50
00, SA
ACK 3
Right Edge of last Block
SACK Rules Sender/Receiver Algorithms: RFC 3517

▪ A SACK Block does not change the meaning of the ACK field ▪RFC 3517 specifies how to use SACK Blocks to improve
Fast Retransmit/Fast Recovery
▪ A SACK Block cannot be sent unless the SACK permitted option
was received
▪SACK sender use a scoreboard
▪ If SACKs are sent, they should be included in all packets when ▪ The scoreboard keeps note whether each outstanding segment was
out-of-order data has been buffered at the receiver received or not.
▪ The scoreboard is updated every time a SACK Block is received
▪ First segment in a SACK must acknowledge the most recently
received out-of-order segment cwnd
S S S S S

SACK Scoreboard

▪A segment in the scoreboard is considered lost if at


least three SACKs do not acknowledge it.
Lost segments Not enough SACKs
S S S S S

▪ If a segment in the scoreboard is not marked as lost, it is


considered still in flight and it is not retransmitted yet.
▪ In this example the 5 orange segments are considered still in flight

▪ TCP follows the same rules as New-Reno to update cwnd,


but segments in flight are indicated by the scoreboard

You might also like