Managing Bufferbloat
All Puffed Up
Influence of Bufferbloat on TCP Operation
The vast majority of network traffic relies on TCP as its transport protocol, so to understand bufferbloat fully, I'll look at the details of TCP. A TCP connection is established with a handshake procedure (the three-way handshake), during which the TCP entities involved in the connection (sender and receiver) negotiate the individual transmission parameters, including, for example, the initial sequence numbers. If an FTP server needs to transfer a large file, TCP typically begins by transmitting four segments. The sender then waits for confirmation of receipt of these packets, usually in the form of an acknowledgement (ACK). Once the four segments have been acknowledged, the sender increases the send rate by transmitting eight segments and again waiting for them to be confirmed. If this succeeds, the send window is set to 16 segments. Afterward, the transmission window can grow even further following the same principle.
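To make the handshake concrete, the following minimal Python sketch plays through the sequence-number exchange; the randomly chosen initial sequence numbers (ISNs) stand in for what a real TCP stack would generate, and the function is purely illustrative:

```python
import random

def three_way_handshake():
    # Step 1: the client sends a SYN with a randomly chosen ISN
    client_isn = random.randrange(2**32)
    print(f"client -> server: SYN, seq={client_isn}")

    # Step 2: the server answers with SYN-ACK, acknowledging
    # client_isn + 1 and announcing its own ISN
    server_isn = random.randrange(2**32)
    print(f"server -> client: SYN-ACK, seq={server_isn}, ack={client_isn + 1}")

    # Step 3: the client acknowledges the server's ISN;
    # the connection is now established
    print(f"client -> server: ACK, seq={client_isn + 1}, ack={server_isn + 1}")

three_way_handshake()
```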
The first connection phase is known as TCP slow start. Packets are deliberately sent more slowly at the beginning of a session to avoid an overload situation. Also at the start of the connection, the sender and receiver agree on a suitable window size based on the receiver's buffer size, which should prevent a buffer overflow on the receiving side. Nevertheless, congestion inside the network can occur at any time. For this reason, the sender starts with a small window of one maximum segment size (MSS). Each time the receiver acknowledges receipt, the sender doubles the window, resulting in exponential growth that continues either until the receiver's maximum window size is reached or until a timeout occurs (i.e., the receiver fails to acknowledge receipt). In the latter case, the window size drops back to 1 MSS and the whole game of doubling the window starts over.
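A small Python sketch illustrates this doubling behavior; the segment size and the receiver window are assumed example values, not taken from any real stack:

```python
MSS = 1460               # assumed maximum segment size in bytes
RWND = 64 * MSS          # assumed receiver window (64 segments)

cwnd = MSS               # slow start begins with one segment
rnd = 0
while cwnd < RWND:
    rnd += 1
    print(f"round {rnd}: send window = {cwnd // MSS} segment(s)")
    cwnd *= 2            # every fully acknowledged round doubles the window
    # a timeout at this point would reset cwnd to 1 MSS and start over
print(f"round {rnd + 1}: receiver window of {RWND // MSS} segments reached")
```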
Overload control (congestion avoidance) uses another value, the slow-start threshold (ssthresh). Once the window reaches this threshold, it only grows by 1 MSS per round trip: instead of exponential growth, growth becomes linear. It, too, continues only until the receiver's maximum window size is reached or a timeout occurs. If the sender detects that packet losses have occurred on the route, it halves its transmission rate and initiates a new slow start. This process dynamically adapts the TCP rate to the capacity of the connection.
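The interplay of slow start, linear growth above ssthresh, and the reaction to packet loss can be traced with a short simulation; the loss rounds and the initial threshold here are assumptions chosen purely for illustration:

```python
def tcp_cwnd_trace(rounds=20, loss_rounds=(9, 15), ssthresh=16):
    """Illustrative trace of the congestion window (cwnd) in MSS units."""
    cwnd = 1
    for r in range(1, rounds + 1):
        phase = "slow start" if cwnd < ssthresh else "congestion avoidance"
        print(f"round {r:2d}: cwnd = {cwnd:3d} MSS  ({phase})")
        if r in loss_rounds:
            ssthresh = max(cwnd // 2, 2)       # halve the rate on loss ...
            cwnd = 1                           # ... and fall back to slow start
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)     # exponential growth up to ssthresh
        else:
            cwnd += 1                          # linear growth above ssthresh

tcp_cwnd_trace()
```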
How Bufferbloat Disrupts Traffic
To illustrate the bufferbloat mechanism, I will use a fast link feeding a slower one. Suppose the FTP server sits on a 1Gbps link, whereas my access connection (CATV or DSL) provides 10Mbps in the download direction and 2Mbps in the upload direction. A classic FTP server fills the buffer over the fast download path faster than the slower upload direction can confirm the received packets. Ultimately, the acknowledgements (ACKs) returning over the slower direction determine the total throughput of the connection. However, if the buffers are too large, several things can happen:
- When the buffer fills up, the packet arriving last can be dropped, which is known as a "tail drop." The sender does not learn of the loss until acknowledgements triggered by the next valid packets (which must arrive after the dropped one) reach it. With large buffers, this can take a considerable amount of time: Experiments showed that nearly 200 segments were received before the transmitting station retransmitted the lost segment.
- If several traffic flows are handled over one connection, a kind of permanent (standing) queue develops, so a fixed number of packets is always waiting in the queue. If there are not enough packets to fill the buffer provided for this purpose, no packets are dropped and TCP congestion control never kicks in. However, the delay for all users of the buffer increases (the sketch after this list shows how quickly it grows).
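A quick back-of-the-envelope calculation in Python shows how a standing queue translates into delay; the packet size and the 2Mbps upstream rate are the assumed values from the example above:

```python
MTU_BITS = 1500 * 8        # assumed packet size of 1,500 bytes
LINK_BPS = 2_000_000       # assumed 2Mbps upstream link

# queuing delay a newly arriving packet sees at various buffer fill levels
for packets_queued in (16, 64, 256, 1024):
    delay_s = packets_queued * MTU_BITS / LINK_BPS
    print(f"{packets_queued:5d} packets queued -> {delay_s * 1000:8.1f} ms delay")
```

Even a few hundred queued packets already push the delay into the multiple-second range on such a link.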
Buffer Management
To prioritize specific traffic, the differentiated services (DiffServ) bits of the IP layer can be used to implement preferential forwarding for specific traffic types (e.g., network control or VoIP). Ultimately, DiffServ classifies the respective traffic, but it does not eliminate the bufferbloat problem, because some of the queues responsible for transferring non-prioritized traffic can still be too large and therefore contain many large TCP segments. Consequently, the effect on the TCP congestion mechanism persists.
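For an application, setting the DiffServ code point (DSCP) boils down to a socket option on Linux; this minimal sketch marks outgoing UDP packets with Expedited Forwarding (EF), with the address and port as placeholders:

```python
import socket

# DSCP Expedited Forwarding (EF, value 46) occupies the upper six bits
# of the former TOS byte, hence the shift by two
DSCP_EF = 46 << 2   # 0xB8

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, DSCP_EF)

# packets sent through this socket now carry the EF marking, which
# DiffServ-aware routers can map to a priority queue
sock.sendto(b"voice payload", ("192.0.2.10", 5004))
```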
Earlier active queue management efforts used random early discard (RED, also known as random early detection or random early drop) and weighted RED (WRED), which discard certain packets when the buffer reaches a critical level (but is not yet full). In practice, however, these techniques had weaknesses, and RED proved difficult to configure. As a result, use of the RED and WRED mechanisms was discontinued, and automatic configuration methods were sought.
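The core of RED is a drop probability that rises linearly between two queue thresholds; this sketch uses assumed threshold values, which also hints at why tuning RED was so troublesome:

```python
def red_drop_probability(avg_queue, min_th=5, max_th=15, max_p=0.1):
    """Classic RED drop probability as a function of the average queue
    length; the thresholds and max_p are assumed example values that
    would need per-link tuning in practice."""
    if avg_queue < min_th:
        return 0.0                       # below min threshold: never drop
    if avg_queue >= max_th:
        return 1.0                       # above max threshold: always drop
    # linear ramp between the two thresholds
    return max_p * (avg_queue - min_th) / (max_th - min_th)

for q in (3, 7, 11, 14, 20):
    print(f"avg queue {q:2d} packets -> drop probability {red_drop_probability(q):.3f}")
```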
The controlled delay (CoDel) method controls the time a packet spends in the queue (its sojourn time). Two parameters are used for this: an interval and a target delay. If the sojourn time of the queued packets exceeds the target for longer than the interval, packets are dropped from the head of the queue. This deletion technique depends neither on the size of the queue nor on the tail drop mechanism. Tests have shown better delay behavior and far better throughput than the RED method, especially for wireless connections. Furthermore, CoDel can easily be implemented in hardware.
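The following sketch illustrates CoDel's control law in heavily simplified form: once the minimum sojourn time has stayed above the target for a full interval, drops begin, and the spacing between drops shrinks with the square root of the drop count. The default values for target and interval come from the CoDel specification; everything else is illustrative:

```python
from math import sqrt

TARGET = 0.005    # 5 ms: default target for the packet sojourn time
INTERVAL = 0.100  # 100 ms: default interval

# Once the minimum sojourn time has exceeded TARGET for one full
# INTERVAL, CoDel drops a packet from the head of the queue; each
# subsequent drop follows after INTERVAL / sqrt(count) as long as
# the delay stays above TARGET.
t = INTERVAL
for count in range(1, 6):
    print(f"drop {count} at t = {t * 1000:6.1f} ms after crossing the target")
    t += INTERVAL / sqrt(count + 1)
```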
The flow (or fair) queuing controlled delay (FQ-CoDel) method (RFC 8290 [1]) divides the queue into 1,024 subqueues by default. Each new data flow is then assigned to one of these queues by a hash over its flow identifiers. Within each subqueue, CoDel is used to eliminate delay problems in case of TCP overload.
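Conceptually, the flow-to-queue mapping is just a hash over the flow's 5-tuple, taken modulo the number of queues; the CRC32 hash here is purely illustrative (the real implementation uses a different, randomly perturbed hash):

```python
import zlib

NUM_QUEUES = 1024   # FQ-CoDel's default number of subqueues

def flow_to_queue(src_ip, src_port, dst_ip, dst_port, proto="tcp"):
    """Map a flow's 5-tuple to one of the subqueues via a hash."""
    five_tuple = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}/{proto}".encode()
    return zlib.crc32(five_tuple) % NUM_QUEUES

print(flow_to_queue("192.0.2.1", 51334, "198.51.100.7", 443))
print(flow_to_queue("192.0.2.1", 51335, "198.51.100.7", 443))  # new flow, usually a different queue
```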
The various queues are serviced by a deficit round robin (DRR) mechanism. First, this procedure ensures that TCP congestion control works properly. Moreover, because the packets from the different queues are interleaved, small packets (e.g., DNS responses and TCP ACKs) are no longer trapped behind large queues, so the processing of large and small packets is fairer.
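A minimal DRR sketch shows why the small packets get through quickly: each queue earns a quantum of byte credit per round, so a queue of short DNS responses empties within a single round even while a bulk queue is still sending. The quantum and packet sizes are assumed values:

```python
from collections import deque

def deficit_round_robin(queues, quantum=1500):
    """Minimal DRR sketch: each queue accumulates 'quantum' bytes of
    credit per round and may send packets as long as credit remains.
    Packets are represented by their sizes in bytes."""
    deficits = [0] * len(queues)
    sent = []
    while any(queues):
        for i, q in enumerate(queues):
            if not q:
                deficits[i] = 0          # empty queues keep no credit
                continue
            deficits[i] += quantum       # grant this round's credit
            while q and q[0] <= deficits[i]:
                pkt = q.popleft()        # send while credit suffices
                deficits[i] -= pkt
                sent.append((i, pkt))
    return sent

# one queue with large segments, one with small DNS/ACK-sized packets
bulk = deque([1500] * 4)
small = deque([80, 60, 90])
print(deficit_round_robin([bulk, small]))
```

In the output, all three small packets leave in the first round, interleaved with the bulk transfer rather than waiting behind it.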