02-FEB-2011: Why is there packet loss ?

Is the Internets dying ?

Bob,

Here's a little justification, explanation, and plan of action for our experiment with Quality of Service.

Late last week, when the SAN team began to use previously unused bandwidth (without exceeding our link capacity), we experienced packet loss between the datacenters. Packet loss is caused by one of two things:

A device or transit element (e.g., a cable or repeater) malfunctioning; or
A queue being full (or nearly full):
1. An interface's outbound queue;
2. A device's global queue; or
3. Random Early Detection (RED) dropping packets early because a queue (one of the above) is nearing fullness.
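The queue-related causes above can be sketched in a few lines of Python. This is a simplified illustration of classic RED, not the behavior of any particular device on our network: the function names, the thresholds (`min_th`, `max_th`), and the maximum drop probability (`max_p`) are all assumptions chosen for the example. A real implementation also smooths the queue length with a moving average, which is omitted here.

```python
import random


def red_drop_probability(avg_queue, min_th, max_th, max_p):
    """Simplified classic RED: drop probability for an average queue length.

    Below min_th nothing is dropped; at or above max_th everything is
    dropped; in between the probability rises linearly up to max_p.
    """
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    return max_p * (avg_queue - min_th) / (max_th - min_th)


def enqueue_decision(queue_len, capacity, avg_queue,
                     min_th=20, max_th=60, max_p=0.1, rng=random.random):
    """Decide what happens to an arriving packet (hypothetical helper).

    All default values are illustrative assumptions, not vendor defaults.
    """
    if queue_len >= capacity:
        return "tail-drop"   # the queue is completely full
    if rng() < red_drop_probability(avg_queue, min_th, max_th, max_p):
        return "red-drop"    # RED drops early as the queue nears fullness
    return "enqueue"


if __name__ == "__main__":
    for avg in (10, 30, 50, 70):
        print(avg, red_drop_probability(avg, 20, 60, 0.1))
```

The point of the sketch is that loss can begin well before a queue is actually full: once the average queue length passes the lower threshold, a growing fraction of packets is dropped even though buffer space remains.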

Given the general reliability of modern network devices, and the fact that the packet loss stopped once we reduced the amount of traffic we were transmitting across the network, I think we can eliminate device malfunction as a cause that we should attempt to address.

This leaves us with a queue being full or nearing fullness. This queue may be on a device inside our network (e.g., the firewall) or within the WAN (e.g., a router or switch on our provider's network).

First, let me expound upon:

Why packet loss slows down a TCP stream;
Why queues being full (or nearly full, causing higher RED drop probabilities) and thereby causing packet loss is our problem; and
Why queues being nearly full, leading to increased latency and in turn to slower TCP connections, is not our problem.

Transmission Control Protocol (TCP) is a network protocol that (among other characteristics) guarantees delivery and ordering of packets within a socket stream. In TCP, packets that are transmitted by one side are acknowledged by the other side. This acknowledgment is done by transmitting a packet to the sender indicating which packets were received in a given "acknowledgment window" (range of bytes). If the receiver does not receive enough packets to construct the entire range of bytes that the acknowledgment window covers, it will not send this acknowledgment.

If the acknowledgment is not received by the sender in a defined amount of time (various algorithms exist, but we will just assume 2 * AverageRoundTripTime), either because the acknowledgment packet itself was lost or because packet loss left a packet missing from the receiver's acknowledgment window so that no acknowledgment was sent, the sender will resend all of the packets in the acknowledgment window that was not acknowledged.

Because TCP guarantees ordering of packets, even over media that do not guarantee this (e.g., Ethernet), the receiving side of a TCP socket must have a buffer available to re-order incoming packets. This buffer is naturally of finite size. Thus, if packets have been lost, we must wait for them to be retransmitted before we can emit any later packets we may have to the socket and evict them from the buffer. The size of the receiver's buffer therefore defines how many packets can be outstanding (unacknowledged) at a given time, which is the TCP window size.

Once the sender has sent enough packets to fill the TCP window without a contiguous range, starting from the last cleared acknowledgment window, being acknowledged, it will stop transmitting until further data is acknowledged and buffer space becomes available. If we presume that AverageRoundTripTime is 100 ms, the sender will wait 200 ms for an acknowledgment; in the presence of packet loss it will then retransmit