
Understanding Throughput and TCP Windows

By Kary

Jul 30

Throughput is generally measured as the amount of data sent over time, e.g. bits or bytes per second. Sending more bits in a shorter amount of time equals higher throughput. So let’s talk about some of the factors that control how much data can be sent in a given time period.

Here’s some data represented as a stream:

[Image: data-stream]

The amount of data we can send is the minimum of:

  • the amount the receiver says it can receive and
  • the amount the sender thinks it can send

That makes sense, right?

Receive Window

The receiver advertises to the sender how much data it can receive and buffer. This is representative of the free buffer space for the socket (SO_RCVBUF).

So for the data stream, the amount of data that can be received from the receiver’s perspective is:
[Image: recv-win]

It’s pretty straightforward. The amount of free space in the receive buffer is advertised to the sender in every ACK packet as the Window Size.

[Image: wireshark_recvwin]
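
To see this buffer on your own machine, here’s a minimal Python sketch (the 256 kB request is just an illustrative number) that reads and adjusts SO_RCVBUF, the buffer that backs the advertised window:

```python
import socket

# Inspect and adjust the socket receive buffer (SO_RCVBUF), which backs
# the receive window the kernel advertises to the sender.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

print("default SO_RCVBUF:", sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))

# Ask for a larger buffer; the kernel may round or cap the request
# (Linux, for example, doubles it to account for bookkeeping overhead).
sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 256 * 1024)
print("effective SO_RCVBUF:", sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))
```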

Send Window

The amount of data the sender can send is more complicated. The upper bound is the receiver’s advertised window; the sender can’t send more than that, or data will be discarded. Here are some factors to consider:

  1. The amount of unacknowledged data already sent i.e. bytes in flight
  2. The congestion window (cwnd)
  3. The send buffer size (SO_SNDBUF)

Let’s examine each of these.

Bytes in Flight

Bytes in flight is the amount of data that has been sent but not yet acknowledged. If the receiver’s window is 64k and we’ve sent 48k that hasn’t yet been acknowledged, then we can only send 16k more before we fill the receive window. Once we receive an ACK with an updated Window Size, we can send more data.
[Image: unacked-win]
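
The arithmetic is simple enough to sketch in a couple of lines of Python (the numbers are the illustrative ones from above, not something TCP computes for you):

```python
def usable_window(rwnd: int, bytes_in_flight: int) -> int:
    """Data the sender may still put on the wire before filling the receive window."""
    return max(rwnd - bytes_in_flight, 0)

# 64k advertised window, 48k already sent but unacknowledged -> 16k left
print(usable_window(rwnd=64 * 1024, bytes_in_flight=48 * 1024))  # 16384
```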

Congestion Window

The congestion window (cwnd) is the sender’s flow control, based on network capacity and conditions. It is usually expressed in multiples of the maximum segment size (MSS), so an MSS of 1460 and a cwnd of 33 would be ~48k bytes. The cwnd at the beginning of a connection is usually 2, 3, or 10 segments, depending on the operating system and kernel version. The cwnd initially grows via TCP Slow Start; read more at PacketLife’s excellent post on Slow Start. Once the cwnd reaches the Slow Start threshold (ssthresh) or there is data loss due to congestion, cwnd growth switches to a congestion avoidance algorithm.

Eventually the congestion window will grow until it either hits the network’s limit due to congestion or reaches the receiver’s window limit. Even if the receiver’s window (rwnd) is 64k, the sender is bound by the cwnd; the current network conditions may not support having 64k of outstanding data buffered in the network. In this sense, the amount of data the sender can send is the minimum of the rwnd and the cwnd.
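
Here’s a toy Python sketch of that behavior. The MSS, initial cwnd, and ssthresh values are assumptions for illustration; real stacks use more sophisticated algorithms (CUBIC, Reno variants, etc.):

```python
MSS = 1460
ssthresh = 64 * 1024      # assumed Slow Start threshold
cwnd = 10 * MSS           # a common modern initial congestion window
rwnd = 64 * 1024          # receiver's advertised window

for rtt in range(1, 9):
    if cwnd < ssthresh:
        cwnd *= 2         # Slow Start: cwnd roughly doubles every round trip
    else:
        cwnd += MSS       # congestion avoidance: grow by ~1 MSS per round trip
    send_limit = min(rwnd, cwnd)   # what the sender can actually have outstanding
    print(f"RTT {rtt}: cwnd={cwnd}  effective send limit={send_limit}")
```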

[Image: cwnd]

Send Buffer

Send buffer size is the size of the socket send buffer (SO_SNDBUF). This is the buffer that the application writes data to for TCP to send. If the application doesn’t specify a size, a default size is used. The optimal send buffer size depends on the bandwidth delay product (BDP), i.e. how much data the network can buffer as a product of bandwidth and latency. Let’s see what MSDN has to say:

When sending data over a TCP connection using Windows sockets, it is important to keep a sufficient amount of data outstanding (sent but not acknowledged yet) in TCP in order to achieve the highest throughput. The ideal value for the amount of data outstanding to achieve the best throughput for the TCP connection is called the ideal send backlog (ISB) size. The ISB value is a function of the bandwidth-delay product of the TCP connection and the receiver’s advertised receive window (and partly the amount of congestion in the network).

Now here’s the important part:

Applications that perform one blocking or non-blocking send request at a time typically rely on internal send buffering by Winsock to achieve decent throughput. The send buffer limit for a given connection is controlled by the SO_SNDBUF socket option. For the blocking and non-blocking send method, the send buffer limit determines how much data is kept outstanding in TCP. If the ISB value for the connection is larger than the send buffer limit, then the throughput achieved on the connection will not be optimal.

For instance:

If the bandwidth is 20 Mbps and the round trip time (rtt) is 40 ms, the BDP is 20,000,000 / 8 * 0.04 = 100,000 bytes. So ~100 kB is the maximum amount of data that can be in transit in the network at one time.
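
As a quick sanity check of that arithmetic:

```python
bandwidth_bps = 20_000_000   # 20 Mbps
rtt = 0.040                  # 40 ms round trip time

bdp_bytes = bandwidth_bps / 8 * rtt
print(bdp_bytes)             # 100000.0 -> ~100 kB can be in flight on this path
```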

If the receive window is 64k and the cwnd opens up to 48k, but the send buffer is 32k, we’re not able to fill the available send window of 48k. In this case we’re limited by the send buffer size.
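
Putting the three limits together with the example numbers above (a rough sketch that ignores bytes already in flight):

```python
rwnd = 64 * 1024     # receiver's advertised window
cwnd = 48 * 1024     # congestion window the sender has opened up to
sndbuf = 32 * 1024   # SO_SNDBUF: the most data the application can have queued

effective_send_window = min(rwnd, cwnd, sndbuf)
print(effective_send_window)   # 32768 -> the send buffer is the bottleneck here
```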

[Image: send_buf]

Summary

Many factors control the sender’s throughput. The sender can’t send more data at one time than the advertised receive window. The sender can’t send more data at one time than the congestion window. The sender can’t send more data at one time than is available in the send buffer.

One thing we didn’t talk about in detail is latency. These factors’ impact is lessened by low latency and increased by higher latency. The round trip time can make or break the performance depending on these other factors.

The receive window is right there in the TCP header, but cwnd and send buffer size aren’t. So how do we know which factor might be limiting throughput? I’ll show you how in the link below.

Analysis

Ok! Let’s look at some real examples of throughput being limited by these factors.

How to Troubleshoot Throughput and TCP Windows

Share this post! Spread the packet gospel!


About the Author

I like being the hero. Being able to drop a bucket of root cause analysis on a burning network problem has made me a hero (to some people) and it feels real good, y’all. Get good at packet analysis and be the hero too. I also like french fries.


(18) comments

[…] sure you’ve read Understanding Throughput and TCP Windows before watching this video. I mean, you don’t HAVE to, but I recommend […]

Tom October 3, 2014

Just getting into your site, very interesting articles. I’m not going to pretend like I understand everything after the first read but it covers topics I have been very interested.

The idea of throughput has always been interesting to me from the practical side. I am interested in determining how to maximize usable throughput for simple things such as data transfer based on which application you use.

The twist is for me, is that I operate in a world where latency can be very high hundreds to thousands of ms and the networks are lossy. These massively variable factors result in systems where calculating “optimal” settings for all cases is pretty much impossible, but reading articles like this help me get a better understanding of what knobs there are available to turn to experiment with.

Thanks again for the article, look forward to reading/watching more of your content.

    Kary October 5, 2014

    Thanks for the feedback, Tom. I have some thoughts on tuning performance in those types of environments. I’ll reach out to you via email.

Mike January 6, 2015

Kary,

I’ve watched your videos and read your articles. Great job bud. You’ve introduced me to the tcptrace streamgraph that I can’t believe I wasn’t familiar with before reading your website. It’s definitely a great tool within Wireshark. With that said, I’ve seen some patterns lately that have me a bit baffled and was wondering what your opinion is. Here is a picture: http://i.imgur.com/sQ4zVRA.png

This session seems to be hitting the TCP window size ceiling, but if I’m reading the graph correctly, doesn’t this indicate throughput is ABOVE the TCP window size?

Again, great site, and I hope you keep contributing the high quality (and humorous) content.

    Mike January 6, 2015

    I believe I answered my own question. I was looking at this stream graph from the sender side. Duh!

witoon September 15, 2015

Hi, thank you for a article. I have something that I don’t understand about Bytes in Flight.

Why bytes in flight limit the number of bytes that can be transferred at “one time and wait for ACK”, I mean why TCP doesn’t send 64KB at one time and wait for ACK, why it sent only 48KB and later sent 16KB before wait for ACK?

    flz March 29, 2016

    Not all bytes are generated at once

fred October 18, 2015

none


[…] Page referenced: Understanding Throughput and TCP Windows […]

flz March 29, 2016

“If the bandwidth is 20Mbps and the round trip time (rtt) is 40ms, the BDP is 20000000/8 * .04 = 100k. So 100kB is the maximum amount of data that can be in transit in the network at one time.”

It seems little strange that the more latency have in the network the more byte can be send at one time? For example, the RTT is 1s, then the maximum data can be send is 20000000/8 * 1 = 2500k ?

    Kary March 29, 2016

    Think of latency as the distance between two hosts. Or think of it as the length of the pipe with a ball as the packet. The longer the pipe, the more balls I can fit into it before they fall out the other end. BDP is how many balls I can fit into the pipe to fill it up.


[…] it took to receive the whole response. The packets of data will have been sent in batches, known as congestion windows, one window at a time, one after the other, with a slight pause at the end of each window for the […]

Rahul January 20, 2017

Well written! Delightful! I’m gonna rest more of your site as a revision.

Cheers


[…] The actual bandwidth is determined by the TCP window size and the connection latency. See here for more information on TCP window and […]

Matt Rome December 16, 2021

Thank you sir!


[…] is calculated using a number of factors such as network protocol, host speed, network path, and TCP buffer space, all of which are affected by the bandwidth available. With Netest, users can use a tool that can […]

JOHN MCKEAGUE January 10, 2023

I love the analogy. But the BDP is how many balls I can fit into the pipe times 2. The sender is waiting for an ACKs. The ACK for the first ball takes time to get back to the sender so he needs to keep sending and will be able to send the same number of balls again before the first ACK arrives.
