How to Troubleshoot Throughput and TCP Windows

(14) comments

Michael July 30, 2014

So, I’m looking through the capture file, and I’m seeing a two-byte packet go out after it receives ACKs for the first 64k. Most of the time it’s including those two bytes with the final 128k ACK packet. In the cases where it’s only sending 64k at a time, those two bytes have not been included in the final ACK. As for why the server isn’t ACKing those two-byte packets in a timely manner, I’m really not sure.

Kary July 30, 2014

Nice, Michael. That is indeed the pattern that triggers it. Clearly having that 2-byte segment unACKed causes the sender to stop after 64k+2 bytes and wait for a single ACK. Then it sends the second 64k chunk. It’s not clear to me why it pauses here when, at other times, it can send more than 64k bytes and puts 128k on the wire. But something about that 2-byte segment being unACKed causes it to pause. Good eye!

Understanding Throughput and TCP Windows July 30, 2014

[…] How to Troubleshoot Throughput and TCP Windows […]

chrismarget July 31, 2014

What’s the sending stack?

The PSH bit being set on the end of that 2-byte segment suggests to me that it falls at the end of a socket send() operation.

My suspicion is that there’s a zero-copy mechanism working inside the sending stack: On paper, ACKed data should be purged from the sender’s send buffer segment-by-segment as ACKs roll in, guaranteeing that new data from the sending application is always available for the stack to send.

Zero-copy mechanisms don’t copy data into the socket buffer. Instead, they say “hey, stack! here’s a pointer to a block of data I’d like you to send!” These mechanisms are nice in that they avoid the copy operation (speedy!), but they quickly expose tuning problems because the stack winds up working with very large chunks of data as atomic units. The stack can’t clear individual segments from its “buffer”, because it’s trading pointers to very large chunks of memory. The block of memory and the pointer are both busy until the last ACK for that chunk is received.

If there’s enough memory, and enough pointers, then we’ve effectively got a windowing mechanism (inside the stack) exactly like TCP’s byte-based windowing. You’ll never know that pointers to these large data chunks are cycling around inside the sender.

If these resources inside the sender are scarce, then you run into ugly business like this.

Sometimes the application is complicit in this scheme (google: “io completion ports” for an MS Win example), and sometimes not.
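To make the chunk-pointer idea above concrete, here is a toy model in Python. It is only a sketch of the hypothesis described in this comment, not how the Windows stack is known to work: the class name `ChunkedSendQueue`, the `max_chunks` limit, and the slot accounting are all invented for illustration.

```python
from collections import deque


class ChunkedSendQueue:
    """Toy model (hypothetical): each send() hands the stack one pointer to a
    whole chunk, and the slot stays busy until the LAST byte of it is ACKed."""

    def __init__(self, max_chunks):
        self.max_chunks = max_chunks   # assumed limit on outstanding pointers
        self.chunks = deque()          # each entry: [size, acked_bytes]

    def submit(self, size):
        """Try to queue a chunk; fails when every slot is still busy."""
        if len(self.chunks) >= self.max_chunks:
            return False
        self.chunks.append([size, 0])
        return True

    def ack(self, nbytes):
        """Apply ACK progress to the oldest chunks first; a chunk is released
        only once it has been ACKed in full."""
        while nbytes > 0 and self.chunks:
            chunk = self.chunks[0]
            take = min(nbytes, chunk[0] - chunk[1])
            chunk[1] += take
            nbytes -= take
            if chunk[1] == chunk[0]:
                self.chunks.popleft()


queue = ChunkedSendQueue(max_chunks=1)
queue.submit(128 * 1024)          # the application's 128k write takes the slot
queue.ack(64 * 1024)              # half of the chunk is ACKed
print(queue.submit(128 * 1024))   # False: the partly ACKed chunk still holds the slot
queue.ack(64 * 1024)              # the last ACK for the chunk arrives
print(queue.submit(128 * 1024))   # True: the slot is free again
```

With a byte-based buffer, the first 64k of ACKs would already have freed room for new data. In this model nothing is released until the whole chunk is ACKed, which is the kind of stall that scarce slots would produce.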

Kary July 31, 2014

Very good info, thanks, Chris. The sending stack is Win7. Don’t know the exact version; hafta check. iperf is calling write() with 128k of data to a 64k send buffer. In my research, I’ve seen that the kernel will fudge a bit on the actual buffer size so I assumed this was why it would put more on the wire than the send buffer size. But perhaps the zero-copy behavior is why it will put 128k on the wire (except when the 2 byte segment is unACKed) when there’s only a 64k buffer. I’ll research io completion ports. Thanks for the tip!
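For anyone who wants to poke at the "128k write into a 64k send buffer" situation, the following Python sketch sets it up over loopback. It is not the iperf code and does not reproduce the capture: the function name `try_large_write` and the loopback setup are assumptions made for illustration, and the values it returns depend on the operating system.

```python
import socket

SEND_BUFFER = 64 * 1024    # send buffer size requested, as in the discussion
WRITE_SIZE = 128 * 1024    # size of the single application write


def try_large_write():
    """Request a 64k send buffer, then attempt one 128k non-blocking send.

    Returns (granted, accepted): the buffer size the OS reports back and the
    number of bytes the stack accepted from the one send() call.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    receiver = None
    try:
        listener.bind(("127.0.0.1", 0))   # any free loopback port
        listener.listen(1)
        sender.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, SEND_BUFFER)
        sender.connect(listener.getsockname())
        receiver, _ = listener.accept()

        # The OS may report a different size than the one requested.
        granted = sender.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)

        sender.setblocking(False)
        try:
            accepted = sender.send(b"x" * WRITE_SIZE)
        except BlockingIOError:
            accepted = 0   # the stack took nothing right now
        return granted, accepted
    finally:
        for sock in (sender, receiver, listener):
            if sock is not None:
                sock.close()


if __name__ == "__main__":
    granted, accepted = try_large_write()
    print("SO_SNDBUF reported:", granted)
    print("bytes accepted from one 128k send():", accepted)
```

Comparing the reported buffer size with the requested 64k shows whether the OS adjusted it. How many bytes one send() accepts is a separate question from how many bytes end up in flight on the wire, which still has to be read from a capture.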

David Zhang December 31, 2014

Hi Kary,
Great post!!! Very helpful. Could you please share the second packet capture which you refer to in the video?

Best Regards,

Lleyton October 23, 2015

Dear Kary,

Thanks. This is very helpful. Could you pls share the 2nd pcap? The Rx one.

Kary October 25, 2015

Sure, added to the post

Brad May 24, 2016

Michael/Kary,

For the purpose of learning, could you provide a step by step explanation on how you came to that conclusion?

TCP Performance Options - Chris Sereno October 30, 2017

[…] http://packetbomb.com/how-to-troubleshoot-throughput-and-tcp-windows/ […]

alan August 27, 2018

interesting… can you share what version of iperf & the params it was executed with? I guess what’s concerning to me (in terms of throughput performance) is that although the receive window hovers around ~1MB throughout the session, the client doesn’t come close to reaching that. maybe i’m missing something, but the two-byte behavior doesn’t seem overly “painful”, even for an endpoint roughly 98ms apart.

Kary August 27, 2018

I’m afraid I don’t recall but I might’ve set the receive window to 1M and the sender side window to less than that to illustrate how the send buffer size on the sender can affect throughput.
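A rough way to see why the smaller sender-side buffer matters is the usual bound of one window of data per round trip. The Python sketch below uses numbers mentioned in this thread, with two assumptions: that the 98 ms figure is the round-trip time, and that the sender was limited to about 64k. The function names are made up for illustration.

```python
def effective_window(send_buffer_bytes, receive_window_bytes):
    """The sender cannot keep more in flight than the smaller of the two."""
    return min(send_buffer_bytes, receive_window_bytes)


def max_throughput_bps(window_bytes, rtt_seconds):
    """Upper bound in bits per second: at most one window per round trip."""
    return window_bytes * 8 / rtt_seconds


RTT = 0.098                  # assumption: 98 ms is the round-trip time
RECEIVE_WINDOW = 1024 * 1024 # the ~1MB receive window mentioned above
SEND_BUFFER = 64 * 1024      # assumption: 64k sender-side limit

window = effective_window(SEND_BUFFER, RECEIVE_WINDOW)
print(f"limited by sender:   {max_throughput_bps(window, RTT) / 1e6:.2f} Mbit/s")
print(f"receive window only: {max_throughput_bps(RECEIVE_WINDOW, RTT) / 1e6:.2f} Mbit/s")
```

Under these assumptions, the 64k limit caps the transfer at about 5.35 Mbit/s, while a full 1M window would allow about 85.6 Mbit/s. That gap would explain why the client never gets near the advertised receive window.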

Jack Wei September 17, 2018

Hi, Kary. I really like your video on YouTube. How can I contact you more directly? I visited your Facebook site, but it seems that it’s been a long time since you posted anything.

Kary September 22, 2018

My email is at the bottom of every page
