Finagling with Nagle: Nagle's algorithm and latency-sensitive applications
NOTE: This post discusses a specific problem which is solved by disabling Nagle's algorithm. Do not try this at home: Nagle's algorithm is implemented in TCP/IP stacks for a reason, and I'm told that 99% of programmers who disable Nagle's algorithm do so errantly due to a lack of understanding of TCP. Disabling Nagle's algorithm is not a silver bullet that magically reduces latency.
While developing an input-only VNC client for Android to act as a remote mouse and keyboard, I noticed that mouse pointer movements were particularly jerky when connecting to a computer running TightVNC. Connections to other VNC servers yielded relatively smooth pointer movements, and using the official TightVNC client to connect to the TightVNC server was also smooth. My VNC client was not alone with this problem—connecting to the TightVNC server using other VNC clients such as Chicken of the VNC also resulted in severe pointer jerkiness. What was special about the VNC connection between the TightVNC client and the TightVNC server that made it work so much more smoothly?
To investigate, I used Wireshark to analyze the protocol communication between the client and server to see what was different. To my surprise, the RFB (VNC protocol) messages sent by the TightVNC client were basically the same as the messages sent by my client. However, there was one critical difference that existed deeper in the stack. In the TCP stream from the TightVNC client, each IP packet only contained a single RFB message. In the TCP stream from my client, several RFB messages would sometimes be batched together in a single IP packet. I immediately understood the problem.
The TCP/IP implementations of our operating systems contain a piece of code known as Nagle's algorithm, which improves network efficiency by holding back small writes and batching them into a single packet while previously sent data is still waiting for an ACK. This algorithm works wonders for most applications, but very occasionally there is a need to disable it by setting the TCP_NODELAY socket option. An examination of the TightVNC client and server source code revealed that these programs do indeed disable Nagle's algorithm on each end of the TCP connection. I made a one-line change to my VNC client (socket.setTcpNoDelay(true);) and the problem went away. Pointer movements became smooth as silk. The reason for the improvement is twofold:
- The TightVNC server does not animate intermediate pointer events when they are received all at once. In my brief perusal, I couldn't find a specific location in the source code where this was happening, so it may be an efficiency hack in the operating system. (I don't know exactly how TightVNC sends its synthetic events to Windows, but maybe it needs to do some sort of flush after each event.) When the messages are sent in individual IP packets, each pointer event is processed separately and the pointer appears to move smoothly across the screen.
- My use of VNC as an input-only remote control system is very latency-sensitive. The user's eyeballs are watching the remote host's screen while their thumb is moving across their phone, generating pointer events. There is an inevitable delay between the time the user's thumb moves and the time their eyeballs see the pointer move. The shorter the delay, the more natural their phone feels as an input device. The longer the delay, the more painful the experience. In this specific scenario, I judged that snappiness was more important than overall bandwidth, so I disabled Nagle's algorithm. In addition to fixing the jerkiness problem with TightVNC, to my delight this also improved smoothness to some degree with all other VNC server implementations!
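For reference, here is a minimal Java sketch of the one-line change described above, placed in context. The class name LowLatencySocket and the connect helper are hypothetical names chosen for illustration; only Socket.setTcpNoDelay comes from the original client. It assumes a plain java.net.Socket connection, as on Android.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper for illustration; not taken from the actual VNC client.
final class LowLatencySocket {

    /** Opens a TCP connection with Nagle's algorithm disabled (TCP_NODELAY set). */
    static Socket connect(String host, int port, int timeoutMillis) throws IOException {
        Socket socket = new Socket();
        try {
            // Disable Nagle's algorithm so each small write is sent immediately
            // instead of waiting for outstanding data to be acknowledged.
            socket.setTcpNoDelay(true);
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return socket;
        } catch (IOException e) {
            socket.close(); // do not leak the descriptor if the connection fails
            throw e;
        }
    }
}
```

A caller would then use LowLatencySocket.connect(host, 5900, 3000) in place of new Socket(host, 5900), where 5900 is the conventional VNC port and the timeout is an arbitrary example value.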
This was one of the rare cases where it was appropriate to disable Nagle's algorithm. In fact, this is the only time in my 25+ years of programming that I have ever found a need to disable it. My application is only likely to be used on a local network; otherwise, disabling it could cause more harm than good. If you set TCP_NODELAY on your sockets, be warned that the responsibility for being efficient with packets now falls into your hands: the operating system is not going to help out. In my case, I found several places in my code where I needed to combine multi-message transmissions into a single send() in order to recover performance lost by disabling Nagle's algorithm.
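As a rough illustration of combining several messages into one send, the Java sketch below queues RFB input messages in memory and hands them to the socket in a single write. The class RfbInputBatch and its methods are hypothetical and are not the code from my client; the byte layouts follow the PointerEvent and KeyEvent messages of the RFB protocol. Note that a single write only asks the stack to send the bytes together; TCP may still split a large buffer across packets.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical helper for illustration: collects RFB input messages and
// sends them with one write() so they are not scattered across tiny packets.
final class RfbInputBatch {
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    // DataOutputStream writes big-endian, which matches RFB's byte order.
    private final DataOutputStream out = new DataOutputStream(buffer);

    /** Queues a PointerEvent: type 5, button mask, then x and y as 16-bit values. */
    RfbInputBatch pointer(int buttonMask, int x, int y) throws IOException {
        out.writeByte(5);
        out.writeByte(buttonMask);
        out.writeShort(x);
        out.writeShort(y);
        return this;
    }

    /** Queues a KeyEvent: type 4, down flag, two padding bytes, 32-bit keysym. */
    RfbInputBatch key(boolean down, int keysym) throws IOException {
        out.writeByte(4);
        out.writeByte(down ? 1 : 0);
        out.writeShort(0); // padding
        out.writeInt(keysym);
        return this;
    }

    /** Returns a copy of the bytes queued so far. */
    byte[] toByteArray() {
        return buffer.toByteArray();
    }

    /** Writes all queued messages with a single write() call, then clears the batch. */
    void sendTo(OutputStream socketOut) throws IOException {
        socketOut.write(buffer.toByteArray());
        socketOut.flush();
        buffer.reset();
    }
}
```

A mouse click, for example, could be sent as new RfbInputBatch().pointer(1, x, y).pointer(0, x, y).sendTo(socket.getOutputStream()), so the press and the release leave the application in one write rather than two.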