Caffeinated Bitstream

Bits, bytes, and words.

Lua and Squirrel overhead February 4, 2012

I've been researching the idea of using embedded languages in mobile applications as a way of reusing business logic across platforms. I haven't found a lot of information about how much an embedded language will bloat an app's size, so I decided to see for myself. So far, I've written simple "Hello, world" apps for both Lua and Squirrel. Lua is a simple language that has been heavily used in video games for years. Squirrel is a newer language that was inspired by Lua, but uses a more C-like syntax.

These tests are not very scientific. They only demonstrate the bare minimum: including the language support as a native shared library, plus some JNI code that runs a script to generate a "Hello, world" message and returns it to the activity.

Lua and Squirrel app delivery overhead (.apk size differences)
language | start size (bytes) | final size (bytes) | overhead (bytes)
Lua | 12817 (13K) | 60089 (59K) | 47272 (46K)
Squirrel | 13530 (13K) | 118520 (116K) | 104990 (103K)
Squirrel (sans compiler) | 13530 (13K) | 99598 (97K) | 86068 (84K)

I'm frankly blown away by the compact size of these language implementations, especially after getting the impression that including JavaScript (via Rhino) would cost many hundreds of kilobytes. That wouldn't be a problem for many apps, but for certain small apps, Rhino could end up being much larger than the app itself. In the case of Lua, which is implemented in a mere 20 C source files, you not only get the Lua virtual machine in 46K, but the compiler to boot! Developers can and do use Lua on other platforms such as iOS, and the Lua code has even been compiled to JavaScript via emscripten (an LLVM JavaScript backend), which adds the potential of reusing code in HTML5 apps.

I haven't played around with writing code in these languages, though, so I'm curious to hear about people's real-world experiences.

Using a Mac keyboard in Ubuntu 11.10 with Mac-like shortcuts October 16, 2011

I'm trying out Ubuntu 11.10 (Oneiric Ocelot) on a PC with a Mac keyboard attached. I made a few hacks to make the keyboard work smoothly and in a (very roughly) Mac-like fashion. I figured I'd make a few notes here for my own future reference. (Note: I'm using a U.S. keyboard. If you are using a different kind of keyboard, your mileage may vary.)


The goals

  1. Make the function keys (F1..F12) work as function keys without needing to hold down the Fn key.
  2. Use Mac-like keyboard shortcuts for window navigation (Cmd-Tab, Cmd-`) and the terminal (Cmd-C for copy, Cmd-V for paste).
  3. Avoid stepping on Unity's use of the Super key (i.e. the command key on Macs and the Windows key on PC keyboards).
  4. Use the legacy Caps Lock key for something useful.

The plan

  1. Change a driver parameter to enable use of the function keys without holding down the Fn key.
  2. By default, the keyboard's left and right command keys are mapped to Super_L and Super_R. Map these instead to the seldom-used Hyper_L and Hyper_R keysyms. (If you try to use the Super keys for shortcuts, the Unity dock will appear every time you hold down the command key. It's really annoying.)
  3. Map the Caps Lock key to Super_L so it can be used for certain Unity shortcuts.

Making function keys work

Create a file in /etc/modprobe.d which sets the fnmode parameter of the hid_apple driver to 2 = fkeysfirst:

echo 'options hid_apple fnmode=2' | sudo tee /etc/modprobe.d/apple_kbd.conf

Reboot, and the function keys will work without needing to hold down the Fn key. (You can access the volume controls and such by holding down the Fn key.) Thanks to Alan Doyle for reporting on this tweak.
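
After rebooting, one way to confirm the setting took effect is to read the parameter back from sysfs. This is a minimal sketch: the `read_fnmode` helper is my own name, and it assumes the usual `/sys/module/<module>/parameters/` layout, which only exists while the hid_apple module is loaded.

```python
from pathlib import Path

# Usual sysfs location for module parameters; present only while hid_apple is loaded.
FNMODE_PATH = Path("/sys/module/hid_apple/parameters/fnmode")


def read_fnmode(path=FNMODE_PATH):
    """Return the current fnmode as an int, or None if the parameter file is missing."""
    try:
        return int(path.read_text().strip())
    except FileNotFoundError:
        return None
```

Calling `read_fnmode()` should return 2 once the modprobe option has been picked up, or None on a machine where the driver is not loaded.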

Remapping the keys

I used the xkbcomp utility to remap the keys. I extracted the current keyboard mappings into a default.xkb file, made a copy of the mapping file as mackeyboard.xkb, made the changes to this file, then loaded the new mapping into the running X server:

xkbcomp :0 default.xkb
cp default.xkb mackeyboard.xkb
vi mackeyboard.xkb
xkbcomp mackeyboard.xkb :0

I'm attaching my mackeyboard.xkb file and the diff for reference. (Use these at your own peril.) I made the following changes:

  1. Changed the LWIN and RWIN keycode identifiers to LCMD and RCMD, for clarity.
  2. Commented out the LMTA and RMTA keycode aliases, to avoid confusion.
  3. Changed the CAPS keysym mapping from Caps_Lock to Super_L.
  4. Changed the LWIN and RWIN (now LCMD and RCMD) keysym mappings from Super_L and Super_R to Hyper_L and Hyper_R.
  5. Changed the modifier mapping so that only the CAPS keycode is used for Mod4. Since Mod3 wasn't previously in use, I mapped Hyper_L and Hyper_R to this modifier.

Configuring new shortcuts

In System Settings -> Keyboard -> Shortcuts, configure these shortcuts:

Section | Shortcut name | Key
Navigation | Switch applications | Cmd+Tab
Navigation | Switch windows of an application | Cmd+`
Windows | Toggle fullscreen mode | Cmd+Return
Windows | Close Window | Cmd+Q

In Terminal's Edit -> Keyboard Shortcuts, configure these shortcuts:

Section | Shortcut name | Key
File | New Window | Cmd+N
File | Close Window | Cmd+W
View | Zoom In | Cmd+=
View | Zoom Out | Cmd+-
View | Normal Size | Cmd+0

I think the biggest benefit of the new Terminal shortcuts is the use of sensible copy and paste shortcuts that don't interfere with using Ctrl-C and Ctrl-V in the shell.

Future hacks

The following improvements are left as an exercise for the reader:

  • Have xkbcomp load the new mapping every time you log in, so you don't have to run it manually.
  • Make other applications (such as Google Chrome) recognize Mac shortcuts such as Cmd-C and Cmd-V.
  • Figure out a generic way for specifying key translations for specific apps that happen to be in the foreground, similar to the functionality that AutoHotkey provides for Windows. (compiz plugin? resurrect the deprecated XEvIE X11 extension?)

Update, November 7, 2011: AutoKey

In the comments, Nivth Ket brought to my attention the AutoKey tool for mapping arbitrary keys to other keys, phrases, or even Python scripts. This tool seems to use the XRecord extension to X11 to listen to incoming keys. I gave AutoKey 0.80.3 a test drive, and found a few limitations that clashed with my needs. However, with a few hacks, I think I've overcome these limitations and found a solution that seems to work for me so far. The limitations and workarounds are as follows:

  • The AutoKey GUI does not allow assigning the same hotkey to multiple actions. This prevents me from assigning a key combination to do one thing in a particular application (i.e. the window title matches "Google Chrome"), and something else in every other application. The workaround is to edit the configuration files in ~/.config/autokey/data directly.
  • AutoKey does not have a notion of order semantics for the entries — the entries are processed in a seemingly random order. Therefore, if my entry for "Cmd-V with no window filter" happens to come before my entry for "Cmd-V only for Terminal windows", the former will eclipse the latter, and the Terminal-only rule will never happen. My workaround was to hack AutoKey to always process entries with filters first, then process entries with no filters. Here is the patch.
  • AutoKey does not support the little-known "Hyper" modifier key, which I use in my layout for the "command" keys. My workaround was to hack AutoKey to support the Hyper modifier. Here is the patch.
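
The ordering workaround in the second bullet boils down to a stable sort that puts filtered entries ahead of unfiltered ones. The sketch below illustrates the idea only; `HotkeyEntry`, its fields, and `pick_action` are hypothetical stand-ins, not AutoKey's real classes or the contents of my patch.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class HotkeyEntry:
    # Hypothetical stand-in for an AutoKey entry; not AutoKey's real class.
    hotkey: str
    action: str
    window_filter: Optional[str] = None  # regex matched against the window title


def pick_action(entries, hotkey, window_title):
    """Return the action for a hotkey, trying filtered entries before unfiltered ones."""
    # sorted() is stable, so entries keep their relative order within each group.
    ordered = sorted(entries, key=lambda e: e.window_filter is None)
    for entry in ordered:
        if entry.hotkey != hotkey:
            continue
        if entry.window_filter is None or re.search(entry.window_filter, window_title):
            return entry.action
    return None
```

With this ordering, a catch-all Cmd-V entry can no longer eclipse a Terminal-only Cmd-V entry, regardless of the order in which the two were loaded.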


  • mackeyboard.xkb - The xkb file for my keyboard, suitable for loading into a running X server with xkbcomp.
  • mackeyboard.diff - The changes I made to the original keyboard mappings.

Apple Remote Desktop quirks September 19, 2011

While developing Valence, an input-only Android VNC client for remote controlling a computer, I've encountered several notable quirks in Apple Remote Desktop, Mac OS's built-in VNC server. Apple Remote Desktop (ARD) is based on VNC, a system developed in the late 1990s for controlling a remote computer, and its Remote Framebuffer (RFB) protocol. Generally, standard VNC clients can interoperate with ARD. An ARD server reports use of version "3.889" of the RFB protocol, which isn't a real version of RFB, but this version number can be used by clients to know that they are talking to an ARD server and not a conventional VNC server.

ARD authentication

During the RFB handshaking, a VNC server will announce which authentication methods it supports, and the client then picks from among these. Most VNC servers support a scheme known as "VNC authentication," which is a simple DES-based password challenge/response system. ARD offers VNC authentication, but it also offers a proprietary scheme which allows the user to also supply a username. This scheme is known as "Mac authentication" or "ARD authentication." Mac OS X 10.7 Lion includes an important change to the way ARD works: If you connect with VNC authentication, you are presented with a login screen which prompts you for a username and password before allowing you to control the desktop. If you connect with ARD authentication, this login screen is bypassed and you can immediately control the desktop.

Because Valence is an input-only VNC client, the login screen is a show-stopper: since the login screen cannot be seen, the user cannot log in and control the desktop. The only way for Valence to support Lion's built-in VNC server was to add the ARD authentication scheme as an option. Apple has a support article which gives a high-level overview of the ARD authentication process, but it unfortunately does not contain the technical detail needed to implement the scheme. I was able to discover the technical details by studying the gtk-vnc open-source library, which implements ARD authentication thanks to a patch provided by Håkon Enger last year. (I'm not sure how Mr. Enger figured out the technique, but I'm grateful.)

The basic steps for performing ARD authentication are as follows:

  1. Read the authentication material from the socket: a two-byte generator value, a two-byte key length value, the prime modulus (keyLength bytes), and the peer's generated public key (keyLength bytes).
  2. Generate your own Diffie-Hellman public-private key pair.
  3. Perform Diffie-Hellman key agreement, using the generator (g), prime (p), and the peer's public key. The output will be a shared secret known to both you and the peer.
  4. Perform an MD5 hash of the shared secret. This 128-bit (16-byte) value will be used as the AES key.
  5. Pack the username and password into a 128-byte plaintext "credentials" structure: { username[64], password[64] }. Null-terminate each. Fill the unused bytes with random characters so that the encryption output is less predictable.
  6. Encrypt the plaintext credentials with the 128-bit MD5 hash from step 4, using the AES 128-bit symmetric cipher in electronic codebook (ECB) mode. Use no further padding for this block cipher.
  7. Write the ciphertext from step 6 to the stream. Write your generated DH public key to the stream.
  8. Check for authentication pass/fail as usual.
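
The steps above can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not a port of my Java code: I'm assuming big-endian (network order) integers, and that the shared secret and public key are zero-padded to keyLength bytes before hashing and sending. Python's standard library has no AES, so step 6 is delegated to a caller-supplied `aes128_ecb_encrypt(key, data)` function; that name and all the helper names here are hypothetical.

```python
import hashlib
import os
import secrets
import struct


def parse_dh_params(blob):
    """Step 1: split the server's material into (generator, prime, peer key, key length)."""
    generator, key_len = struct.unpack(">HH", blob[:4])
    prime = int.from_bytes(blob[4:4 + key_len], "big")
    peer_pub = int.from_bytes(blob[4 + key_len:4 + 2 * key_len], "big")
    return generator, prime, peer_pub, key_len


def pack_credentials(username, password):
    """Step 5: build the 128-byte { username[64], password[64] } structure."""
    def field(text):
        raw = text.encode("utf-8")[:63] + b"\x00"   # null-terminate, at most 63 bytes of text
        return raw + os.urandom(64 - len(raw))      # fill the unused bytes with random data
    return field(username) + field(password)


def ard_response(blob, username, password, aes128_ecb_encrypt):
    """Steps 2-7: return the bytes to write back (ciphertext, then our DH public key)."""
    g, p, peer_pub, key_len = parse_dh_params(blob)
    private = secrets.randbelow(p - 3) + 2           # step 2: private key in 2..p-2
    public = pow(g, private, p)
    shared = pow(peer_pub, private, p)               # step 3: key agreement
    # Step 4: MD5 of the shared secret (zero-padded to key_len: an assumption).
    aes_key = hashlib.md5(shared.to_bytes(key_len, "big")).digest()
    # Step 6: AES-128 in ECB mode, no padding, supplied by the caller.
    ciphertext = aes128_ecb_encrypt(aes_key, pack_credentials(username, password))
    return ciphertext + public.to_bytes(key_len, "big")   # step 7
```

Any AES library can supply the encryption callable, as long as it runs AES-128 in ECB mode without adding padding; the 128-byte credentials block is already a whole number of 16-byte cipher blocks.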

For further reference, my Java implementation of ARD authentication is available on GitHub.

Right-click problems in the Snow Leopard ARD v3.5 update

In July 2011, Apple pushed an update to Snow Leopard users which included a newer version of Apple Remote Desktop, the built-in VNC server. This new version, v3.5, interprets mouse buttons differently—the RFB "button-2" event is now used to indicate a right-click instead of the "button-3" event which is used on standard VNC implementations. This broke Valence's support for sending a right-click when the user performs a two-finger tap. To restore right-click support, I had to add a "send mouse button-2 instead of button-3" option to Valence's server configuration.
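
For illustration, here is roughly what the two variants look like on the wire. An RFB PointerEvent is a message-type byte (5), a button mask, and 16-bit x and y coordinates; bit n-1 of the mask stands for button n. The function name and the `use_button_2` flag are hypothetical, chosen to mirror the option described above rather than taken from Valence's code.

```python
import struct

RFB_POINTER_EVENT = 5  # message-type for PointerEvent in the RFB protocol


def right_click_messages(x, y, use_button_2=False):
    """Build the press and release PointerEvent messages for a right-click at (x, y)."""
    # Bit n-1 of the button mask represents button n: button-2 -> 0x02, button-3 -> 0x04.
    mask = 0x02 if use_button_2 else 0x04
    press = struct.pack(">BBHH", RFB_POINTER_EVENT, mask, x, y)
    release = struct.pack(">BBHH", RFB_POINTER_EVENT, 0, x, y)
    return press, release
```

A client would send `press` followed by `release`, setting `use_button_2=True` only for servers that show the ARD v3.5 behavior.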

It is unclear to me why ARD v3.5 does this in Snow Leopard, why this problem doesn't exist in Lion, and why this problem doesn't appear when using the ARD client.

Foreign keyboard key mapping

Valence can be used to send international keys to VNC servers running on Linux or Windows with no problem. However, I was surprised to discover that these key events were often not correctly consumed by Macs. Upon investigation, I learned that the problem was worse—using a foreign keyboard layout can cause the wrong keys to be consumed by the Mac. The root cause appears to be the lack of support in Mac OS for allowing applications (such as a VNC server) to inject synthetic keysyms (symbolic representations of keys). Only injection of physical keycodes (numeric representations of actual keys on a keyboard) is supported. (Note that I'm borrowing "keysym" and "keycode" from X11 terminology, but most systems have equivalent concepts.)

Operating systems generally provide layers of abstraction around keyboard input, and some background is required to understand the problem. Keyboard hardware sends a "scancode" to the computer for each key pressed, using a scheme that dates back to the electrical configuration of rows and columns in early keyboards. These scancodes are then translated into more useful values one or more times before being provided to an application. On a Linux/X11 system, for instance, the scancodes are converted to "keycodes" which are simple representations of each physical key on the keyboard. The scancode to keycode mapping allows computers to use keyboards with different scancode conventions. For example, the 7th key from the left on the top letter row may be represented by different scancode bytes on different keyboards, but the operating system will always translate that key to the same keycode. The keycode, which still represents a physical key, is then mapped into a virtual value known as a keysym, based on the configured keyboard layout. For example, that 7th key on the top letter row would result in a keysym for "Y" when using a U.S. keyboard layout, but it would be translated into a "Z" when using a German keyboard layout. These abstraction layers allow applications to be written without needing to consider the keyboard electronics or the arrangement of keys on the keyboard. When the application receives a "Y" keysym, it knows that the user intended to enter "Y" regardless of which physical key they pressed to make it happen.

VNC clients send symbolic key representations, to avoid any of the headache with keyboard layouts. (The symbolic key representations used in the RFB protocol are defined to be identical to the X11 keysyms, but are trivially translatable into symbolic key values for other systems.) When a user presses a "Y", the VNC client sends 0x0079, which is the keysym for "Y", regardless of which physical key was pressed. The VNC server then injects a synthetic key event with the appropriate symbolic key for "Y" into the input system of the server. A user using a VNC client with a U.S. keyboard and a user using a VNC client with a German keyboard would both be able to type on a server with any defined keyboard layout without noticing any problems, because the VNC server injects the symbolic key and bypasses the whole messy business of keyboard layouts.
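
As a concrete illustration, an RFB KeyEvent message is eight bytes: a message-type byte (4), a down-flag, two padding bytes, and the 32-bit keysym. The helper names below are my own for this sketch. Note that 0x0079 is, strictly speaking, the X11 keysym for the lowercase letter "y".

```python
import struct

RFB_KEY_EVENT = 4  # message-type for KeyEvent in the RFB protocol


def key_event(keysym, down):
    """Build one KeyEvent: type, down-flag, two padding bytes, 32-bit keysym."""
    return struct.pack(">BBxxI", RFB_KEY_EVENT, 1 if down else 0, keysym)


def key_tap(keysym):
    """A key press followed by a release, as a client sends for one keystroke."""
    return key_event(keysym, True) + key_event(keysym, False)
```

Nothing in the message identifies a physical key or a keyboard layout, which is what lets the server treat it as a purely symbolic request.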

That's the theory, anyway. Unfortunately, since Mac OS doesn't seem to support symbolic injection, only physical keycode injection, the whole system falls apart. Here's what happens when an American user presses the 7th key of the top letter row ("Y" on the U.S. keyboard) while using a VNC client to control a server configured with a German keyboard:

  • Linux and Windows. The VNC client sends 0x0079, the VNC symbolic key representation for the letter "Y". The VNC server receives this event, and says, "I see you want to press the letter Y on this computer. I'll send the symbolic key for "Y" directly to the running application. We won't worry about what kind of keyboard is physically attached to the server... it doesn't matter!" The application receives the keysym for "Y", and all is well.
  • Mac OS. The VNC client sends 0x0079, the VNC symbolic key representation for the letter "Y". The VNC server receives this event, and says, "Gosh, the operating system's event API only lets me send physical key events, so I'll have to translate this into a physical keycode. I wonder how I can do that... Oh, I have a built-in list of how keysyms translate to keycodes for American keyboards, I'll use that! It looks like I should send the physical keycode representing the 7th key of the top letter row. Done!" After the VNC server injects the keycode, the operating system then performs its task of translating the physical keycode back into a keysym to be delivered to the application. Unfortunately, since the operating system is configured to be using a German keyboard layout, the keycode is translated into the keysym for "Z" instead of "Y". The user is shocked to see a "z" appear on the screen after he typed a "y"!
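
The two behaviors can be reduced to a toy model. Everything in it is hypothetical: the keycode number is arbitrary, and the tables contain only the one key discussed above, but the round trip through a U.S. keycode table shows where the "z" comes from.

```python
# Hypothetical keycode: any number works, as long as both tables agree on it.
KEY_7TH_TOP_ROW = 16

# What a VNC server with a hard-coded U.S. table does with an incoming keysym.
US_KEYSYM_TO_KEYCODE = {"y": KEY_7TH_TOP_ROW}

# How the operating system turns keycodes back into characters, per layout.
LAYOUTS = {
    "us": {KEY_7TH_TOP_ROW: "y"},
    "de": {KEY_7TH_TOP_ROW: "z"},
}


def inject_symbolic(keysym, server_layout):
    """Linux/Windows style: the symbol is delivered as-is; the layout is irrelevant."""
    return keysym


def inject_physical(keysym, server_layout):
    """Mac OS style: keysym -> U.S. keycode -> whatever the server layout says."""
    keycode = US_KEYSYM_TO_KEYCODE[keysym]
    return LAYOUTS[server_layout][keycode]
```

Here `inject_symbolic("y", "de")` yields "y", while `inject_physical("y", "de")` yields "z": the symbol survives only when the server's layout happens to match the table baked into the VNC server.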

A sufficiently smart VNC server perhaps could try to work around this problem by (somehow) being aware of which keyboard layout is in use on the server, and having the information needed to translate keysyms to keycodes for all possible layouts. This may not even be feasible, but even if it is, it doesn't solve the problem: the remote user would still be unable to type characters that are not physically present on the server's keyboard.

A brief survey of open-source software that deals with key injection on Mac OS shows that my Valence app isn't alone in suffering from this issue—they all do. In fact, even Apple Remote Desktop cannot handle this correctly! The usual advice for using an ARD client to control an ARD server with a different keyboard mapping is to change the keyboard mapping on either the client or the server.

I have no solution for the foreign key problem at this time.

Rapid DHCP Redux July 17, 2011

I was surprised at the amount of attention attracted by my recent post, "Rapid DHCP: Or, how do Macs get on the network so fast?". Between the 27 comments on my post and the 180 comments on Hacker News, a lot of interesting insights surfaced about the Mac's approach to DHCP. Information that would have taken me a week or two to research arrived within hours from people with experience in these matters. Here are some of the highlights:

  • The scheme Apple uses to achieve rapid network initialization is documented in RFC 4436: Detecting Network Attachment in IPv4 (DNAv4), which was authored by internet engineers from Apple, Sun, and Microsoft.
  • The scheme also seems to be documented in Apple's patent application (pub. no.: US 2009/0006635 A1). A patent has not yet been granted at this time. No Intellectual Property Rights (IPR) disclosures have been filed with the IETF concerning RFC 4436.
  • There's a minute chance for an address collision if the DHCP server loses its lease information after a reset. Such a collision should sort itself out quickly, but may cause a minor disruption to one or both of the hosts competing for the address. Many commodity routers may contain embedded DHCP servers that lose their lease information when the router is powered off. There is some debate over whether it is appropriate for implementors to risk this situation for a great speed benefit, or if they should take the strictly conservative route of accommodating such broken network scenarios.
  • I can't say for certain, but it seems that this process occurs in user-space in Apple's bootp package. (Thanks to everyone for the pointers to Apple's open source code.)
  • The DHCP server on my test network was not set up to be authoritative, so it wasn't promptly sending NAKs in response to bogus requests. Fixing this problem considerably improved the Galaxy Tab's DHCP time, although it (like many other devices) is still pokey compared to Apple's initialization scheme. (Added 2011-07-21.)

Thanks to everyone who joined in on the fun!

Rapid DHCP: Or, how do Macs get on the network so fast? July 12, 2011

One of life's minor annoyances is having to wait on my devices to connect to the network after I wake them from sleep. All too often, I'll open the lid on my EeePC netbook, enter a web address, and get the dreaded "This webpage is not available" message because the machine is still working on connecting to my Wi-Fi network. On some occasions, I have to twiddle my thumbs for as long as 10-15 seconds before the network is ready to be used. The frustrating thing is that I know it doesn't have to be this way. I know this because I have a Mac. When I open the lid of my MacBook Pro, it connects to the network nearly instantaneously. In fact, no matter how fast I am, the network comes up before I can even try to load a web page. My curiosity got the better of me, and I set out to investigate how Macs are able to connect to the network so quickly, and how the network connect time in other operating systems could be improved.

I figure there are three main categories of time-consuming activities that occur during network initialization:

  1. Link establishment. This is the activity of establishing communication with the network's link layer. In the case of Wi-Fi, the radio must be powered on, the access point detected, and the optional encryption layer (e.g. WPA) established. After link establishment, the device is able to send and receive Ethernet frames on the network.
  2. Dynamic Host Configuration Protocol (DHCP). Through DHCP handshaking, the device negotiates an IP address for its use on the local IP network. A DHCP server is responsible for managing the IP addresses available for use on the network.
  3. Miscellaneous overhead. The operating system may perform any number of mundane tasks during the process of network initialization, including running scripts, looking up preconfigured network settings in a local database, launching programs, etc.

My investigation thus far is primarily concerned with the DHCP phase, although the other two categories would be interesting to study in the future. I set up a packet capture environment with a spare wireless access point, and observed the network activity of a number of devices as they initialized their network connection. For a worst-case scenario, let's look at the network activity captured while an Android tablet is connecting:

Samsung Galaxy Tab 10.1 - "dhcpcd-5.2.10:Linux-"
time (seconds) | direction | packet description
00.0000 | out | LLC RNR (The link is now established.)
01.1300 | out | DHCP request: The client requests its IP address on the previously connected network.
05.6022 | out | DHCP request: The client again requests this IP address.
11.0984 | out | DHCP discover: "Okay, I give up. Maybe this is a different network after all. Is there a DHCP server out there?"
11.7189 | in | DHCP offer: The server offers an IP address to the client.
11.7234 | out | DHCP request: The client accepts the offered IP address.
11.7514 | in | DHCP ACK: The server acknowledges the client's acceptance of the IP address.

This tablet, presumably in the interest of "optimization", is initially skipping the DHCP discovery phase and immediately requesting its previous IP address. The only problem is this is a different network, so the DHCP server ignores these requests. After about 4.5 seconds, the tablet stubbornly tries again to request its old IP address. After another 5.5 seconds, it resigns itself to starting from scratch, and performs the DHCP discovery needed to obtain an IP address on the new network. The process took a whopping 11.8 seconds to complete. (Note: This would have been faster if my DHCP server had been configured to send NAKs properly; see my update below. -simmons, 2011-07-21)

In all fairness, this delay wouldn't be so bad if the device was connecting to the same network as it was previously using. However, notice that the tablet waits a full 1.13 seconds after link establishment to even think about starting the DHCP process. Engineering snappiness usually means finding lots of small opportunities to save a few milliseconds here and there, and someone definitely dropped the ball here.

In contrast, let's look at the packet dump from the machine with the lightning-fast network initialization, and see if we can uncover the magic that is happening under the hood:

MacBook Pro - MacOS 10.6.8
time (seconds) | direction | packet description
00.0000 | out | LLC RNR (The link is now established.)
00.0100 | out | ARP request broadcast who-has (The client is validating its link-local address)
00.0110 | out | ARP request unicast 00:22:75:45:e3:54 who-has tell
00.0120 | out | ARP request unicast 4e:80:98:f0:35:e3 who-has tell
00.0120 | in | ARP reply unicast from DHCP server: is-at 4e:80:98:f0:35:e3
00.0130 | out | ARP request unicast 00:0d:b9:54:27:b3 who-has tell
00.0140 | out | DHCP request
00.0180 | out | ARP broadcast who-has tell
00.0210 | out | ARP broadcast who-has tell
00.0290 | out | ARP broadcast who-has tell
00.0290 | in | ARP reply unicast: is-at 4e:80:98:f0:35:e3
00.0310 | out | UDP to router's port 192 (AirPort detection) This implies that the IP interface is now configured.
... | ... | (More normal IP activity on the newly configured interface)
01.2680 | out | DHCP request
01.3043 | in | DHCP ACK

The key to understanding the magic is the first three unicast ARP requests. It looks like Mac OS remembers certain information about not only the last connected network, but the last several networks. In particular, it must at least persist the following tuple for each of these networks:

  1. The Ethernet address of the DHCP server
  2. The IP address of the DHCP server
  3. Its own IP address, as assigned by the DHCP server

During network initialization, the Mac transmits carefully crafted unicast ARP requests with this stored information. For each network in its memory, it attempts to send a request to the specific Ethernet address of the DHCP server for that network, in which it asks about the server's IP address, and requests that the server reply to the IP address which the Mac was formerly using on that network. Unless network hosts have been radically shuffled around, at most only one of these ARP requests will result in a response—the request corresponding to the current network, if the current network happens to be one of the remembered networks.
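
To make the probe concrete, here is a sketch of how such a frame could be assembled from one remembered tuple. It is my reconstruction from the packet capture, not Apple's code: the function name is hypothetical, the IP addresses in the usage note are placeholders from the documentation range, and zeroing the ARP target hardware address field is an assumption on my part.

```python
import socket
import struct


def dna_arp_probe(client_mac, client_ip, server_mac, server_ip):
    """Build a unicast ARP request asking a remembered DHCP server 'who-has server_ip?'."""
    def mac(text):
        return bytes.fromhex(text.replace(":", ""))

    # Ethernet header: sent straight to the server's MAC rather than broadcast.
    ethernet = mac(server_mac) + mac(client_mac) + struct.pack(">H", 0x0806)  # ARP ethertype
    arp = struct.pack(
        ">HHBBH6s4s6s4s",
        1,        # hardware type: Ethernet
        0x0800,   # protocol type: IPv4
        6, 4,     # hardware / protocol address lengths
        1,        # opcode: request
        mac(client_mac), socket.inet_aton(client_ip),   # sender: us, with our former address
        bytes(6), socket.inet_aton(server_ip),          # target MAC zeroed (an assumption)
    )
    return ethernet + arp
```

For example, `dna_arp_probe("00:11:22:33:44:55", "192.0.2.50", "4e:80:98:f0:35:e3", "192.0.2.1")` yields a 42-byte frame. Actually transmitting it would require a raw or packet socket and root privileges, which this sketch leaves out.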

This network recognition technique allows the Mac to very rapidly discover if it is connected to a known network. If the network is recognized (and presumably if the Mac knows that the DHCP lease is still active), it immediately and presumptuously configures its IP interface with the address it knows is good for this network. (Well, it does perform a self-ARP for good measure, but doesn't seem to wait more than 13ms for a response.) The DHCP handshaking process begins in the background by sending a DHCP request for its assumed IP address, but the network interface is available for use during the handshaking process. If the network was not recognized, I assume the Mac would know to begin the DHCP discovery phase, instead of sending blind requests for a former IP address as the Galaxy Tab does.

The Mac's rapid network initialization can be credited to more than just the network recognition scheme. Judging by the use of ARP (which can be problematic to deal with in user-space) and the unusually regular transmission intervals (a reliable 1.0ms delay between each packet sent), I'm guessing that the Mac's DHCP client system is entirely implemented as tight kernel-mode code. The Mac began the IP interface initialization process a mere 10ms after link establishment, which is far faster than any other device I tested. Android devices such as the Galaxy Tab rely on the user-mode dhcpcd program, which no doubt brings a lot of additional overhead such as loading the program, context switching, and perhaps even running scripts.

The next step for some daring kernel hacker is to implement a similarly aggressive DHCP client system in the Linux kernel, so that I can enjoy fast sign-on speeds on my Android tablet, Android phone, and Ubuntu netbook. There already exists a minimal DHCP client implementation in the Linux kernel, but it lacks certain features such as configuring the DNS nameservers. Perhaps it wouldn't be too much work to extend this code to support network recognition and interface with a user-mode daemon to handle such auxiliary configuration information received via DHCP. If I ever get a few spare cycles, maybe I'll even take a stab at it.

Update, July 12th, 2011 1pm MT:

This post has been mentioned on Hacker News, and there's lots of lively discussion in the comments over there.

Some people have pointed out some disadvantages in putting a full-featured DHCP client in the kernel. I'm skeptical about putting the DHCP client in the kernel, myself. However, I didn't want to elaborate on that at 2:00am, since the post was getting way too lengthy as it was. If I had known it would be subject to such peer review, I might have been a bit more careful with my words. :)

The argument for putting the DHCP client in the kernel basically boils down to:

  1. Achieving speed is all about shaving a few milliseconds here and there, and you just can't launch a program, wait for it to dynamically link, load config files, etc., and get the 10ms response time that the Mac has. (10ms from link establishment to transmitting the first DHCP packet.) I'm told that the dhcpcd program is a persistent daemon, so maybe the launch overhead isn't there. But something is keeping Linux hosts from having a 10ms response time.
  2. Doing ARP tricks could be awkward in user-space. You'd need to use the raw socket interface for transmitting (which isn't a big deal), and you'd have to use something like the packet(7) interface to sniff incoming packets to observe the ARP replies. I haven't played around with the packet(7) interface, so I'm not sure what the pros and cons might be.

Neither of these is a show-stopper to an improved user-mode DHCP client, but that was my thinking at the time. Now, I think I would certainly start with a user-mode solution, since a carefully crafted daemon should be able to achieve comparable response time, and the arping(8) program doesn't seem to have any problem using packet(7) to send and receive ARP packets in user-space.

Update, July 13th, 2011 2:48am MT:

Thanks to M. MacFaden for pointing out in the comments that this scheme is basically an implementation of RFC 4436: Detecting Network Attachment in IPv4 (DNAv4), which was co-authored by an Apple employee.

Update, July 21st, 2011 1:20pm MT:

Thanks to Steinar H. Gunderson for pointing out in the comments that the DHCP server on my test network was incorrectly configured. Since I was using a mostly "out of the box" dhcpd configuration from Ubuntu Linux, it wasn't set up to be authoritative by default, so it wasn't promptly sending NAKs in response to the Galaxy Tab's requests for an old IP address. After fixing the problem on the DHCP server, the Galaxy Tab's DHCP handshake happens quite a bit faster (although still 85 times slower than the Mac). Below is the revised chart of network activity for the Galaxy Tab:

Samsung Galaxy Tab 10.1 (Revised) - "dhcpcd-5.2.10:Linux-"
time (seconds) | direction | packet description
00.0000 | out | LLC RNR (The link is now established.)
01.1570 | out | DHCP request: The client requests its IP address on the previously connected network.
01.1574 | in | DHCP NAK: The server declines to allow on this network.
02.2261 | out | DHCP discover
02.5871 | in | DHCP offer: The server offers an IP address to the client.
02.5951 | out | DHCP request: The client accepts the offered IP address.
02.6198 | in | DHCP ACK: The server acknowledges the client's acceptance of the IP address.

These times are more in line with what I see on most non-Mac devices on my non-test networks—about 2.5-3s in DHCP, plus a bit more time for link initialization and such—long enough that I frequently get a "no connection" error in my web browsers. We'll need to find ways to shave this down in emerging consumer electronics devices. Consumers are conditioned to think of PCs as "something you wait on," but expect non-PC network devices to behave more like light switches.

I've posted a summary of the discussion in another entry.

Finagling with Nagle: Nagle's algorithm and latency-sensitive applications July 8, 2011

NOTE: This post discusses a specific problem which is solved by disabling Nagle's algorithm. Do not try this at home: Nagle's algorithm is implemented in TCP/IP stacks for a reason, and I'm told that 99% of programmers who disable Nagle's algorithm do so errantly due to a lack of understanding of TCP. Disabling Nagle's algorithm is not a silver bullet that magically reduces latency.

While developing an input-only VNC client for Android to act as a remote mouse and keyboard, I noticed that mouse pointer movements were particularly jerky when connecting to a computer running TightVNC. Connections to other VNC servers yielded relatively smooth pointer movements, and using the official TightVNC client to connect to the TightVNC server was also smooth. My VNC client was not alone with this problem—connecting to the TightVNC server using other VNC clients such as Chicken of the VNC also resulted in severe pointer jerkiness. What was special about the VNC connection between the TightVNC client and the TightVNC server that made it work so much more smoothly?

To investigate, I used Wireshark to analyze the protocol communication between the client and server to see what was different. To my surprise, the RFB (VNC protocol) messages sent by the TightVNC client were basically the same as the messages sent by my client. However, there was one critical difference that existed deeper in the stack. In the TCP stream from the TightVNC client, each IP packet only contained a single RFB message. In the TCP stream from my client, several RFB messages would sometimes be batched together in a single IP packet. I immediately understood the problem.

The TCP/IP implementations of our operating systems contain a piece of code known as Nagle's Algorithm which improves performance by batching small writes into a single packet if it has not yet received an ACK for the previously sent packet. This algorithm works wonders for most applications, but very occasionally there is a need to disable Nagle's algorithm by setting the TCP_NODELAY socket option. An examination of the TightVNC client and server source code revealed that these programs do indeed disable Nagle's algorithm on each end of the TCP connection. I made a one-line change to my VNC client (socket.setTcpNoDelay(true);) and the problem went away—pointer movements became smooth as silk. The reason for the improvement is two-fold:

  1. The TightVNC server does not animate intermediate pointer events when they are received all at once. In my brief perusal, I couldn't find a specific location in the source code where this was happening, so it may be an efficiency hack in the operating system. (I don't know exactly how TightVNC sends its synthetic events to Windows, but maybe it needs to do some sort of flush after each event.) By sending the messages in individual IP packets, each pointer event is processed and appears on the screen as a smooth transit.
  2. My use of VNC as an input-only remote control system is very latency-sensitive. The user's eyeballs are watching the remote host's screen while their thumb is moving across their phone, generating pointer events. There is an inevitable delay between the time the user's thumb moves and the time their eyeballs see the pointer move. The shorter the delay, the more natural their phone feels as an input device. The longer the delay, the more painful the experience. In this specific scenario, I judged that snappiness was more important than overall bandwidth, so I disabled Nagle's algorithm. In addition to fixing the jerkiness problem with TightVNC, to my delight this also improved smoothness to some degree with all other VNC server implementations!
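
The one-line change described above can be seen in isolation with a small, self-contained sketch. This is not Valence's code: the class name `NoDelayDemo`, the helper `connectWithoutNagle`, and the loopback listener standing in for a VNC server are all assumptions made for illustration. Only `Socket.setTcpNoDelay(true)` comes from the post.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class NoDelayDemo {
    /** Connects to host:port and disables Nagle's algorithm on the new socket. */
    static Socket connectWithoutNagle(InetAddress host, int port) throws IOException {
        Socket socket = new Socket(host, port);
        socket.setTcpNoDelay(true); // sets the TCP_NODELAY socket option
        return socket;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical stand-in for a VNC server: a listener on an ephemeral loopback port.
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket client = connectWithoutNagle(loopback, server.getLocalPort())) {
            System.out.println("TCP_NODELAY enabled: " + client.getTcpNoDelay());
        }
    }
}
```

With the option set, each small write is handed to the network as soon as possible instead of waiting for an outstanding ACK, which is why the application becomes responsible for not flooding the link with tiny packets.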

This was one of the rare cases where it was appropriate to disable Nagle's algorithm. In fact, this is the only time in my 25+ years of programming that I have ever found a need to disable Nagle's algorithm. My application is only likely to be used on a local network, otherwise the disabling could cause more harm than good. If you set TCP_NODELAY on your sockets, be warned that the responsibility for being efficient with packets now falls into your hands; the operating system is not going to help out. In my case, I found several places in my code where I needed to combine multi-message transmissions into a single send() in order to fix performance lost due to disabling Nagle's algorithm.
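
As a rough illustration of combining messages into a single write, the sketch below encodes two RFB PointerEvent messages (a button press and its release) into one buffer and writes them with one call. The PointerEvent layout (message type 5, a button-mask byte, then 16-bit big-endian x and y) comes from the RFB protocol; the class `RfbBatchDemo`, its method names, and the choice of a click as the example are assumptions, since the post does not say which messages were combined.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;

class RfbBatchDemo {
    static final int POINTER_EVENT = 5;        // RFB message type for PointerEvent
    static final int POINTER_EVENT_LENGTH = 6; // type + button mask + x + y

    /** Appends one PointerEvent to the buffer (ByteBuffer is big-endian by default). */
    static void putPointerEvent(ByteBuffer buf, int buttonMask, int x, int y) {
        buf.put((byte) POINTER_EVENT);
        buf.put((byte) buttonMask);
        buf.putShort((short) x);
        buf.putShort((short) y);
    }

    /** Builds a left click (press, then release) at (x, y) as one contiguous byte array. */
    static byte[] encodeClick(int x, int y) {
        ByteBuffer buf = ByteBuffer.allocate(2 * POINTER_EVENT_LENGTH);
        putPointerEvent(buf, 0x01, x, y); // left button down
        putPointerEvent(buf, 0x00, x, y); // all buttons up
        return buf.array();
    }

    /** Sends both messages with a single write instead of one write per message. */
    static void sendClick(OutputStream out, int x, int y) throws IOException {
        out.write(encodeClick(x, y));
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for socket.getOutputStream(), so the sketch runs without a server.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        sendClick(sink, 100, 200);
        System.out.println(sink.size() + " bytes written in one call");
    }
}
```

With TCP_NODELAY set, two separate writes would typically leave as two packets; building the buffer first keeps messages that belong together in a single write.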

Android Network Information June 5, 2011

While developing Android applications, I'm often juggling lots of Android machines, both real and virtual. Since I often need to connect to these machines over the network with adb connect, I found it useful and educational to write a small home screen widget that always shows the device's IP address. This is a pretty dumb application, but I decided that it would be a good opportunity to learn how to publish apps on the Android Market.


The main screen lists all network interfaces and enumerates their IPv4/IPv6 addresses.
This home screen widget always shows your current IP address.
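
For readers curious how such a listing can be produced, here is a minimal sketch using the standard `java.net.NetworkInterface` API. It is not the app's actual code; the class name `InterfaceLister` and the output format are assumptions chosen for illustration.

```java
import java.net.Inet6Address;
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

class InterfaceLister {
    /** Returns one line per address in the form "<interface> <IPv4|IPv6> <address>". */
    static List<String> describeAddresses() throws SocketException {
        List<String> lines = new ArrayList<>();
        Enumeration<NetworkInterface> interfaces = NetworkInterface.getNetworkInterfaces();
        if (interfaces == null) {
            return lines; // some platforms return null when no interfaces are found
        }
        for (NetworkInterface nif : Collections.list(interfaces)) {
            for (InetAddress addr : Collections.list(nif.getInetAddresses())) {
                String family = (addr instanceof Inet6Address) ? "IPv6" : "IPv4";
                lines.add(nif.getName() + " " + family + " " + addr.getHostAddress());
            }
        }
        return lines;
    }

    public static void main(String[] args) throws SocketException {
        for (String line : describeAddresses()) {
            System.out.println(line);
        }
    }
}
```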

Introducing Valence May 19, 2011

An on-screen trackpad and keyboard allow a computer to be remote controlled.
Valence supports mDNS service discovery (aka Bonjour or Avahi) to locate participating VNC servers on the local network.

In my spare cycles recently, I've been tinkering with developing Android code to control home theater components. As someone who has been passionately involved with the consumer electronics industry over the past ten years, my personal home theater system includes quite a collection of disparate devices that defy even the most feature-rich universal remote controls. What the world needs is an open, extensible Android remote control application that can be rapidly updated to support new devices as they are released.

The first module of code I've developed is a remote control for home theater computers, such as the one I use to watch YouTube videos and other web video content. I've built a standalone app with this code for testing purposes. There are plenty of other such apps on the Android Market, but my program, Valence, seeks to provide a simpler experience in the following ways:

  • Most computer remote control apps require that the controlled computer be running special software, unique to the app, that relays mouse and keyboard events to the operating system. Not only does this add yet another single-purpose ever-running program to your computer, but the author may not have support for your operating system yet. To solve this problem, Valence uses the industry standard VNC system and its RFB (remote framebuffer) protocol. Many operating systems come with VNC built-in, and many people may already have VNC installed and enabled. Instead of using VNC to see the screen of a remote computer, Valence uses VNC in a strictly one-way fashion—input events are transmitted from the Android handset to the VNC server, but video frames are never sent from the server to the handset.
  • Valence supports mDNS service discovery to automatically find VNC servers on your local network, without the need to mess around with IP addresses. Not all VNC servers support discovery, but it sure saves some hassle if yours does. (A few Android phones do have trouble performing such discovery, but most seem to work.)

Today I'm releasing an unpolished "rough draft" of Valence so my friends and colleagues can test its compatibility with their networks. Some known bugs remain in this version, but I'm erring on the side of getting feedback early. Eventually, I'll roll the Valence functionality into my larger remote control framework that provides many more features such as controlling IR devices.

To use Valence, make sure you have VNC server software installed on your computer, and configured to allow connections:

  • Mac OS. Macs have built-in VNC software, so no additional software needs to be installed. In "System Preferences," go to the "Sharing" tab and make sure "Remote Management" is selected. Then click "Computer Settings..." and make sure "VNC viewers may control screen with password" is selected, and provide a password.
  • Ubuntu Linux. Ubuntu Linux has built-in VNC software, so no additional software needs to be installed. In the System menu, select Preferences then Remote Desktop. Check "Allow other users to view your desktop" and "Allow other users to control your desktop," uncheck "You must confirm each access to this machine," and check "Require the user to enter this password." Assign a password.
  • Windows. Windows does not have a built-in VNC server, so you'll need to install one such as TightVNC. Unfortunately, TightVNC seems to be a little jerky with processing mouse movement, so please let me know if you find a better VNC server for Windows that works with recent releases such as Windows 7. On TightVNC's download page, click the "download" link next to "Self-installing package for Windows." Open the downloaded package, click "Run," and proceed through the setup wizard. (Just keep hitting "Next" and "I agree".) Enter a password when prompted in the "Service Configuration" dialog, click "Install," and complete the installation.

Update: June 7, 2011

I've uploaded a new beta package, which is linked below. Changes include the following:

  • Bug fix: Valence did not properly store passwords that were less than eight characters. This has been fixed.
  • Mouse movement should be much smoother now. In particular, usability with the TightVNC server has been vastly improved. (I'm now setting TCP_NODELAY on the socket to disable Nagle's algorithm.)
  • Reorientation (from landscape to portrait and vice-versa) no longer causes the connection to be killed and re-established.
  • An "about" page and help documentation are now available by pressing the menu button.
  • Bug fix: If the user pressed the back button while Valence was trying to connect, the app would crash. This has been fixed.
  • I tweaked the launcher icon slightly.

Update: June 30, 2011

Yet another beta release. Changes include the following:

  • Keyboard
    • Added a new button for special keys like "esc", "tab", F1-F10, etc.
    • Better support for the HTC soft keyboard.
    • Better support for international characters.
  • Documentation
    • Added a section on security considerations to the online help.
    • The "about" window now contains the build date of the package.
    • The "about" window is now scrollable, so it works in landscape.
  • VNC setup
    • In the setup form, alternate port numbers (other than 5900) were not being saved. Fixed.
    • The setup form would reset whenever the device changed orientation. Fixed.
    • Your configured VNC servers can now be edited. Long press on the server, and select "edit."
    • You can now change the displayed name of a VNC server, instead of being forced to see the name that was auto-detected from the server.
    • The setup form is now scrollable, in case it doesn't fit on your screen.
  • Trackpad
    • The trackpad code has been completely rewritten to pave the way for new features. This is still a complex piece of logic, and I wouldn't be surprised to see some regressions resulting from the rewrite.
    • The highly anticipated scrolling feature has been added. Swipe up and down with two fingers to scroll. Horizontal scrolling is not yet supported. Different operating systems (and perhaps different applications within an operating system) have different ideas about what a suitable unit of vertical scrolling is, so at present scrolling is a bit too fast on Windows and a bit too slow on Mac.
    • You can now simulate a right mouse button press by tapping with two fingers.
  • Miscellaneous bug fixes
    • The TCP socket did not always shut down when the Valence activity finished. Fixed.
    • Valence would crash when the screen was reoriented while an alert was being shown. Fixed.
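
The two-finger gestures in the trackpad notes above map naturally onto the RFB PointerEvent button mask: in the RFB protocol, button 3 is the right button, and each wheel step up or down is a press and release of button 4 or 5. The sketch below shows that encoding only. The class `PointerGestures`, its method names, and the idea of passing a signed step count are assumptions, and how Valence converts swipe distance into steps is not described in the post.

```java
import java.nio.ByteBuffer;

class PointerGestures {
    static final int POINTER_EVENT = 5;          // RFB message type for PointerEvent
    static final int BUTTON_RIGHT = 1 << 2;      // button 3
    static final int BUTTON_WHEEL_UP = 1 << 3;   // button 4
    static final int BUTTON_WHEEL_DOWN = 1 << 4; // button 5

    /** Appends one 6-byte PointerEvent (big-endian x and y) to the buffer. */
    static void putPointerEvent(ByteBuffer buf, int buttonMask, int x, int y) {
        buf.put((byte) POINTER_EVENT);
        buf.put((byte) buttonMask);
        buf.putShort((short) x);
        buf.putShort((short) y);
    }

    /** Appends a press of the given button followed by a release of all buttons. */
    static void putPressAndRelease(ByteBuffer buf, int buttonMask, int x, int y) {
        putPointerEvent(buf, buttonMask, x, y);
        putPointerEvent(buf, 0, x, y);
    }

    /** Encodes a right click at (x, y), e.g. for a two-finger tap. */
    static byte[] rightClick(int x, int y) {
        ByteBuffer buf = ByteBuffer.allocate(12);
        putPressAndRelease(buf, BUTTON_RIGHT, x, y);
        return buf.array();
    }

    /** Encodes wheel steps at (x, y): positive steps scroll up, negative scroll down. */
    static byte[] scroll(int steps, int x, int y) {
        int button = steps > 0 ? BUTTON_WHEEL_UP : BUTTON_WHEEL_DOWN;
        int count = Math.abs(steps);
        ByteBuffer buf = ByteBuffer.allocate(count * 12);
        for (int i = 0; i < count; i++) {
            putPressAndRelease(buf, button, x, y);
        }
        return buf.array();
    }

    public static void main(String[] args) {
        System.out.println("right click: " + rightClick(10, 20).length + " bytes");
        System.out.println("two steps down: " + scroll(-2, 10, 20).length + " bytes");
    }
}
```

Because the protocol only carries discrete wheel steps, the distance scrolled per step is decided by the receiving operating system, which fits the observation above that scrolling speed differs between Windows and Mac.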

Update: July 18th, 2011

I've made a beta release of Valence available on the Android Market. There have been no significant changes since the previous release, other than removing a bit of debug logging.

Update: August 27th, 2011

I posted a new release to the Market with an important bug fix, and a few minor changes:

  • Users reported a bug that was preventing some handsets from sending right-clicks by tapping with two fingers. This bug has been fixed.
  • I updated the touchpad text and the help file to reveal the ability to right-click and scroll by using two fingers.
  • I dropped "Beta" from the app's name.


File event notifications in Mac OS April 7, 2011

While using Mac OS, I've been missing the handy Linux inotifywait utility—it's a simple program to use the Linux inotify facility to wait on certain file events. I sometimes write scripts that use inotifywait to automatically launch programs when files are changed. For instance, I can have a script automatically compile a program whenever I save the source file in the editor.

It turns out that Mac OS and other recent BSD operating systems have a similar kernel facility called kqueue, and it was really easy to whip up a small program to block until an event occurs on a file. My filewait program is linked below, and can be used in scripts such as this one:

# wait for the file to change...
while filewait magnumopus.tex; do
    # pause briefly, then compile the file
    sleep 0.2
    pdflatex magnumopus.tex
done

  • filewait.c - My simple program to pause until a file event occurs.

Implementing DES April 6, 2011

DES, the Data Encryption Standard, was developed by IBM and the US government in the 1970s. Today, DES is considered to be weak and crackable, and a poor choice for anyone in the market for an encryption algorithm. However, many legacy protocols still use DES, so it's important to have implementations handy.

I recently found myself looking for a simple standalone DES implementation to study. Most of the open-source DES implementations are either highly optimized into obfuscation, or sloppily written. Either way, it's hard to find a clearly written and well-commented implementation suitable for educational purposes. I decided it would be a good exercise to write one myself. I wrote the implementation in Java, for the extra challenge of performing bit manipulations on signed primitive types. This implementation is undoubtedly very inefficient, but is well-commented and should be easy to understand for anyone who wants to dive into a sea of Feistel functions, S-boxes, variable rotations, and permutations.
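
To give a flavor of the signed-type challenge, the sketch below shows three small helpers of the kind a Java DES implementation tends to need. They are illustrative only and are not taken from the implementation linked below: the class `DesBits` and its method names are hypothetical, and the four-entry table in `main` is a toy, not one of the real DES tables.

```java
class DesBits {
    /**
     * Applies a DES-style permutation table. Entries are 1-based bit positions
     * counted from the most significant bit of the srcWidth-bit input, which is
     * how the DES standard numbers bits.
     */
    static long permute(int[] table, int srcWidth, long src) {
        long dst = 0;
        for (int position : table) {
            int shift = srcWidth - position;           // distance from the least significant bit
            dst = (dst << 1) | ((src >>> shift) & 1L); // >>> shifts in zeros, not the sign bit
        }
        return dst;
    }

    /** Rotates a 28-bit key-schedule half left by n bits (n is 1 or 2 in DES). */
    static int rotateLeft28(int half, int n) {
        return ((half << n) | (half >>> (28 - n))) & 0x0FFFFFFF;
    }

    /** Packs 8 bytes into a long; the 0xFF mask undoes sign extension of each byte. */
    static long bytesToLong(byte[] block) {
        long value = 0;
        for (int i = 0; i < 8; i++) {
            value = (value << 8) | (block[i] & 0xFFL);
        }
        return value;
    }

    public static void main(String[] args) {
        int[] reverse4 = {4, 3, 2, 1}; // toy table: reverses a 4-bit value
        System.out.println(permute(reverse4, 4, 0b0001)); // prints 8 (binary 1000)
        System.out.println(Integer.toHexString(rotateLeft28(0x8000001, 1))); // prints 3
    }
}
```

The recurring trap is that `byte`, `int`, and `long` are all signed in Java, so a plain `>>` or an unmasked byte-to-long widening can smear the sign bit across the upper bits of the result.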


I found the following resources useful in my study of DES:


  • - My DES implementation in Java