Caffeinated Bitstream

Bits, bytes, and words.

Apple Remote Desktop quirks

While developing Valence, an input-only Android VNC client for remote controlling a computer, I've encountered several notable quirks in Apple Remote Desktop (ARD), Mac OS's built-in VNC server. ARD is based on VNC, a system developed in the late 1990s for controlling a remote computer, and on its Remote Framebuffer (RFB) protocol. Standard VNC clients can generally interoperate with ARD. An ARD server reports version "3.889" of the RFB protocol, which isn't a real RFB version, but clients can use this number to tell that they are talking to an ARD server and not a conventional VNC server.
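As a rough illustration of the version check, here is a minimal Java sketch. It relies on the RFB ProtocolVersion message being a fixed 12-byte ASCII string of the form "RFB xxx.yyy\n"; the class and method names are hypothetical and not taken from Valence.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class RfbVersionCheck {
    // The RFB ProtocolVersion message is always 12 bytes: "RFB xxx.yyy\n".
    static final int VERSION_LENGTH = 12;

    /** Reads the server's 12-byte ProtocolVersion message from the stream. */
    static String readServerVersion(InputStream in) throws IOException {
        byte[] buf = new byte[VERSION_LENGTH];
        new DataInputStream(in).readFully(buf);
        return new String(buf, StandardCharsets.US_ASCII);
    }

    /** Returns true if the version string is the "3.889" value that ARD reports. */
    static boolean looksLikeArd(String version) {
        return version.equals("RFB 003.889\n");
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a socket stream, so the sketch runs without a server.
        InputStream fake = new ByteArrayInputStream(
                "RFB 003.889\n".getBytes(StandardCharsets.US_ASCII));
        System.out.println(looksLikeArd(readServerVersion(fake)));
    }
}
```

A client could use a check like this to decide whether to enable ARD-specific workarounds for the connection.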

ARD authentication

During the RFB handshake, a VNC server announces which authentication methods it supports, and the client then picks one of them. Most VNC servers support a scheme known as "VNC authentication," which is a simple DES-based password challenge/response system. ARD offers VNC authentication, but it also offers a proprietary scheme that lets the user supply a username as well. This scheme is known as "Mac authentication" or "ARD authentication." Mac OS X 10.7 Lion includes an important change to the way ARD works: If you connect with VNC authentication, you are presented with a login screen which prompts you for a username and password before allowing you to control the desktop. If you connect with ARD authentication, this login screen is bypassed and you can immediately control the desktop.
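The selection step might look something like the following sketch. In RFB 3.7 and later the server sends a list of one-byte security type numbers; 2 is the standard VNC authentication type. The value 30 for ARD authentication is an assumption based on what gtk-vnc uses, and the class name and the "prefer ARD when a username is available" policy are hypothetical.

```java
import java.util.List;

class SecurityTypeChooser {
    static final int VNC_AUTH = 2;   // standard DES challenge/response
    static final int ARD_AUTH = 30;  // assumption: the number gtk-vnc uses for ARD auth

    /**
     * Picks a security type from the server's offer. ARD authentication is
     * preferred when a username is available, since it bypasses Lion's login screen.
     */
    static int choose(List<Integer> offered, boolean haveUsername) {
        if (haveUsername && offered.contains(ARD_AUTH)) {
            return ARD_AUTH;
        }
        if (offered.contains(VNC_AUTH)) {
            return VNC_AUTH;
        }
        throw new IllegalStateException("no supported security type offered: " + offered);
    }
}
```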

Because Valence is an input-only VNC client, the login screen is a show-stopper: since the login screen cannot be seen, the user cannot log in and control the desktop. The only way for Valence to support Lion's built-in VNC server was to add the ARD authentication scheme as an option. Apple has a support article which gives a high-level overview of the ARD authentication process, but it unfortunately does not contain the technical detail needed to implement the scheme. I was able to discover the technical details by studying the gtk-vnc open-source library, which implements ARD authentication thanks to a patch provided by Håkon Enger last year. (I'm not sure how Mr. Enger figured out the technique, but I'm grateful.)

The basic steps for performing ARD authentication are as follows:

  1. Read the authentication material from the socket: a two-byte generator value, a two-byte key length value, the prime modulus (keyLength bytes), and the peer's generated public key (keyLength bytes).
  2. Generate your own Diffie-Hellman public-private key pair.
  3. Perform Diffie-Hellman key agreement, using the generator (g), prime (p), and the peer's public key. The output will be a shared secret known to both you and the peer.
  4. Perform an MD5 hash of the shared secret. This 128-bit (16-byte) value will be used as the AES key.
  5. Pack the username and password into a 128-byte plaintext "credentials" structure: { username[64], password[64] }. Null-terminate each. Fill the unused bytes with random characters so that the encryption output is less predictable.
  6. Encrypt the plaintext credentials using the AES 128-bit symmetric cipher in electronic codebook (ECB) mode, with the 128-bit MD5 hash from step 4 as the key. Use no further padding for this block cipher.
  7. Write the ciphertext from step 6 to the stream. Write your generated DH public key to the stream.
  8. Check for authentication pass/fail as usual.
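The steps above can be sketched in Java roughly as follows. This is a minimal illustration and not the implementation mentioned in this post: the Diffie-Hellman arithmetic is hand-rolled with BigInteger for brevity, and all class and method names are hypothetical. It also makes several assumptions that the steps do not spell out: multi-byte values are big-endian, the shared secret and the public key are left-padded with zeros to keyLength bytes, strings are encoded as UTF-8, and over-long strings are truncated to 63 bytes.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

class ArdAuthSketch {
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Converts a non-negative BigInteger to exactly `length` big-endian bytes. */
    static byte[] toFixedLength(BigInteger value, int length) {
        byte[] raw = value.toByteArray(); // may have a leading zero byte, or be short
        byte[] out = new byte[length];
        int copy = Math.min(raw.length, length);
        System.arraycopy(raw, raw.length - copy, out, length - copy, copy);
        return out;
    }

    /** Step 5: builds the 128-byte { username[64], password[64] } structure. */
    static byte[] packCredentials(String username, String password) {
        byte[] block = new byte[128];
        RANDOM.nextBytes(block); // random filler for the unused bytes
        putField(block, 0, username);
        putField(block, 64, password);
        return block;
    }

    private static void putField(byte[] block, int offset, String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        int n = Math.min(bytes.length, 63); // assumption: truncate, leaving room for the null
        System.arraycopy(bytes, 0, block, offset, n);
        block[offset + n] = 0;
    }

    /** Steps 4 and 6: MD5 the shared secret, then AES-128-ECB encrypt with no padding. */
    static byte[] encryptCredentials(byte[] sharedSecret, byte[] credentials)
            throws GeneralSecurityException {
        byte[] aesKey = MessageDigest.getInstance("MD5").digest(sharedSecret);
        Cipher aes = Cipher.getInstance("AES/ECB/NoPadding");
        aes.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(aesKey, "AES"));
        return aes.doFinal(credentials);
    }

    /** Runs steps 1 through 7 once ARD authentication has been selected. */
    static void authenticate(DataInputStream in, OutputStream out,
                             String username, String password)
            throws IOException, GeneralSecurityException {
        // Step 1: generator, key length, prime modulus, peer's public key.
        int generator = in.readUnsignedShort();
        int keyLength = in.readUnsignedShort();
        byte[] primeBytes = new byte[keyLength];
        in.readFully(primeBytes);
        byte[] peerKeyBytes = new byte[keyLength];
        in.readFully(peerKeyBytes);
        BigInteger g = BigInteger.valueOf(generator);
        BigInteger p = new BigInteger(1, primeBytes);
        BigInteger peerPublic = new BigInteger(1, peerKeyBytes);

        // Step 2: random private exponent x and public key g^x mod p.
        // (A production client would also validate p and reject degenerate exponents.)
        BigInteger privateKey = new BigInteger(keyLength * 8, RANDOM).mod(p);
        BigInteger publicKey = g.modPow(privateKey, p);

        // Step 3: shared secret = peerPublic^x mod p.
        byte[] secret = toFixedLength(peerPublic.modPow(privateKey, p), keyLength);

        // Steps 4 to 6: derive the AES key and encrypt the credentials.
        byte[] ciphertext = encryptCredentials(secret, packCredentials(username, password));

        // Step 7: ciphertext first, then our public key.
        out.write(ciphertext);
        out.write(toFixedLength(publicKey, keyLength));
        out.flush();
    }
}
```

After authenticate() returns, the caller reads the usual RFB security result from the stream (step 8).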

For further reference, my Java implementation of ARD authentication is available on GitHub.

Right-click problems in the Snow Leopard ARD v3.5 update

In July 2011, Apple pushed an update to Snow Leopard users which included a newer version of Apple Remote Desktop, the built-in VNC server. This new version, v3.5, interprets mouse buttons differently—the RFB "button-2" event is now used to indicate a right-click instead of the "button-3" event which is used on standard VNC implementations. This broke Valence's support for sending a right-click when the user performs a two-finger tap. To restore right-click support, I had to add a "send mouse button-2 instead of button-3" option to Valence's server configuration.
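To make the difference concrete, here is a small Java sketch of how a client might build the RFB PointerEvent message with a configurable right-click button. The message layout (type 5, a one-byte button mask in which button N is bit N-1, then 16-bit x and y) comes from the RFB protocol; the class, method, and option names are hypothetical and not taken from Valence.

```java
import java.nio.ByteBuffer;

class PointerEventSketch {
    static final int BUTTON_1 = 1 << 0; // left button
    static final int BUTTON_2 = 1 << 1; // middle button on standard VNC servers
    static final int BUTTON_3 = 1 << 2; // right button on standard VNC servers

    /** Builds a 6-byte RFB PointerEvent: message type 5, button mask, x, y. */
    static byte[] pointerEvent(int buttonMask, int x, int y) {
        return ByteBuffer.allocate(6) // ByteBuffer is big-endian by default
                .put((byte) 5)
                .put((byte) buttonMask)
                .putShort((short) x)
                .putShort((short) y)
                .array();
    }

    /** Chooses the right-click mask according to a per-server option. */
    static int rightClickMask(boolean sendButton2InsteadOfButton3) {
        return sendButton2InsteadOfButton3 ? BUTTON_2 : BUTTON_3;
    }
}
```

A full click is two messages: one with the button's bit set (press), followed by one with a mask of zero (release).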

It is unclear to me why ARD v3.5 does this in Snow Leopard, why this problem doesn't exist in Lion, and why this problem doesn't appear when using the ARD client.

Foreign keyboard key mapping

Valence can be used to send international keys to VNC servers running on Linux or Windows with no problem. However, I was surprised to discover that these key events were often not correctly consumed by Macs. Upon investigation, I learned that the problem was worse—using a foreign keyboard layout can cause the wrong keys to be consumed by the Mac. The root cause appears to be the lack of support in Mac OS for allowing applications (such as a VNC server) to inject synthetic keysyms (symbolic representations of keys). Only injection of physical keycodes (numeric representations of actual keys on a keyboard) is supported. (Note that I'm borrowing "keysym" and "keycode" from X11 terminology, but most systems have equivalent concepts.)

Operating systems generally provide layers of abstraction around keyboard input, and some background is required to understand the problem. Keyboard hardware sends a "scancode" to the computer for each key pressed, using a scheme that dates back to the electrical configuration of rows and columns in early keyboards. These scancodes are then translated into more useful values one or more times before being provided to an application. On a Linux/X11 system, for instance, the scancodes are converted to "keycodes" which are simple representations of each physical key on the keyboard. The scancode to keycode mapping allows computers to use keyboards with different scancode conventions. For example, the 7th key from the left on the top letter row may be represented by different scancode bytes on different keyboards, but the operating system will always translate that key to the same keycode. The keycode, which still represents a physical key, is then mapped into a virtual value known as a keysym, based on the configured keyboard layout. For example, that 7th key on the top letter row would result in a keysym for "Y" when using a U.S. keyboard layout, but it would be translated into a "Z" when using a German keyboard layout. These abstraction layers allow applications to be written without needing to consider the keyboard electronics or the arrangement of keys on the keyboard. When the application receives a "Y" keysym, it knows that the user intended to enter "Y" regardless of which physical key they pressed to make it happen.

VNC clients send symbolic key representations to avoid the headache of keyboard layouts. (The symbolic key representations used in the RFB protocol are defined to be identical to the X11 keysyms, but are trivially translatable into symbolic key values for other systems.) When a user types a "y", the VNC client sends 0x0079, the keysym for lowercase "y", regardless of which physical key was pressed. The VNC server then injects a synthetic key event with the appropriate symbolic key for "y" into the input system of the server. A user with a U.S. keyboard and a user with a German keyboard would both be able to type on a server with any configured keyboard layout without noticing any problems, because the VNC server injects the symbolic key and bypasses the whole messy business of keyboard layouts.
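For illustration, the key message itself is tiny. The following Java sketch builds an RFB KeyEvent (message type 4, a down flag, two padding bytes, and a 32-bit keysym, all big-endian); the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;

class KeyEventSketch {
    static final int KEYSYM_LOWERCASE_Y = 0x0079; // X11 keysym for "y"

    /** Builds an 8-byte RFB KeyEvent: type 4, down flag, 2 bytes padding, keysym. */
    static byte[] keyEvent(int keysym, boolean down) {
        return ByteBuffer.allocate(8) // ByteBuffer is big-endian by default
                .put((byte) 4)
                .put((byte) (down ? 1 : 0))
                .putShort((short) 0) // padding
                .putInt(keysym)
                .array();
    }
}
```

Note that nothing in the message identifies a physical key or a keyboard layout; the client sends one such message for the press and another for the release.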

That's the theory, anyway. Unfortunately, since Mac OS doesn't seem to support symbolic injection, only physical keycode injection, the whole system falls apart. Here's what happens when an American user presses the 7th key of the top letter row ("Y" on the U.S. keyboard) while using a VNC client to control a server configured with a German keyboard:

  • Linux and Windows. The VNC client sends 0x0079, the VNC symbolic key representation for the letter "y". The VNC server receives this event, and says, "I see you want to type the letter 'y' on this computer. I'll send the symbolic key for 'y' directly to the running application. We won't worry about what kind of keyboard is physically attached to the server... it doesn't matter!" The application receives the keysym for "y", and all is well.
  • Mac OS. The VNC client sends 0x0079, the VNC symbolic key representation for the letter "y". The VNC server receives this event, and says, "Gosh, the operating system's event API only lets me send physical key events, so I'll have to translate this into a physical keycode. I wonder how I can do that... Oh, I have a built-in list of how keysyms translate to keycodes for American keyboards, I'll use that! It looks like I should send the physical keycode representing the 7th key of the top letter row. Done!" After the VNC server injects the keycode, the operating system then performs its task of translating the physical keycode back into a keysym to be delivered to the application. Unfortunately, since the operating system is configured to use a German keyboard layout, the keycode is translated into the keysym for "z" instead of "y". The user is shocked to see a "z" appear on the screen after typing a "y"!
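The two paths can be modeled with a toy Java simulation. This is not how any real VNC server or operating system is implemented: the keycode numbers are made up, each layout contains only the two keys that differ, and all names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class KeyLayoutDemo {
    // Made-up keycodes for two physical keys; real values differ by platform.
    static final int KEYCODE_US_Y_POSITION = 100; // "y" on U.S., "z" on German
    static final int KEYCODE_US_Z_POSITION = 101; // "z" on U.S., "y" on German

    static final Map<Integer, Character> US = new HashMap<>();
    static final Map<Integer, Character> GERMAN = new HashMap<>();
    static {
        US.put(KEYCODE_US_Y_POSITION, 'y');
        US.put(KEYCODE_US_Z_POSITION, 'z');
        GERMAN.put(KEYCODE_US_Y_POSITION, 'z');
        GERMAN.put(KEYCODE_US_Z_POSITION, 'y');
    }

    /** Symbolic injection: the application receives the symbol unchanged. */
    static char injectSymbol(char keysym) {
        return keysym;
    }

    /**
     * Keycode-only injection: the server maps the symbol to a keycode with a
     * fixed U.S. table, then the OS maps the keycode back with its own layout.
     */
    static char injectViaKeycode(char keysym, Map<Integer, Character> serverLayout) {
        for (Map.Entry<Integer, Character> entry : US.entrySet()) {
            if (entry.getValue() == keysym) {
                return serverLayout.get(entry.getKey());
            }
        }
        throw new IllegalArgumentException("no U.S. keycode for " + keysym);
    }

    public static void main(String[] args) {
        System.out.println(injectSymbol('y'));             // prints "y"
        System.out.println(injectViaKeycode('y', GERMAN)); // prints "z"
    }
}
```

With a U.S. layout on the server the second path happens to return "y", which is why the problem only shows up when the server's layout differs from the one baked into the translation table.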

A sufficiently smart VNC server perhaps could try to work around this problem by (somehow) being aware of which keyboard layout is in use on the server, and having the information needed to translate keysyms to keycodes for all possible layouts. This may not even be feasible, but even if it is, it doesn't solve the problem: the remote user would still be unable to type characters that are not physically present on the server's keyboard.

A brief survey of open-source software that deals with key injection on Mac OS shows that my Valence app isn't alone in suffering from this issue—they all do. In fact, even Apple Remote Desktop cannot handle this correctly! The usual advice for using an ARD client to control an ARD server with a different keyboard mapping is to change the keyboard mapping on either the client or the server.

I have no solution for the foreign key problem at this time.