Caffeinated Bitstream

Bits, bytes, and words.


1 2 3 4 5 Next page »
Implementing SCTP to support WebRTC data channels in pure Rust

I recently talked at the Denver Rust Meetup about my side project to implement WebRTC data channels in Rust. I'll probably blog about this project in more detail in the future, but in the meantime here are the slides and a short demo video.

I also made the following informal "studio" video of the talk. The live version was probably better, but hopefully this is good enough to get the points across.

Tokio internals: Understanding Rust's asynchronous I/O framework from the bottom up

Tokio is a Rust framework for developing applications which perform asynchronous I/O — an event-driven approach that can often achieve better scalability, performance, and resource usage than conventional synchronous I/O. Unfortunately, Tokio is notoriously difficult to learn due to its sophisticated abstractions. Even after reading the tutorials, I didn't feel that I had internalized the abstractions sufficiently to be able to reason about what was actually happening.

My prior experience with asynchronous I/O programming may have even hindered my Tokio education. I'm accustomed to using the operating system's selection facility (e.g. Linux epoll) as a starting point, and then moving on to dispatch, state machines, and so forth. Starting with the Tokio abstractions with no clear insight into where and how the underlying epoll_wait() happens, I found it difficult to connect all the dots. Tokio and its future-driven approach felt like something of a black box.

Instead of continuing on a top-down approach to learning Tokio, I decided to instead take a bottom-up approach by studying the source code to understand exactly how the current concrete implementation drives the progression from epoll events to I/O consumption within a Future::poll(). I won't go into great detail about the high-level usage of Tokio and futures, as that is better covered in the existing tutorials. I'm also not going to discuss the general problem of asynchronous I/O beyond a short summary, since entire books could be written on the subject. My goal is simply to have some confidence that futures and Tokio's polling work the way I expect.

First, some important disclaimers. Note that Tokio is actively being developed, so some of the observations here may quickly become out-of-date. For the purposes of this study, I used tokio-core 0.1.10, futures 0.1.17, and mio 0.6.10. Since I wanted to understand Tokio at its lowest levels, I did not consider higher-level crates like tokio-proto and tokio-service. The tokio-core event system itself has a lot of moving pieces, most of which I avoid discussing in the interest of brevity. I studied Tokio on a Linux system, and some of the discussion necessarily touches on platform-dependent implementation details such as epoll. Finally, everything mentioned here is my interpretation as a newcomer to Tokio, so there could be errors or misunderstandings.

Asynchronous I/O in a nutshell

Synchronous I/O programming involves performing I/O operations which block until completion. Reads will block until data arrives, and writes will block until the outgoing bytes can be delivered to the kernel. This fits nicely with conventional imperative programming, where a series of steps are executed one after the other. For example, consider an HTTP server that spawns a new thread for each connection. On this thread, it may read bytes until an entire request is received (blocking as needed until all bytes arrive), processes the request, and then write the response (blocking as needed until all bytes are written). This is a very straightforward approach. The downside is that a distinct thread is needed for each connection due to the blocking, each with its own stack. In many cases this is fine, and synchronous I/O is the correct approach. However, the thread overhead hinders scalability on servers trying to handle a very large number of connections (see: the C10k problem), and may also be excessive on low-resource systems handling a few connections.

If our HTTP server was written to use asynchronous I/O, on the other hand, it might perform all I/O processing on a single thread. All active connections and the listening socket would be configured as non-blocking, monitored for read/write readiness in an event loop, and execution would be dispatched to handlers as events occur. State and buffers would need to be maintained for each connection. If a handler is only able to read 100 bytes of a 200-byte request, it cannot wait for the remaining bytes to arrive, since doing so would prevent other connections from making progress. It must instead store the partial read in a buffer, keep the state set to "reading request", and return to the event loop. The next time the handler is called for this connection, it may read the remainder of the request and transition to a "writing response" state. Implementing such a system can become hairy very fast, with complex state machines and error-prone resource management.

The ideal asynchronous I/O framework would provide a means of writing such I/O processing steps one after the other, as if they were blocking, but behind the scenes generate an event loop and state machines. That's a tough goal in most languages, but Tokio brings us pretty close.

The Tokio stack

The Tokio stack consists of the following components:

  1. The system selector. Each operating system provides a facility for receiving I/O events, such as epoll (Linux), kqueue() (FreeBSD/Mac OS), or IOCP (Windows).
  2. Mio - Metal I/O. Mio is a Rust crate that provides a common API for low-level I/O by internally handling the specific details for each operating system. Mio deals with the specifics of each operating system's selector so you don't have to.
  3. Futures. Futures provide a powerful abstraction for representing things that have yet to happen. These representations can be combined in useful ways to create composite futures describing a complex sequence of events. This abstraction is general enough to be used for many things besides I/O, but in Tokio we develop our asynchronous I/O state machines as futures.
  4. Tokio The tokio-core crate provides the central event loop which integrates with Mio to respond to I/O events, and drives futures to completion.
  5. Your program. A program using the Tokio framework can construct asynchronous I/O systems as futures, and provide them to the Tokio event loop for execution.

Mio: Metal I/O

Mio provides a low-level I/O API allowing callers to receive events such as socket read/write readiness changes. The highlights are:

  1. Poll and Evented. Mio supplies the Evented trait to represent anything that can be a source of events. In your event loop, you register a number of Evented's with a mio::Poll object, then call mio::Poll::poll() to block until events have occurred on one or more Evented objects (or the specified timeout has elapsed).
  2. System selector. Mio provides cross-platform access to the system selector, so that Linux epoll, Windows IOCP, FreeBSD/Mac OS kqueue(), and potentially others can all be used with the same API. The overhead required to adapt the system selector to the Mio API varies. Because Mio provides a readiness-based API similar to Linux epoll, many parts of the API can be one-to-one mappings when using Mio on Linux. (For example, mio::Events essentially is an array of struct epoll_event.) In contrast, because Windows IOCP is completion-based instead of readiness-based, a bit more adaptation is required to bridge the two paradigms. Mio supplies its own versions of std::net structs such as TcpListener, TcpStream, and UdpSocket. These wrap the std::net versions, but default to non-blocking and provide Evented implementations which add the socket to the system selector.
  3. Non-system events. In addition to providing readiness of I/O sources, Mio can also indicate readiness events generated in user-space. For example, if a worker thread finishes a unit of work, it can signal completion to the event loop thread. Your program calls Registration::new2() to obtain a (Registration, SetReadiness) pair. The Registration object is an Evented which can be registered with Mio in your event loop, and set_readiness() can be called on the SetReadiness object whenever readiness needs to be indicated. On Linux, non-system event notifications are implemented using a pipe. When SetReadiness::set_readiness() is called, a 0x01 byte is written to the pipe. mio::Poll's underlying epoll is configured to monitor the reading end of the pipe, so epoll_wait() will unblock and Mio can deliver the event to the caller. Exactly one pipe is created when Poll is instantiated, regardless of how many (if any) non-system events are later registered.

Every Evented registration is associated with a caller-provided usize value typed as mio::Token, and this value is returned with events to indicate the corresponding registration. This maps nicely to the system selector in the Linux case, since the token can be placed in the 64-bit epoll_data union which functions in the same way.

To provide a concrete example of Mio operation, here's what happens internally when we use Mio to monitor a UDP socket on a Linux system:

  1. Create the socket.
    let socket = mio::net::UdpSocket::bind(

    This creates a Linux UDP socket, wrapped in a std::net::UdpSocket, which itself is wrapped in a mio::net::UdpSocket. The socket is set to be non-blocking.

  2. Create the poll.
    let poll = mio::Poll::new().unwrap();

    Mio initializes the system selector, readiness queue (for non-system events), and concurrency protection. The readiness queue initialization creates a pipe so readiness can be signaled from user-space, and the pipe's read file descriptor is added to the epoll. When a Poll object is created, it is assigned a unique selector_id from an incrementing counter.

  3. Register the socket with the poll.

    The UdpSocket's Evented.register() function is called, which proxies to a contained EventedFd which adds the socket's file descriptor to the poll selector (by ultimately using epoll_ctl(fepd, EPOLL_CTL_ADD, fd, &epoll_event) where is set to the provided token value). When a UdpSocket is registered, its selector_id is set to the Poll's, thus associating it with the selector.

  4. Call poll() in an event loop.
    loop {
        poll.poll(&mut events, None).unwrap();
        for event in &events {

    The system selector (epoll_wait()) and then the readiness queue are polled for new events. (The epoll_wait() blocks, but because non-system events trigger epoll via the pipe in addition to pushing to the readiness queue, they will still be processed in a timely manner.) The combined set of events are made available to the caller for processing.

Futures and Tasks

Futures are techniques borrowed from functional programming whereby computation that has yet to happen can be represented as a "future", and these individual futures can be combined to develop complex systems. This is useful for asynchronous I/O because the basic steps needed to perform transactions can be modeled as such combined futures. In the HTTP server example, one future may read a request by reading bytes as they become available until the end of the request is reached, at which time a "Request" object is yielded. Another future may process a request and yield a response, and yet another future may write responses.

In Rust, futures are implemented in the futures crate. You can define a future by implementing the Future trait, which requires a poll() method which is called as needed to allow the future to make progress. This method returns either an error, an indication that the future is still pending thus poll() should be called again later, or a yielded value if the future has reached completion. The Future trait also provides a great many combinators as default methods.

To understand futures, it is crucial to understand tasks, executors, and notifications — and how they arrange for a future's poll() method to be called at the right time. Every future is executed within a task context. A task itself is directly associated with exactly one future, but this future may be a composite future that drives many contained futures. (For example, multiple futures joined into a single future using the join_all() combinator, or two futures executed in series using the and_then() combinator.)

Tasks and their futures require an executor to run. An executor is responsible for polling the task/future at the correct times — usually when it has been notified that progress can be made. Such a notification happens when some other code calls the notify() method of the provided object implementing the futures::executor::Notify trait. An example of this can be seen in the extremely simple executor provided by the futures crate that is invoked when calling the wait() method on a future. From the source code:

/// Waits for the internal future to complete, blocking this thread's
/// execution until it does.
/// This function will call `poll_future` in a loop, waiting for the future
/// to complete. When a future cannot make progress it will use
/// `thread::park` to block the current thread.
pub fn wait_future(&mut self) -> Result<F::Item, F::Error> {
    ThreadNotify::with_current(|notify| {

        loop {
            match self.poll_future_notify(notify, 0)? {
                Async::NotReady => notify.park(),
                Async::Ready(e) => return Ok(e),

Given a futures::executor::Spawn object previously created to fuse a task and future, this executor calls poll_future_notify() in a loop. The provided Notify object becomes part of the task context and the future is polled. If a future's poll() returns Async::NotReady indicating that the future is still pending, it must arrange to be polled again in the future. It can obtain a handle to its task via futures::task::current() and call the notify() method whenever the future can again make progress. (Whenever a future is being polled, information about its associated task is stored in a thread-local which can be accessed via current().) In the above case, if the poll returns Async::NotReady, the executor will block until the notification is received. Perhaps the future starts some work on another thread which will call notify() upon completion, or perhaps the poll() itself calls notify() directly before returning Async::NotReady. (The latter is not common, since theoretically a poll() should continue making progress, if possible, before returning.)

The Tokio event loop acts as a much more sophisticated executor that integrates with Mio events to drive futures to completion. In this case, a Mio event indicating socket readiness will result in a notification that causes the corresponding future to be polled.

Tasks are the basic unit of execution when dealing with futures, and are essentially green threads providing a sort of cooperative multitasking, allowing multiple execution contexts on one operating system thread. When one task is unable to make progress, it will yield the processor to other runnable tasks. It is important to understand that notifications happen at the task level and not the future level. When a task is notified, it will poll its top-level future, which may result in any or all of the child futures (if present) being polled. For example, if a task's top-level future is a join_all() of ten other futures, and one of these futures arranges for the task to be notified, all ten futures will be polled whether they need it or not.

Tokio's interface with Mio

Tokio converts task notifications into Mio events by using Mio's "non-system events" feature described above. After obtaining a Mio (Registration, SetReadiness) pair for the task, it registers the Registration (which is an Evented) with Mio's poll, then wraps the SetReadiness object in a MySetReadiness which implements the Notify trait. From the source code:

struct MySetReadiness(mio::SetReadiness);

impl Notify for MySetReadiness {
    fn notify(&self, _id: usize) {
              .expect("failed to set readiness");

In this way, task notifications are converted into Mio events, and can be processed in Tokio's event handling and dispatch code along with other types of Mio events.

Just as Mio wraps std::net structs such as UdpSocket, TcpListener, and TcpStream to customize functionality, Tokio also uses composition and decoration to provide Tokio-aware versions of these types. For example, Tokio's UdpSocket looks something like this:

Tokio's versions of these I/O source types provide constructors that require a handle to the event loop (tokio_core::reactor::Handle). When instantiated, these types will register their sockets with the event loop's Mio poll to receive edge-triggered events with a newly assigned even-numbered token. (More on this, below.) Conveniently, these types will also arrange for the current task to be notified of read/write readiness whenever the underlying I/O operation returns WouldBlock.

Tokio registers several types of Evented's with Mio, keyed to specific tokens:

  • Token 0 (TOKEN_MESSAGES) is used for Tokio's internal message queue, which provides a means of removing I/O sources, scheduling tasks to receive read/write readiness notifications, configuring timeouts, and running arbitrary closures in the context of the event loop. This can be used to safely communicate with the event loop from other threads. For example, Remote::spawn() marshals the future to the event loop via the message system.

    The message queue is implemented as a futures::sync::mpsc stream. As a futures::stream::Stream (which is similar to a future, except it yields a sequence of values instead of a single value), the processing of this message queue is performed using the MySetReadiness scheme mentioned above, where the Registration is registered with the TOKEN_MESSAGES token. When TOKEN_MESSAGES events are received, they are dispatched to the consume_queue() method for processing. (Source: enum Message, consume_queue())

  • Token 1 (TOKEN_FUTURE) is used to notify Tokio that the main task needs to be polled. This happens when a notification occurs which is associated with the main task. (In other words, the future passed to Core::run() or a child thereof, not a future running in a different task via spawn().) This also uses a MySetReadiness scheme to translate future notifications into Mio events. Before a future running in the main task returns Async::NotReady, it will arrange for a notification to be sent later in a manner of its choosing. When the resulting TOKEN_FUTURE event is received, the Tokio event loop will re-poll the main task.

  • Even-numbered tokens greater than 1 (TOKEN_START+key*2) are used to indicate readiness changes on I/O sources. The key is the Slab key for the associated Core::inner::io_dispatch Slab<ScheduledIo> element. The Mio I/O source types (UdpSocket, TcpListener, and TcpStream) are registered with such a token automatically when the corresponding Tokio I/O source types are instantiated.

  • Odd-numbered tokens greater than 1 (TOKEN_START+key*2+1) are used to indicate that a spawned task (and thus its associated future) should be polled. The key is the Slab key for the associated Core::inner::task_dispatch Slab<ScheduledTask> element. As with TOKEN_MESSAGES and TOKEN_FUTURE events, these also use the MySetReadiness plumbing.

Tokio event loop

Tokio, specifically tokio_core::reactor::Core, provides the event loop to manage futures and tasks, drive futures to completion, and interface with Mio so that I/O events will result in the correct tasks being notified. Using the event loop involves instantiating the Core with Core::new() and calling Core::run() with a single future. The event loop will drive the provided future to completion before returning. For server applications, this future is likely to be long-lived. It may, for example, use a TcpListener to continuously accept new incoming connections, each of which may be handled by their own future running independently in a separate task created by Handle.spawn().

The following flow chart outlines the basic steps of the Tokio event loop:

What happens when data arrives on a socket?

A useful exercise for understanding Tokio is to examine the steps that occur within the event loop when data arrives on a socket. I was surprised to discover that this ends up being a two-part process, with each part requiring a separate epoll transaction in a separate iteration of the event loop. The first part responds to a socket becoming read-ready (i.e., a Mio event with an even-numbered token greater than one for spawned tasks, or TOKEN_FUTURE for the main task) by sending a notification to the task which is interested in the socket. The second part handles the notification (i.e., a Mio event with an odd-numbered token greater than one) by polling the task and its associated future. We'll consider the steps in a scenario where a spawned future is reading from a UdpSocket on a Linux system, from the top of the Tokio event loop, assuming that a previous poll of the future resulted in a recv_from() returning a WouldBlock error.

The Tokio event loop calls mio::Poll::poll(), which in turn (on Linux) calls epoll_wait(), which blocks until some readiness change event occurs on one of the monitored file descriptors. When this happens, epoll_wait() returns an array of epoll_event structs describing what has occurred, which are translated by Mio into mio::Events and returned to Tokio. (On Linux, this translation should be zero-cost, since mio::Events is just a single-tuple struct of a epoll_event array.) In our case, assume the only event in the array is indicating read readiness on the socket. Because the event token is even and greater than one, Tokio interprets this as an I/O event, and looks up the details in the corresponding element of Slab<ScheduledIo>, which contains information on any tasks interested in read and write readiness for this socket. Tokio then notifies the reader task which, by way of the MySetReadiness glue described earlier, calls Mio's set_readiness(). Mio handles this non-system event by adding the event details to its readiness queue, and writing a single 0x01 byte to the readiness pipe.

After the Tokio event loop moves to the next iteration, it once again polls Mio, which calls epoll_wait(), which this time returns a read readiness event occurring on Mio's readiness pipe. Mio reads the 0x01 which was previously written to the pipe, dequeues the non-system event details from the readiness queue, and returns the event to Tokio. Because the event token is odd and greater than one, Tokio interprets this as a task notification event, and looks up the details in the corresponding element of Slab<ScheduledTask>, which contains the task's original Spawn object returned from spawn(). Tokio polls the task and its future via poll_future_notify(). The future may then read data from the socket until it gets a WouldBlock error.

This two-iteration approach involving a pipe write and read may add a little overhead when compared to other asynchronous I/O event loops. In a single-threaded program, it is weird to look at the strace and see a thread use a pipe to communicate with itself:

 1 2 3 4 5 6 7 8 91011
pipe2([4, 5], O_NONBLOCK|O_CLOEXEC) = 0
epoll_wait(3, [{EPOLLIN|EPOLLOUT, {u32=14, u64=14}}], 1024, -1) = 1
write(5, "\1", 1) = 1
epoll_wait(3, [{EPOLLIN, {u32=4294967295, u64=18446744073709551615}}], 1024, 0) = 1
read(4, "\1", 128) = 1
read(4, 0x7ffce1140f58, 128) = -1 EAGAIN (Resource temporarily unavailable)
recvfrom(12, "hello\n", 1024, 0, {sa_family=AF_INET, sin_port=htons(43106), sin_addr=inet_addr("")}, [16]) = 6
recvfrom(12, 0x7f576621c800, 1024, 0, 0x7ffce1140070, 0x7ffce114011c) = -1 EAGAIN (Resource temporarily unavailable)
epoll_wait(3, [], 1024, 0) = 0
epoll_wait(3, 0x7f5765b24000, 1024, -1) = -1 EINTR (Interrupted system call)

Mio uses this pipe scheme to support the general case where set_readiness() may be called from other threads, and perhaps it also has some benefits in forcing the fair scheduling of events and maintaining a layer of indirection between futures and I/O.

Lessons learned: Combining futures vs. spawning futures

When I first started exploring Tokio, I wrote a small program to listen for incoming data on several different UDP sockets. I created ten instances of a socket-reading future, each of them listening on a different port number. I naively joined them all into a single future with join_all(), passed the combined future to Core::run(), and was surprised to discover that every future was being polled whenever a single packet arrived. Also somewhat surprising was that tokio_core::net::UdpSocket::recv_from() (and its underlying PollEvented) was smart enough to avoid actually calling the operating system's recvfrom() on sockets that had not been flagged as read-ready in a prior Mio poll. The strace, reflecting a debug println!() in my future's poll(), looked something like this:

 1 2 3 4 5 6 7 8 9101112131415161718192021
epoll_wait(3, [{EPOLLIN|EPOLLOUT, {u32=14, u64=14}}], 1024, -1) = 1
write(5, "\1", 1) = 1
epoll_wait(3, [{EPOLLIN, {u32=4294967295, u64=18446744073709551615}}], 1024, 0) = 1
read(4, "\1", 128) = 1
read(4, 0x7ffc183129d8, 128) = -1 EAGAIN (Resource temporarily unavailable)
write(1, "UdpServer::poll()...\n", 21) = 21
write(1, "UdpServer::poll()...\n", 21) = 21
write(1, "UdpServer::poll()...\n", 21) = 21
write(1, "UdpServer::poll()...\n", 21) = 21
write(1, "UdpServer::poll()...\n", 21) = 21
write(1, "UdpServer::poll()...\n", 21) = 21
write(1, "UdpServer::poll()...\n", 21) = 21
recvfrom(12, "hello\n", 1024, 0, {sa_family=AF_INET, sin_port=htons(43106), sin_addr=inet_addr("")}, [16]) = 6
getsockname(12, {sa_family=AF_INET, sin_port=htons(2006), sin_addr=inet_addr("")}, [16]) = 0
write(1, "recv 6 bytes from at\n", 52) = 52
recvfrom(12, 0x7f2a11c1c400, 1024, 0, 0x7ffc18312ba0, 0x7ffc18312c4c) = -1 EAGAIN (Resource temporarily unavailable)
write(1, "UdpServer::poll()...\n", 21) = 21
write(1, "UdpServer::poll()...\n", 21) = 21
write(1, "UdpServer::poll()...\n", 21) = 21
epoll_wait(3, [], 1024, 0) = 0
epoll_wait(3, 0x7f2a11c36000, 1024, -1) = ...

Since the concrete internal workings of Tokio and futures were somewhat opaque to me, I suppose I hoped there was some magic routing happening behind the scenes that would only poll the required futures. Of course, armed with a better understanding of Tokio, it's obvious that my program was using futures like this:

This actually works fine, but is not optimal — especially if you have a lot of sockets. Because notifications happen at the task level, any notification arranged in any of the green boxes above will cause the main task to be notified. It will poll its FromAll future, which itself will poll each of its children. What I really need is a simple main future that uses Handle::spawn() to launch each of the other futures in their own tasks, resulting in an arrangement like this:

When any future arranges a notification, it will cause only the future's specific task to be notified, and only that future will be polled. (Recall that "arranging a notification" happens automatically when tokio_core::net::UdpSocket::recv_from() receives WouldBlock from its underlying mio::net::UdpSocket::recv_from() call.) Future combinators are powerful tools for describing protocol flow that would otherwise be implemented in hand-rolled state machines, but it's important to understand where your design may need to support separate tasks that can make progress independently and concurrently.

Final thoughts

Studying the source code of Tokio, Mio, and futures has really helped solidify my comprehension of Tokio, and validates my strategy of clarifying abstractions through the understanding of their concrete implementations. This approach could pose a danger of only learning narrow use cases for the abstractions, so we must consciously consider the concretes as only being examples that shed light on the general cases. Reading the Tokio tutorials after studying the source code, I find myself with a bit of a hindsight bias: Tokio makes sense, and should have been easy to understand to begin with!

I still have a few remaining questions that I'll have to research some other day:

  • Does Tokio deal with the starvation problem of edge triggering? I suppose it could be handled within the future by limiting the number of read/writes in a single poll(). When the limit is reached, the future could return early after explicitly notifying the current task instead of relying on the implicit "schedule-on-WouldBlock" behavior of the Tokio I/O source types, thus allowing other tasks and futures a chance to make progress.
  • Does Tokio support any way of running the event loop itself on multiple threads, instead of relying on finding opportunities to offload work to worker threads to maximize use of processor cores?

UPDATE 2017-12-19: There is a Reddit thread on r/rust discussing this post. Carl Lerche, author of Mio, has posted some informative comments here and here. In addition to addressing the above questions, he notes that FuturesUnordered provides a means of combining futures such that only the relevant child futures will be polled, thus avoiding polling every future as join_all() would, with the tradeoff of additional allocations. Also, a future version of Tokio will be migrating away from the mio::Registration scheme for notifying tasks, which could streamline some of the steps described earlier.

UPDATE 2017-12-21: It looks like Hacker News also had a discussion of this post.

UPDATE 2018-01-26: I created a GitHub repository for my Tokio example code.

The Dream Machine, and highlights from the dawn of computing

It's easy to imagine that computing sprang into existence with the advent of home computers in the late 1970's and 1980's, just as many people have the perception that the Internet sprang into existence in the mid- to late-1990's. In both cases, these technologies crossed thresholds that made them accessible to general consumers, leading to greatly increased usage that makes their pre-consumer existences seem quite meager. However, in the early days great minds were hard at work on developing revolutionary ideas that are so ingrained in computing today that they are largely taken for granted.

I've gained a much better appreciation for this pre-PC era of computing after recently reading The Dream Machine: J.C.R. Licklider and the Revolution That Made Computing Personal, written by M. Mitchell Waldrop and published in 2002. While ostensibly a biography of J.C.R. Licklider, it actually devotes more text to covering the contributions of many other pioneers, including Norbert Wiener, Vannevar Bush, Claude Shannon, Alan Turing, John von Neumann, John McCarthy, Douglas Engelbart, Bob Taylor, and Alan Kay. Licklider's life and career provide the book with a convenient backbone since he crossed paths with so many other notable figures, and also because of the book's focus on how these people helped bring about Licklider's vision of interactive computing as a means of boosting human productivity.

I can definitely recommend this book to anyone interested in the history of computing, as it clearly conveys how our modern world of computing didn't arrive overnight but rather is the result of a long continuum of ideas from throughout the 20th century. In the words of the author:

I finally began to realize that windows, icons, mice, word processors, spreadsheets, and all the other things that seem so important to the modern software industry actually come at the end of the story, not at the beginning.

— Waldrop, M. Mitchell. The Dream Machine: J. C. R. Licklider and the Revolution That Made Computing Personal (p. 472). Kindle edition.

In this post, I'll share some notable highlights from this history that seemed to fall into a few common themes that I found interesting.

The time travelers

Douglas Engelbart delivers what was later known as The Mother of All Demos in 1968.

Some of the pioneers of computing were so far ahead of their time, that they almost seem like time travelers from the future.

  • Vannevar Bush's Memex. In his 1939 draft article submitted to Fortune, Vannevar Bush described a desktop device that would allow a user to instantly access a library of printed material, all of which could be connected with links so the user could effortlessly jump from one resource to another, with the goal of enhancing the productivity of intellectual workers. Although it predates the World Wide Web by 52 years, it's astonishing how similar Bush's Memex system was to hypertext and Wikipedia. The Memex was described as an electromechanical system based on microfilm. While it was never built, the idea inspired many later researchers.
  • The Mother of All Demos. In 1968, Douglas Englebart delivered a groundbreaking technology demonstration showing his team's work at Stanford Research Institute on the "oN-Line System" (NLS) which introduced a number of new ideas such as the mouse, a practical hypertext system, and a raster-scan video output. It integrated a number of pre-existing ideas such as word processing to create a unique system for productivity and online collaboration. The demonstration itself was extraordinarily elaborate for its day, featuring a 22-foot tall projector screen, microwave video links back to SRI, and backstage video mixing. It later came to be known as The Mother of All Demos. It's a testament to how far Englebart's vision has become reality that, after reading about his presentation from 50 years ago, you can follow a link to YouTube and be watching it in seconds.

World War II

Reading the chapters that covered the 1930's and 1940's, I was struck at how much progress was put on hold as World War II threw a colossal non-maskable interrupt at the scientists working on the foundations of computing. Industry and academia shifted gears to support the war effort, and individual innovators put their computing projects aside while they lent their skills to facing the imminent threats. The author concludes that the war ultimately forged the pieces of computer theory "into a unified whole". (Ibid. p. 40) Nonetheless, the book contains a number of examples where critical ideas were put on ice for the duration of the war, such as:

  • Vannevar Bush put his Memex research on hold in 1940 to create and lead the National Defense Research Committee (NDRC) to organize scientific research into defense technology. Norbert Wiener proposed to include digital computer research in the NDRC's scope, but Bush felt the need to prioritize more immediately useful technology such as radar, anti-aircraft fire control, anti-submarine warfare, and the atomic bomb. It wasn't until 1945 that Bush returned to the topic and delivered his influential essay, "As We May Think".
  • Claude Shannon's 1937 master's thesis "A Symbolic Analysis of Relay and Switching Circuits" laid the foundation for digital computing, but his even more revolutionary contributions would have to wait. In 1941, he joined Bell Labs and worked on anti-aircraft fire control and encrypted radio systems during the day, and spent what time he could in the evenings developing his ideas about the fundamentals of communication. It wasn't until 1948 that he published his ideas in "A Mathematical Theory of Communication" and became known as the father of Information Theory.
  • John V. Atanasoff invented the first electronic digital computer in 1939. In 1942 he left academia to oversee the acoustic testing of mines at the Naval Ordnance Laboratory, and continued to work in non-computing fields after the war.

An interesting counterpoint is John von Neumann's career. The war led him to join the Manhattan Project, where his experience with performing large-scale calculations on mechanical tabulators led him to develop his ideas about electronic computing, ultimately leading to his famous von Neumann architecture.

The things we take for granted

Many ideas that seem obvious to us today were actually not obvious at all, but rather had to be invented. Some examples include:

  • Interactive computing. We take it for granted today that you use a computer by entering commands (via the keyboard, mouse, or the touchscreen on your phone or tablet), receive immediate feedback on the screen, then enter more commands. In the 1950's and 1960's, however, such a usage model was not only uncommon, but actually controversial.

    One of the central themes of the book is Licklider's push for interactive computing, where an operator can work with a computer in real time to solve problems through the exploration of ideas, trial and error, and rapid feedback. This went against the prevailing idea of the era that batch processing on centralized computers would always make more sense. Proponents of batch processing maintained that no matter how inexpensive a unit of computation became, it would still be most efficient to concentrate these resources in one giant computer that could focus its entire capacity on performing a sequence of submitted tasks one after the other. Indeed, early attempts to provide interactivity via time-sharing suffered from considerable overhead as the processor had to switch between many users fast enough to maintain the illusion that each user had his or her own computer. Ultimately, the productivity benefits outweighed the overhead, and we all use computers interactively today.

  • Programmability. Even though Charles Babbage had described how a programmable machine might be built in the 1830's, the concept was still exotic in the 1930's. The idea that a machine might make a decision and choose between alternate courses of action was quite radical and sounded eerily similar to thinking. Thus, Howard Aiken's Mark I, one of the first programmable computers, was often called "the electronic brain". (Interestingly, according to Wikipedia, loops were initially implemented on the Mark I by physically creating loops of tape so the same instructions could be fed to the processor repeatedly!)

    Today, programmability seems like the essence of computing. But in those days, even after they invented a means of implementing Babbage's vision, it took a great deal of further thinking about how to harness the power of programming. In 1947, Herman H. Goldstine and John von Neumann published "Planning and Coding Problems for an Electronic Computing Instrument" which outlined techniques such as flow charting and subroutines, and software engineering was born. (Ibid. p. 87)

    Years later, in the 1950's, engineers were still struggling to grasp the complexities of software:

    Lincoln Lab's initial guess for the programming requirements on SAGE — that it would require perhaps a few thousand lines of computer code to run the entire air-defense system — was turning out to be the most laughable underestimate of the whole project. True, the Lincoln Lab team was hardly alone in that regard. Many computer engineers still regarded programming as an afterthought: what could be so hard about writing down a logical sequence of commands?

    — Waldrop, M. Mitchell. The Dream Machine: J. C. R. Licklider and the Revolution That Made Computing Personal (p. 118). Kindle Edition.

  • Binary math. We take it as a given today that binary is the most fundamental unit of information and, given that electrical switches naturally hold a binary state (off or on), the most elegant way of expressing values in a computer. This was not always obvious. Designers of early computing devices assumed that decimal arithmetic was the natural approach, as each digit could store more information and no base conversion of inputs and outputs was needed for the benefit of humans. Manufacturers of mechanical calculators in the 1930's actually used complex systems of gears to perform decimal arithmetic end-to-end. Even Aiken's Mark I, which was proposed in 1937 but not built until 1944, operated directly on decimal numbers.

    Everything changed in 1937 with Claude Shannon's master's thesis, "A Symbolic Analysis of Relay and Switching Circuits", which proposed binary-oriented digital circuits. Independently in 1937 at Bell Labs, George Stibitz invented a binary adding circuit. Shannon's 1948 paper "A Mathematical Theory of Communication" established that the fundamental unit of information is the bit, a term coined by his colleague J.W. Tukey, and the rest is history.


New technology often faces skepticism and opposition from people invested in the existing technology.

  • Packet-switched networks. People invested in the idea of circuit-switched communication were highly resistant to the idea of packet switching. The ARPANET designers were routinely criticized by their Pentagon colleagues and AT&T engineers for their decision to base their network on packet switching, receiving comments such as "this is how a telephone works...", "there's just no way this can work", and "the buffers are going to run out". (Ibid., p. 227)
  • Moore's Law. Jacob Goldman, creator of Xerox's Palo Alto Research Center (PARC), tried to explain Moore's Law to the pencil pushers at Xerox, but they refused to believe that such a thing could even be possible. (Ibid., p. 389)
  • Laser printers. Goldman also struggled with convincing Xerox of the value of one of PARC's legendary inventions, the laser printer. Apparently many at Xerox felt safer going with an alternative printing technology developed by another team, which basically involved glueing a CRT screen to a photocopier. In the end, Goldman somehow managed to convince the product selection committee to embrace the laser printer, which eventually made billions of dollars for Xerox. (Ibid. p. 392)
  • Internet. As late as 1990, it was still hard to sell people on the value of the Internet. Notably, AT&T looked into the business potential of the Internet, but concluded that it couldn't be profitable. (Ibid., pp. 462-463)
Cursive: Writing terminal applications in Rust

As a learning exercise to sharpen my Rust programming skills, I recently toyed with writing a small program that uses a terminal-based user interface which I built using the Cursive crate developed by Alexandre Bury. Cursive provides a high-level framework for building event-driven terminal applications using visual components such as menu bars, text areas, lists, dialog boxes, etc. Conceptually, developing with Cursive is one level of abstraction higher than using a library such as ncurses, which provides a more raw interface to managing screen contents and translating updates to the terminal's native language. In fact, Cursive defaults to using ncurses as one of several possible backends, and allows setting themes to customize various text colors and styles.

Why write terminal applications?

In today's software world, no one writes terminal applications expecting them to be a hit with the masses. Graphical applications (e.g. desktop apps or web apps) provide a uniquely intuitive interface model that allows users to quickly become productive with a minimal learning curve, offer a high-bandwidth flow of information to the user, and remain the only reasonable solution for many problem categories. Many applications would simply not be possible or practical without a GUI. However, terminal programs can find a niche audience in technical users such as software developers and system administrators who are often in need of utilities that are a bit more two-dimensional than the command line's standard input and output, but retain the flexibility to be easily used remotely or on devices of limited capability.

Also, terminal apps are often extremely fast — fast enough to maintain the illusion of the computer being an extension of the mind. I find it frustrating that in 2017 I still spend plenty of time waiting for the computer to do something. Occasionally even typing into a text field in a web browser is laggy on my high-end late-model iMac. For every extra cycle the hardware engineers give us, we software engineers figure out some way to soak it up.

The terminal is not for everyone, but lately I've found it's the one environment that is instantaneous enough that my flow is not thrown off. For kicks, I recently installed XUbuntu on a $150 ARM Chromebook with the idea of mostly just using the terminal (and having a throwaway laptop that I'm not scared to use on the bus/train). I expected to mostly be using it as a dumb terminal to ssh into servers, but to my surprise, it has actually proven to be very capable at performing a wide range of local tasks in the terminal with good performance.

The Cursive framework

Anyone who has developed software with a GUI toolkit (e.g. Windows, GTK+, Java Swing, Cocoa, etc.) will find most Cursive concepts to be very familiar. Visual components are called "views" (some toolkits use use the terms "widget" or "control" for the same concept), and are installed into a tree which is traversed when rendering. Some views may contain child views and are used for layout (e.g. BoxView and LinearLayout), while others are used as leaf nodes that provide information or interact with the user (e.g. Button, EditView, TextView, SliderView, etc.). Cursive can maintain multiple view trees as "screens" which can be switched between. Each screen's view tree has a StackView as the root element, whose children are subtree "layers" that can be pushed and popped.

Cursive provides an event model where the main program invokes Cursive::run() and the Cursive event loop will render views and dispatch to registered callbacks (typically Rust closures) as needed until Cursive::quit() is called, at which time the event loop exits. Alternately, the main program may choose to exercise more control by calling Cursive::step() as needed to perform a single iteration of input processing, event dispatch, and view rendering. Key events are processed by whichever input view currently has focus, and the user may cycle focus using the tab key.

Referencing views

Cursive diverges from other UI toolkits with respect to referencing views. In many environments, we would simply store references or pointers to any views that we need to reference later, in addition to whatever references are needed internally by the view tree to form the parent-child relationships. However, Rust's strict ownership model requires us to be very explicit about how we allow multiple references to the same memory.

After the main program instantiates and configures a view object, it generally adds it to the view tree by making it the child of an existing view (e.g. LinearLayout::add_child()) or adding it to a screen's StackView as a layer. Rust ownership of the object is moved at that time, and it is no longer directly accessible to the main program.

To access specific views after they have been integrated into a view tree, views may be wrapped in an IdView via .with_id(&str) which allows them to be referenced later using the provided string identifier. A borrowed mutable reference to the wrapped view may be retrieved with Cursive::find_id() or a closure operating on the view may be invoked with Cursive::call_on_id(). Under the hood, these methods provide interior mutability by making use of RefCell and its runtime borrow checking to provide the caller with a borrowed mutable reference.

The following code demonstrates how views can be referenced by providing a callback which copies text from one view to the other:

 1 2 3 4 5 6 7 8 91011121314151617181920212223242526272829303132
extern crate cursive;

use cursive::Cursive;
use cursive::event::Key;
use cursive::view::*;
use cursive::views::*;

fn main() {
    let mut cursive = Cursive::new();

    // Create a view tree with a TextArea for input, and a
    // TextView for output.
    cursive.add_global_callback(Key::Esc, |c| {
        // When the user presses Escape, update the output view
        // with the contents of the input view.
        let input = c.find_id::<TextArea>("input").unwrap();
        let mut output = c.find_id::<TextView>("output").unwrap();

Early in my exploration of Cursive, this method of accessing views proved to be somewhat challenging since fetching references to two views in the same lexical scope would result in BorrowMutError panics, since the internals of the second find_id() would try to mutably borrow a reference to the first view while traversing the tree. Cursive's view lookup code has since been adjusted so that this is no longer an issue.


While developing a full application, I quickly ran into BorrowMutError panics again. With application logic tied to my custom view implementations, and some such code needing to call methods on other custom views, inevitably some code would need to mutably borrow a view that was already borrowed somewhere further up the stack.

My solution was to completely decouple UI concerns from the application logic, resulting in something along the lines of the well-known Model-View-Controller (MVC) design pattern. A Ui struct encapsulates all Cursive operations, and a Controller struct contains all application logic. Each struct contains a message queue which allows one to receive messages sent by the other. These messages are simple enums whose variants may contain associated data specific to the message type.

Instead of calling Cursive::run(), the controller will provide its own main loop where each iteration will operate as follows:

  1. The controller main loop will call Ui::step().
  2. The Ui::step() method will process any messages that the controller may have added to its message queue. These messages allow the controller to change the UI state in various ways.
  3. The Ui::step() method will then step the Cursive UI with Cursive::step(). Cursive will block until input is received. Any pending UI events will be processed and any registered callbacks will be executed. Callbacks may result in messages being posted to the controller's message queue (for example, the contents of a dialog box's form).
  4. The controller main loop will then process any messages that the UI may have added to its message queue. The controller may perform tasks related to these messages, and optionally post messages to the UI's message queue to indicate the outcome.

This scheme worked great for my needs where it's okay for the program to completely block while waiting for user input.

For the message queue, I used Rust's std::sync::mpsc (multi-producer, single consumer FIFO queue), which provides a convenient way for different code components to own a cloned Sender object which inserts elements into a shared queue. The use of mpsc is really overkill for the single-threaded applications I was working with, since any thread synchronization work being performed is wasted.

Here's an example of adapting the above text copy program to such an MVC model. It's admittedly much lengthier.

  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99100101102103104105106107108109110111112113114115116117118119120121122123124125126127128
extern crate cursive;

use cursive::Cursive;
use cursive::event::Key;
use cursive::view::*;
use cursive::views::*;
use std::sync::mpsc;

pub struct Ui {
    cursive: Cursive,
    ui_rx: mpsc::Receiver<UiMessage>,
    ui_tx: mpsc::Sender<UiMessage>,
    controller_tx: mpsc::Sender<ControllerMessage>,

pub enum UiMessage {

impl Ui {
    /// Create a new Ui object.  The provided `mpsc` sender will be used
    /// by the UI to send messages to the controller.
    pub fn new(controller_tx: mpsc::Sender<ControllerMessage>) -> Ui {
        let (ui_tx, ui_rx) = mpsc::channel::<UiMessage>();
        let mut ui = Ui {
            cursive: Cursive::new(),
            ui_tx: ui_tx,
            ui_rx: ui_rx,
            controller_tx: controller_tx,

        // Create a view tree with a TextArea for input, and a
        // TextView for output.

        // Configure a callback
        let controller_tx_clone = ui.controller_tx.clone();
        ui.cursive.add_global_callback(Key::Esc, move |c| {
            // When the user presses Escape, send an
            // UpdatedInputAvailable message to the controller.
            let input = c.find_id::<TextArea>("input").unwrap();
            let text = input.get_content().to_owned();

    /// Step the UI by calling into Cursive's step function, then
    /// processing any UI messages.
    pub fn step(&mut self) -> bool {
        if !self.cursive.is_running() {
            return false;

        // Process any pending UI messages
        while let Some(message) = self.ui_rx.try_iter().next() {
            match message {
                UiMessage::UpdateOutput(text) => {
                    let mut output = self.cursive

        // Step the UI


pub struct Controller {
    rx: mpsc::Receiver<ControllerMessage>,
    ui: Ui,

pub enum ControllerMessage {

impl Controller {
    /// Create a new controller
    pub fn new() -> Result<Controller, String> {
        let (tx, rx) = mpsc::channel::<ControllerMessage>();
        Ok(Controller {
            rx: rx,
            ui: Ui::new(tx.clone()),
    /// Run the controller
    pub fn run(&mut self) {
        while self.ui.step() {
            while let Some(message) = self.rx.try_iter().next() {
                // Handle messages arriving from the UI.
                match message {
                    ControllerMessage::UpdatedInputAvailable(text) => {

fn main() {
    // Launch the controller and UI
    let controller = Controller::new();
    match controller {
        Ok(mut controller) =>,
        Err(e) => println!("Error: {}", e),

Miscellaneous notes

  • Cursive is very much a work in progress and there are still some rough edges to be worked out. However, Alexandre Bury is lightning fast at responding to bug reports and fixing issues. One recent issue I filed went from report to patch to commit in 14 minutes.
  • It's unclear how you would develop a lightweight single-threaded program that uses reactor-style asynchronous I/O dispatch. For example, a central select() loop which dispatches stdin/stdout events to Cursive, network socket events to other code, and so on. (I'm not even sure if backends such as ncurses would even support this.)
  • I'm also not sure how I would go about structuring a multi-threaded application where the UI needs to process events from other threads. Cursive does provide a Cursive::set_fps() method which, in conjunction with Cursive::cb_sink(), can poll for new events at specified time intervals. But I've always preferred a purely event-driven design for such things instead of needlessly burning cycles periodically while waiting. (Again, there may be complications at the ncurses layer.)
  • Cursive wants callback closures to have static lifetime, which can lead to some Rust puzzles if you'd like to access non-static non-owned items from within closures. This may be inevitable, and the issue mostly goes away with the MVC decoupling technique mentioned above.

As a learning exercise, I wrote a Cursive-based interface to UPM password manager databases. However, nobody should use it for reasons outlined in its README.

Migrating from Apache Roller to Hugo and Isso

After almost ten years of using Apache Roller to power this blog, I'm making the leap to the Hugo static site generator. Roller served me well, but after years of watching Roller+Tomcat use hundreds of megabytes of memory on my server, I decided that it was overkill for my needs.1 The only major feature which absolutely demands dynamically generated pages is the comments, and I've migrated that functionally to a distinct service using Isso.

My goals are:

  • Reduce the server resource usage.
  • Allow blog posts to be created, managed, and revisioned using the same tools I use to manage software projects — Vim, Git, etc.
  • Reduce the deployment effort of the server-side software. (Like many Go apps, Hugo is a single statically linked binary with no dependencies to worry about.)

For my own future benefit, I'm providing my notes on this migration below.

Migration steps

Most of the migration process was straightforward, although a bit time consuming. I unfortunately don't have a "roller-to-hugo" script that magically converts a blog, as several categories of content assets needed to be manually adapted to the Hugo way of doing things. I can outline the basic steps, though:

  1. Create a Hugo theme. I wanted the blog to look and feel the same in Hugo as it did in Roller, so I needed to create a custom Hugo theme. I was already using custom stylesheets and Velocity templates on Roller, so simply extracting these assets from the database's webpage table into files got me most of the way there. I then needed to touch up the files to convert Velocity markup into Go templates and adapt the pagination scheme.
  2. Port blog posts. Roller stores blog entries in the weblogentry table, and their associated tags in the roller_weblogentrytag table. I wrote a one-off Python script to create Hugo content files out of this data.
  3. Port static content. This was simply a matter of finding the Roller resources directory in the filesystem, and copying it to the Hugo static/resources directory.
  4. Port RSS and Atom feeds. The current version of Hugo does not have built-in support for Atom feeds, so I used a template-generated solution as described here. I also needed to update the <head><link ... /></head> references in my templates to point to the new feed URLs.
  5. Port comments. I used the Isso comment server to support comments. Isso works similarly to Disqus, except it is self-hosted.
    1. Install Isso in a Docker container for isolation and ease of management.
    2. Map the blog's /isso URLs to Isso.
    3. Add the client HTML bits to inject the Isso comments into pages.
    4. Configure Isso: Basic configuration (dbpath, host, [server].listen), logging, SMTP notifications, moderation, and guard settings (rate limits).
    5. Import comments. I wrote a one-off Python script to import comments, paying careful attention to properly initialize the voters Bloom filter bitmask in Isso's SQLite database.
  6. Map old Roller URLs to Hugo URLs. I configured some 301 (permanent) redirects on my web server so that existing links to block posts and feed URLs will continue to work.

I have a few ugly Python scripts for migrating data from Roller to Hugo/Isso, but I'll hold off on posting them unless someone really wants to see them.

Pros and cons

Hugo pros:

  • When I first started working through the Hugo Quickstart Guide, it seemed like a lot of steps. However, after playing around with it for a while, everything seems really easy and straightforward.
  • As expected, the resource usage is low. Hugo is a single, self-contained ~6MB binary. Since static pages are generated, there is no persistent resource usage.
  • Blog posts can be composed completely offline, and tested using Hugo's built-in web server. When ready, I can push the Git commit(s) and rebuild the site on the server. Rebuilding my blog from scratch takes about 200 milliseconds.

Hugo cons:

  • No built-in support for Atom feeds. (But it's easy to add via a template.)
  • It's not obvious how trackbacks would be implemented with Hugo.
  • Dynamic web apps have the luxury of providing the correct MIME type with every document that is delivered. Since Hugo is generating static files, I now rely on the web server to determine the MIME type based on filename extensions. This may be an obstacle to preserving some URL schemes when migrating to Hugo.2 I ended up restructuring the web site to use Hugo-friendly URLs, and adding permanent redirects to map old Roller URLs to Hugo URLs.

Isso pros:

  • As a self-hosted solution, Isso avoids some of the privacy concerns that people have with third-party solutions such as Disqus.
  • Notification and moderation of comments via mail.

Isso cons:

  • Isso is a very simple, no-frills service. It accepts and regurgitates comments, but not much more.
  • There is no visible feedback when the guard rate-limits are hit, so the user doesn't receive any hint about why the comment is not being posted.
  • It doesn't seem practical to add the comment count below each entry on the main page, as I did with Roller.
  • I haven't figured out how to configure Isso to use my correct base URL in mail notifications, so I have to tweak the URLs when approving or deleting comments. (The host option seems to not be useful here.)
  • Isso seems to wake up every 500 milliseconds to do something, even when it is not being actively used:
    # strace -p 4890
    strace: Process 4890 attached
    select(5, [4], [], [], {0, 85673})      = 0 (Timeout)
    select(5, [4], [], [], {0, 500000})     = 0 (Timeout)
    select(5, [4], [], [], {0, 500000})     = 0 (Timeout)
    select(5, [4], [], [], {0, 500000})     = 0 (Timeout)
    select(5, [4], [], [], {0, 500000})     = 0 (Timeout)
    Perhaps this is a function of Werkzeug. Despite the 500ms wakeup, the CPU utilization seems to be negligible.


  1. To be fair, there are probably lots of opportunities to tweak the parameters of Tomcat and Roller to tune the resource usage. Perhaps the JVM heap size and/or the size of internal caches could be adjusted. Also, memory usage of specific services on Linux can be notoriously difficult to determine. (Top is currently showing that resident memory usage of my Tomcat server is 286MB.)
  2. I suppose someone could manually add web server configuration rules to match URL patterns to the right MIME types, but this seems needlessly manual and brittle.
  3. Isn't it fun reading through all the footnotes?
Reinventing software for security; or: The woodpeckers are coming.

The past couple of years have been tough for digital security. A few disasters and near-disasters include:

  • Heartbleed, a buffer over-read vulnerability in OpenSSL allowing unauthorized remote access to data which may contain private keys.
  • Shellshock, an issue with Bash allowing remote code execution in many varied scenarios.
  • A bug in Microsoft's SSL/TLS library (Schannel) allowing remote code execution.
  • POODLE, a flaw in the SSLv3 protocol that an attacker can leverage on many connections by forcing a protocol downgrade, or relying on certain flaws in TLS implementations.
  • Attackers' increasing boldness in targeting networks for financial gain (Target, Home Depot) or cybervandalism (Sony Pictures), resulting in hundreds of millions — or perhaps even billions — of dollars in damages.
  • A rising awareness of state-sponsored attacks, from actors such as the NSA (Regin malware), the UK's GCHQ (Belgacom attack), and North Korea (alleged perpetrator of the Sony Pictures attack).

How did our infrastructure become so fragile? How did the miracles of technology turn against us? Who is responsible for this? Regrettably, my fellow software engineers and I are largely responsible. Together, we have created this frightening new world where people's property, finances, and privacy are at risk.

“If builders built buildings the way programmers wrote programs, then the first woodpecker that came along would destroy civilization.” — Gerald Weinberg

Weinberg's famous quote about software quality points out a lack of rigor that can be seen in the software industry for decades. In the 1970's and 1980's, the nascent Internet was a more civilized place, similar to a small town where people felt comfortable leaving their front doors unlocked. Accordingly, we built software with little consideration of security. Unencrypted communication protocols like telnet would happily share your passwords with any eavesdropper, and lax security in other network services would eventually expose unexpected attack modes that were perhaps obvious only in hindsight. In the 1990's and 2000's, we wised up with better encryption, authentication, authorization, and recognition of security as an explicit engineering goal. (As far as I can tell, the first RFC with a dedicated “Security Considerations” section was RFC 1060 from March 1990.)

However, although we managed to lock the front door, we left our systems vulnerable in many other ways. Memory safety errors, unexpected consequences emerging from complexity, and numerous mundane code correctness issues provided attackers with a seemingly endless toolkit for compromising systems.

Many other engineering disciplines have the benefit of hundreds or thousands of years of accumulated wisdom that have resulted in highly refined tools and methods. Designing bridges or buildings, for example, is a well-understood process. We've only been developing software for about 60 years, and only been developing software at a large scale for maybe 30-40 years. Our field is very much still in its infancy: our tools are sorely lacking, our methods tend to be ad-hoc, and lack of experience leads us to be overconfident in our ability to produce correct code. Our products often fail to provide the basic functions expected by the user, much less withstand attacks by a thinking, creative adversary. It pains me that we've let down our employers, customers, and users by producing such flawed products.

Software development must reinvented. We need better tools and methods to build more reliable software, and an environment that values security and rewards engineers and companies for producing such software. These things are easier said than done, and I don't have all the solutions. I do know that it's time to start working on solutions. The threat level is not going down any time soon. In fact, I expect it to rise with our increased reliance on software systems and as recent high-profile attacks show the world's miscreants just how vulnerable we are.

The woodpeckers are coming.

Limitations of defensive technology

The industry's solution is to double down on defensive technology — malware scanners, firewalls, intrusion detection appliances, and similar systems. While these play an important role, it is increasingly difficult for defensive systems to shoulder the entire burden of security while an army of software engineers continues to supply a never-ending fountain of vulnerabilities. Firewalls become less effective as more software integrates firewall-bypassing communication channels with cloud services, attackers seek to exploit flaws in such software, and malware is distributed out-of-band. Malware scanners especially face tough challenges as fully metamorphic viruses are already extremely difficult to detect, and likely have a lot more opportunities for improvement than the scanners have options for improving detection.

Ultimately, software engineers are able to create security problems much faster than producers of defensive products can figure out ways to contain them. We must stop thinking of security in terms of band-aids, and address the source of the problem by developing software that is secure by design.

Attacking attack vectors with better tools and methods

We can broadly divide the attack universe into two categories:

  • Software engineering attack vectors. This includes programming issues such as memory safety and code correctness, and system design issues dealing with authentication schemes, cryptosystems, protocols, complexity management, and the user experience.
  • Other attack vectors found in system administration, configuration, networking, wiring, physical side channel emissions, passwords, social engineering, operational security, and physical security.

As a software engineer interested in improving software engineering, I'm focused on the former category. Examining a few of the recent high-profile vulnerabilities is useful for thinking about how we can approach certain attack vector categories.

Heartbleed and memory safety

“Whenever I go to and look at the latest security fixes, the vast majority of them involve memory safety issues, which only appear in unsafe languages such as C and C++.”
user54609, Information Security Stack Exchange

Memory safety issues are behind a huge chunk of vulnerabilities, such as OpenSSL's Heartbleed. Much security-sensitive code is written in low-level languages because we seek performance, minimal memory footprint, minimal dependencies, interoperability, and sometimes fine-grain control over execution. This is especially true for cryptography, where we'd like the CPU overhead to be as close to zero as possible, and avoid potential timing attacks that could arise from high-level language execution. However, developing complex systems in C and C++ can require a superhuman level of attention to detail to avoid memory errors, and even the most capable programmers seem to let such errors slip through on occasion. Although techniques exist to help minimize such errors (e.g. C++ smart pointers), it may not be possible to develop a large, complex C/C++ program with a high assurance of correct memory usage.

Fortunately, there has been much interest lately in developing new low-level languages with memory safety assurances. My favorite of these is currently Rust, which promises zero-cost memory safety by requiring that the programmer adhere to a certain memory management discipline. Rust is the most promising step toward reinventing software that I see today. If our critical low-level infrastructure was written in Rust instead of C/C++, we would be far more secure. Heartbleed would not have happened if OpenSSL was written in Rust.

Rust is still a work in progress, can be difficult to use, and even a fully mature Rust may not be the final solution. Other new languages also have merit. The Go programming language looks promising and is quite a bit more mature than Rust. However, Go's mandatory garbage collection may exclude it from certain applications, such as operating system kernels, real-time tasks, or possibly cryptography. (It's not clear to me if garbage collection can contribute to timing side channels in cipher implementations. I'd love to see some research on this.)

When it comes to memory safety bugs, the path ahead is refreshingly clear: new high-performance, low-level programming languages that prevent these bugs from happening. Unfortunately, general solutions for other classes of bugs remain murky.

Shellshock and emergent vulnerabilities

"So who's to blame? Everybody and nobody. The system is so complex that unwanted behaviours like these emerge by themselves, as a result of the way the components are connected and interact together. There is no single master architect that could've anticipated and guarded against this."
Senko Rasic on Shellshock

The Shellshock vulnerability in Bash is a great example for reminding us that some threats can be created even with the most logically consistent and memory-safe code. Writing Bash in a rigorous language such as Rust would not have prevented Shellshock from happening, nor would any amount of static analysis have revealed the problem. Shellshock arises from a feature added to Bash in 1992 for passing shell functions to child Bash processes using environment variables. The feature seems to be implemented by passing the environment variable's value directly to Bash's interpreter, as commands provided after the close of the function definition will be parsed and executed immediately. This probably seemed like a reasonable feature in 1992, but it became a devastating vulnerability when Bash became the glue tying network services to scripts (e.g. web servers to CGI scripts, or DHCP clients to hook scripts), and environment variables could suddenly contain hostile payloads, thus providing remote code execution to external parties.

It would have been nice if the troublesome feature halted interpretation at the end of the function definition, but even provisioning functions from environment variables was something that network service developers could not have anticipated. Indeed, they probably didn't anticipate the use of Bash at all — they were merely passing data to a child process in a generic fashion, and the use of Bash was often simply a result of how the system administrator or the distribution maintainer connected the pieces. Thus, Shellshock falls into an elusive category of emergent vulnerabilities that can arise in complex systems.

This class of vulnerability is particularly disturbing since most software is built around the idea of reusable modules of code, many of which may be supplied by external vendors, and connected in a vast number of combinations. We need engineering methods for dealing with this complexity, but I'm not sure exactly what these would be. Perhaps interface definitions between software components could make formal guarantees about how the passed data will be used.

Apple's “goto fail” bug and code correctness

Apple's goto fail bug, revealed in February 2014, prevented signature verification from happening properly in TLS handshakes, thus allowing man-in-the-middle attacks. The cause of the bug was a minor typo in the source code which led to unintended behavior. The program was incorrect — its behavior did not match its specification. Incorrect code can be produced by even the very best programmers, since these programmers are human beings and will occasionally make human mistakes.

Mike Bland believes that the “goto fail” bug could have been avoided by promoting a unit test culture, and Adam Langley suggests code reviews. These are both great ideas, especially for such critical code. However, I wonder if there are ways we can avoid creating these errors to begin with, instead of hoping to catch them later in a mop-up phase. Would use of functional languages like Haskell help us better express our intentions? Could formal methods and formal specifications be useful for catching such implementation errors?

POODLE and the trouble with cryptographic protocols and implementations

The POODLE attack revealed in September 2014 allows attackers to target secure connections protected with correct SSL 3.0 implementations, or TLS implementations with certain coding errors. (Although SSL 3.0 is 18 years old and seldom used in normal operation, this is still quite concerning as an attacker can use a forced downgrade attack to cause an SSL 3.0 session to be negotiated.) This reminds us that bugs can exist in protocols themselves, and cryptography can be enormously difficult to implement correctly. It's not good enough for cryptography implementations to properly encode and decode — to be secure, they must be mindful to a long list of small details involving parsing, padding, execution time (to avoid timing side channels), proper use of random number generators, and many more.

The best bits of advice I've heard about implementing cryptography are:

  • Practice extreme humility — overconfidence is the enemy of security. Know that no matter how good you are, your fresh cryptographic code is likely to have subtle problems.
  • Reuse existing cryptographic code modules whenever possible, preferably modules that have been audited, rigorously tested, and battle-hardened through their production use. As full of holes as OpenSSL is thought to be, it is probably more secure than whatever you would write to replace it. Better yet, consider opinionated toolkits such as the Sodium crypto library.
  • Seek expert assistance from professional cryptographers and security experts, when possible. There are people out there who have made it their life's work to study cryptography and its practical use, although they are probably not cheap.
  • Commission third-party security audits. When we programmers look at the same body of code for weeks at a time, we often lose the ability to view it critically. Fresh eyes can be invaluable.

The best engineering improvement I can think of is the use of domain-specific languages to specify protocols and algorithms, as this may help avoid the pitfalls of implementing cryptography in general purpose languages. I'm encouraged by projects such as Nick Mathewson's Trunnel, a binary parser generator for protocols.

Economics of secure software

“It's a valid business decision to accept the risk [of a security breach]... I will not invest $10 million to avoid a possible $1 million loss.”
— Jason Spaltro, senior vice president of information security, Sony Pictures, in a 2007 interview with CIO.

From individual consumers to the largest companies, security often seems to be valued rather low. Mr. Spaltro's unfortunate cost-benefit analysis has been mentioned often in the days since the devastating Sony Pictures attack was made public. However, I doubt his thinking was too far out of line with others at the time. In most organizations, information technology is a cost center that does not directly contribute to the bottom line, so it's understandable that companies would seek to minimize its expense. There is probably considerable temptation to underestimate the cost of breaches. This is regrettable, as even with improved engineering tools and methods, the financial investment needed to develop, audit, and deploy improved software may be quite large. I suspect companies such as Sony, Target, and Home Depot now have a better understanding of risks and may be willing to invest more money into security. Hopefully some of their security budget will include software better engineered for security, whether supplied by external vendors or developed in-house. In the end, it may take hundreds of billions or even trillions of dollars to rebuild our software foundations.

One great puzzle is figuring out how to fund the development and auditing of open-source software. Much of the technology we use every day relies on various open-source software modules under the hood, and our security relies on these modules being reliable. Additionally, the inherent auditability of open-source software makes it important for resisting attempts by governments to weaken security by coercing companies to include intentional flaws in their software. Of course, simply being open-source does not automatically make software more trustworthy. Being open-source is necessary but not sufficient. There is not an army of bored software engineers browsing through GitHub projects looking for flaws because they think it's a fun way to spend a Saturday night. With the right funding, though, we can pay qualified experts to conduct thorough audits.

I'm highly encouraged by the efforts of several groups to help fund audits and other security investigations, whether their motivations arise from their reliance on the security of the targeted software, positive public relations, self-promotion, or something else entirely. For example, the Open Crypto Audit Project is funding the necessary auditing of critical open-source projects. Although their visible efforts to date have been limited to a crowdfunded audit of TrueCrypt, Kenneth White spoke at last summer's DEFCON about their intention to begin an audit of OpenSSL funded by the Linux Foundation's Core Infrastructure Initiative, which itself is funded by a long list of big names such as Google, Intel, Microsoft, and Amazon. Such investment from stakeholders to fund security audits seems like a very reasonable approach. Likewise, Google's Project Zero is a team of security researchers tasked with improving the security of all commonly used software. Even some security consultancies are finding the time for pro bono investigations, such as with the Cryptography Services effort.

I'm optimistic about the improvement of many classes of software being driven by increased demand from businesses. Selling end users on the idea of paying for security may be a much tougher challenge in a market dominated by free advertiser-sponsored software and services (e.g. mobile apps, popular web sites, etc.). We have much more work ahead of us to construct a workable value proposition for this market.


Looking at the current state of software security and the harm of recent attacks can be a bit of a downer, but I remain optimistic that we can fix many of the problems with better engineering and better funding. What can we do to push the state of software engineering forward and create a more secure world?

  • Study new programming languages built for better memory safety without sacrificing high performance. Think about which critical software modules might be best suited for implementation in these languages, and which can be implemented in high-level languages. If you must use C++, learn the latest techniques for helping improve memory safety.
  • Develop new abstractions that may improve software reliability, such as generators for protocol handlers and cryptography algorithms.
  • Think about engineering methods that may improve code correctness, and how they can be applied to existing software development processes.
  • Develop funding mechanisms and more compelling end-user value propositions so that software engineers working on better security can be rewarded by those who value it.

I'd love to hear about any ideas you may have about making the world's software infrastructure more resilient to attack.

Battery cost of periodic mobile network use

As the consumer electronics revolution brings more and more of the digital world to handheld devices, the chief constraint developers often face is not bandwidth or CPU cycles, but rather battery life. Since many next-generation applications require creative use of the network, I decided to run a few tests to discover the true battery cost of network use in certain scenarios.

I have an interest in mesh overlay networks, and I'm curious about the cost of mesh maintenance on power-constrained devices. Therefore, these tests explore the use of relatively small network transactions performed at regular intervals. Mobile devices are known to be optimized for aggregated time-adjacent traffic, with the radio wake-up cost leading to a low ROI for small (e.g. one IP packet) transmissions. This could unfortunately be bad news for mesh maintenance, where lots of small transmissions are spread out in time.

Ilya Grigorik's High Performance Browser Networking (O'Reilly Media, 2013) goes into detail about these issues in Chapter 7 "Mobile Networks" and Chapter 8 "Optimizing for Mobile Networks". Some key insights from this work include:

  • "The "energy tails" generated by the timer-driven state transitions make periodic transfers a very inefficient network access pattern on mobile networks."
  • "Radio use has a nonlinear energy profile with respect to data transferred."
  • "Intermittent network access is a performance anti-pattern on mobile networks..."

Test methodology

Earlier this year when I was switching carriers, I found myself with a spare Android handset with LTE service enabled. Seizing the opportunity of having an activated handset that's not saddled with my usual array of chatty apps (mail, Twitter, etc.), I ran a series of tests measuring battery drain in various controlled network conditions.

The handset under test was a Samsung Galaxy Nexus running Android 4.3, equipped with the factory supplied 1850mAh battery. The network connection was provided by Sprint's LTE network. To reduce the amount of unintentional background network traffic, the device was reset to factory defaults and not associated with any Google account.

I developed an app to perform network traffic at a specified interval and record the battery level and network counters each minute. This app sends a 1400-byte UDP packet as an echo request to a cloud server, where a small Python script verifies the authenticity of the request and returns a 1400-byte UDP echo response packet. In this fashion, network traffic should be roughly balanced between upload and download, minus any occasional packet loss.

To judge the overall battery usage for an individual test with a specific network transaction frequency, I measured the time elapsed while the battery drained from 90% to 30%. (Battery usage was seen to have some aberrations above 90% and below 30%, so such data was discarded for the purpose of calculating drainage times.)


This is not a fully controlled laboratory test or representative of a broad range of devices and networks, but rather a "best-effort only" test using the equipment at hand. Thus, it's important to keep in mind a number of caveats:

  • The LTE signal strength is not guaranteed to be constant throughout the test. I tried to minimize the variation by always performing tests with the handset in the same physical location and orientation, but there are many factors out of my control. A lower signal strength requires the radio to transmit with higher power to reach the tower, so this could add noise to the data.
  • LTE is something of a black box to me, so any peculiarities of the physical and link layers are not taken into account. For example, are there conditions that may prompt the connection to shift to a different band with different transmit power requirements?
  • Other wireless providers may use LTE in different frequencies or configurations which may affect the battery usage in different ways.
  • Android's background network traffic could not be 100% silenced, and I did not go to extraordinary lengths to track down every last built-in app that occasionally uses the network. However, this unintentional traffic should be fairly negligible.
  • This test only considers one specific mobile device with one operating system. Other models will have radios with different power usage characteristics.
  • Wi-Fi use is not tested.


Note that the echo frequency above is in millihertz (mHz), not megahertz (MHz) — 1000mHz is 1 echo request/response per second.


One surprising result was the battery longevity in the control test. While most of us have grown accustomed to charging our mobile devices every day, it turns out that with minimal network activity, they can last quite a long time indeed. In this case, the Galaxy Nexus lasts almost three days while associated with an LTE tower.

As expected, the relationship between periodic network use and battery drainage is non-linear. For example, doubling the transaction frequency — say, from 128-second intervals to 64-second intervals — doesn't halve the 90-30% drain time; it only lowers it by 32%. Additionally, there seems to be a leveling out around 8-second (and shorter) intervals. Perhaps certain radio components never power down with such frequent transmissions.

Overall, the situation looks pretty grim for mobile devices being full, continuous participants in mesh overlay networks. The modest bandwidth needs of such applications are overshadowed by the battery impact of using the network in little sips throughout the day. Perhaps a system where all participants agreed to a synchronized schedule for mesh maintenance activities could mitigate the problem, but the benefits are not clear when combined with real-time mesh events instigated by remote users (say, a Kademlia node lookup).

It might be interesting to evaluate the impact of periodic network use with Wi-Fi, or investigate the techniques used by platform push systems such as Google Cloud Messaging and the Apple Push Notification service.

Highlights of DEFCON 22

The twenty-second DEFCON took over Las Vegas last week, and brought many interesting and notable speakers. I took a few notes from the talks that stood out to me, and I'm passing them along here.

Paul Vixie, Internet pioneer and DNS expert. Vixie spoke about his DNSDB project for accumulating global DNS resource records in a passive fashion, and making this information available to researchers and security product vendors. He also spoke about his DNS firewall for shielding users from malicious throwaway domain names.

Phil Zimmerman, creator of PGP and president of Silent Circle. Zimmerman spoke about wiretap overcompliance in the telecommunications industry, trust in cryptographic techniques, and his new endeavors at Silent Circle. Reading about Zimmerman's PGP efforts and the resulting drama (PGP: Pretty Good Privacy, Simson Garfinkel) is what got me interested in cryptography many years ago, so it was great to see a living legend on the stage. I did take issues with a few of his comments, though. When asked about trusting binary executables, Zimmerman mentioned the problem of distributing a binary which is identical to one that might be produced from source, due to differences in timestamps — and failed to discuss recent progress in reproducible build techniques which are meant to solve that problem. He also painted a somewhat rosy picture of the legislative attitude towards cryptography and privacy: we won the Crypto Wars in the 1990's, and cryptographic freedom can't be rolled back again now that everyone relies on it. This does not seem to be the case — last year, Congress and the administration was pushing a proposal which would effectively outlaw peer-to-peer communication systems that might be problematic to wiretap. (Thankfully, the Snowden revelations made the proposal politically toxic for now, and it has been shelved.)

Kenneth White, security researcher. White spoke about the Open Crypto Audit project which he launched along with cryptographer Matthew Green, and the drama caused by their first audit subject, TrueCrypt, being suddenly discontinued under mysterious circumstances. I've followed the progress of the Open Crypto Audit project and the ongoing news about the TrueCrypt disappearance, so there wasn't much in the talk that was new to me. It was interesting to hear that some of the biggest challenges of Open Crypt Audit were the community aspects of audit fundraising. White reported that they will finish the TrueCrypt audit in spite of the shutdown, and then move on to OpenSSL.

Dan Kaminsky, security researcher. Kaminsky scored a coveted two-hour slot in the Penn and Teller theater, which he fully used to discuss a variety of topics:

  • Secure random by default. Kaminsky argued that most vulnerabilities resulting from random number generation are not due to exotic attacks on complex algorithms, but rather gross missteps in the use and generation of randomness. For instance, some software has been observed to only effectively use 32 bits of entropy, while others employ the use of linear feedback shift registers (LFSRs) in spite of their easy cryptanalysis. Kaminsky proposes a new Liburandy library which wraps /dev/urandom when appropriate.
  • Storybits. Kaminsky invited Ryan Castellucci onto the stage to demonstrate Storybits 0.1, a new cryptomnemonic scheme for people to remember binary strings such as keys, fingerprints, secrets, etc. The system encodes the data as adjective-noun-verb tuples to make the data easier to remember, and provide error correction by way of spellcheck auto-correct.
  • Memory hardening. Convinced that improper memory usage is a major cause of vulnerabilities, Kaminsky outlined several strategies for memory-hardening applications. These include use of a typed heap (as Google does in Chrome), the use of nondeterministic freeing (as Microsoft does in Internet Explorer), and a novel approach called IronHeap where 64-bit virtual memory addresses are simply never freed (although pages may be returned for MMU reuse). He also announced the formation of a team to memory-harden Firefox, to provide added security for the Tor Browser Bundle.
  • Distributed Denial of Service (DDoS) mitigation. Kaminsky considers the rise of DDoS attacks using techniques such as datagram amplification to be an existential threat to the Internet. He proposes a new scheme of sending tracer packets within data flows to indicate when source address spoofing may be happening.
  • NSA. Kaminsky is concerned that the NSA backlash may lead to a balkanization of the Internet, as various nations opt to develop their own internal systems for core Internet services.
  • Apple bug bounties. Finally, Kaminsky is quite happy that Apple is offering bug bounties relating to Safari autoredirection.

Kaminsky's slides are available.

Ladar Levison, founder of Lavabit. Levison spoke about his proposed Dark Mail Alliance, a new electronic mail system designed to preserve the privacy of users. He began by announcing a new name for the project: DIME, the Dark Internet Mail Environment. I was a bit disappointed in the new name — "Dark" can have a sinister connotation for some people, and privacy preserving technologies should be marketed to the public with positive names reflecting the true value they provide. He should have renamed the project TIME, the Trustworthy Internet Mail Environment. Levison outlined the basic components of the system, including a server called Magma and a modified Thunderbird client called Volcano. DIME unfortunately does not provide forward secrecy for messages, although Levison pointed out that there was forward secrecy at the TLS1.2 line level. There was also talk of a pseudo-onion scheme to shield metadata and provide some small measure of anonymity, but it wasn't clear to me how this was implemented.

Adam Caudill, software developer and security researcher. In DEFCON's new Crypto Village, Caudill proposed a new secure electronic mail system called Simple Messaging and Identity Management Protocol (SMIMP). This scheme shares some of the same goals as Levison's DIME, but provides an alternative design intended to be developed in the open among the greater Internet engineering community. The most interesting thing to me was a Hashcash-like proof-of-work requirement for reducing spam.

Recent Android "Package file is invalid" errors

In the past day or so, I've been noticing these "Package file is invalid" errors on my Android devices while trying to upgrade or install certain packages from the Play Store. A bit of searching revealed that many others are having this problem, and various home remedies abound for trying to fix it, such as clearing the Play Store's app cache. Unfortunately, while these remedies may have worked for past problems that led to this error message being displayed, they are useless when trying to fix the issue people are experiencing this weekend.

I decided to do a bit of digging, and I found that Google's web servers are actually sending corrupted packages to the Play Store app. Therefore, no amount of tweaking your device will fix the problem. (Unless such tweaking happens to result in pulling packages from a different web server that doesn't have corrupted files, I suppose.)

UPDATE 2013-08-12: It appears that this problem is isolated to one or more specific servers on Google Play's content distribution network -- if your closest server has corruption, you'll always see this issue unless you move to a different network and a different server is selected. I see the problem here in Colorado, and a brief Twitter survey shows a high concentration of complaints from the U.S. Midwest and Great Lakes region. Suggestions to use a VPN have some merit -- when I VPN into Dallas, I can successfully update/install these problematic packages, because a non-corrupted server is chosen in that case. (Obviously this isn't a great solution.)

UPDATE 2013-08-13: I heard from a Google Play engineer today. It sounds like they're in the process of rolling out a fix, so our package updates and installs should be back to normal very soon!

I've observed this problem on the following devices:

  • Galaxy Nexus (Android 4.2)
  • Nexus 10 (Android 4.3)

To investigate the problem, I tried downloading the recently released Twitter 4.1.4 package, and compared the downloaded package file (temporarily stored in /data/data/ to a known good version.

A hex dump of an uncorrupted Twitter 4.1.4 package looks like this around offset 0x0200000:

01fffc0: 6e69 2067 6fcc 8872 6d65 6b2e 0028 2b42  ni go..rmek..(+B
01fffd0: 6972 2069 6e73 616e 206d c4b1 73c4 b16e  ir insan m..s..n
01fffe0: 2079 6f6b 7361 2062 6972 2062 696c 6769   yoksa bir bilgi
01ffff0: 7361 7961 7220 6dc4 b13f 000c 0c42 6f79  sayar m..?...Boy
0200000: 7574 3a20 252e 3166 6b00 0f11 4b6f 6e75  ut: %.1fk...Konu
0200010: 6d75 2064 65c4 9f69 c59f 7469 7200 0303  mu de..i..tir...
0200020: 5369 6c00 2122 2225 3124 7322 2022 2532  Sil.!""%1$s" "%2
0200030: 2473 2220 6c69 7374 6573 696e 6920 6f6c  $s" listesini ol

A hex dump of the corrupted Twitter apk looks like this around offset 0x0200000:

01fffc0: 6e69 2067 6fcc 8872 6d65 6b2e 0028 2b42  ni go..rmek..(+B
01fffd0: 6972 2069 6e73 616e 206d c4b1 73c4 b16e  ir insan m..s..n
01fffe0: 2079 6f6b 7361 2062 6972 2062 696c 6769   yoksa bir bilgi
01ffff0: 504b 0304 1400 0800 0800 e27c 0543 2d70  PK.........|.C-p
0200000: 8d5b c420 0100 986f 0200 1d00 0400 6173  .[.
0200010: 7365 7473 2f66 6f6e 7473 2f52 6f62 6f74  sets/fonts/Robot
0200020: 6f2d 4c69 6768 742e 7474 66fe ca00 00ec  o-Light.ttf.....
0200030: 9d07 7c54 55fa f74f 994c 0a21 bd00 8190  ..|TU..O.L.!....

At 16 bytes before the 2-megabyte mark, the corrupted file begins repeating the contents of the beginning of the file, including the ZIP header. It looks like a common programming error when dealing with buffered I/O streams. I first suspected that the Play Store app or the Android framework on my devices had such an error, but then I used tcpdump to examine the actual HTTP traffic as seen from my router:

22:01:25.861259 IP 74.125.x.x.80 > 192.168.x.x.39431: Flags [.], seq 2097056:2098516, ack 527, win 245, length 1460
0x0230:  2073 cca7 6966 7265 6e69 2067 6fcc 8872  .s..ifreni.go..r
0x0240:  6d65 6b2e 0028 2b42 6972 2069 6e73 616e  mek..(+Bir.insan
0x0250:  206d c4b1 73c4 b16e 2079 6f6b 7361 2062  .m..s..n.yoksa.b
0x0260:  6972 2062 696c 6769 504b 0304 1400 0800  ir.bilgiPK......
0x0270:  0800 e27c 0543 2d70 8d5b c420 0100 986f  ...|.C-p.[.....o
0x0280:  0200 1d00 0400 6173 7365 7473 2f66 6f6e  ......assets/fon
0x0290:  7473 2f52 6f62 6f74 6f2d 4c69 6768 742e  ts/Roboto-Light.
0x02a0:  7474 66fe ca00 00ec 9d07 7c54 55fa f74f  ttf.......|TU..O

Sure enough, the corruption was present in the stream as sent from Google's web server. I assume that the bug is in Google's web server code, or in some intermediate package processing step at the Play Store. Either way, we'll just have to wait for Google to fix the glitch.

Google Fiber Tourism: Plugging into the glass at the Kansas City Hacker House

While finishing up my holiday travel, I decided to stop in for a couple of days at the Kansas City Hacker House, a place for aspiring technology entrepreneurs to live and work on their projects while connected to the Google Fiber gigabit network. Unlike my previous Google Fiber experience, I had an opportunity to plug my laptop directly into the network via gigabit ethernet and run some more tests.

Legacy Tests

I first ran a few tests of legacy network usage -- uploading and downloading large files from various services.

testfile sizetimeeffective bitrate
Google Drive - upload256MB400 seconds5.3687 Mbps
Google Drive - download256MB289 seconds7.4307 Mbps
Dropbox - upload256MB31.7 seconds67.655 Mbps
Dropbox - download256MB67.6 seconds31.779 Mbps
Ubuntu 12.10 ( seconds102.86 Mbps
Ubuntu 12.10 (bittorrent)753.293MB342 seconds18.477 Mbps (peak 31.932)
Linux Mint 12 (bittorrent; 72/325 seeds)1027.468MB283 seconds30.456 Mbps

It looks like Google Drive wasn't having a good day. Dropbox, on the other hand, really screamed. (Although not as much as you might expect on a gigabit connection.) It was nice to be able to download Ubuntu in 61 seconds from a well-connected server. Bittorrent didn't perform well, though -- I suspect you'd need to be downloading a much larger file from many more seeds before you'd see Bittorrent have time to ramp up the connections and compare favorably.

All tests were performed to and from a local ramdisk, to avoid any hard drive I/O bottlenecks. However, the remote servers are likely using spinning disks that are contending with many other users. tests

The tests really aren't very useful for Google Fiber, since the servers aren't really set up for measuring high-bandwidth connections. You really end up measuring the server's capabilities and the throughput of various intermediate networks. Nevertheless, here are a couple of tests:

I tested with several other servers, and all the results varied too much to be useful.

Google Fiber Speed Test

To provide users with a reliable way of measuring the bandwidth to their home, Google provides a Google Fiber Speed Test for testing the connection from the home to a server on the Google Fiber network. (Google Fiber customers can access the server, but it doesn't appear to be accessible from the outside.)

The primary differences between Google's speed tests and the other speed tests seem to be:

  1. Google's server is located on the Google Fiber network in Kansas City, a mere 4 hops and 1.337ms of latency away from Google Fiber customers. This means that the Google Fiber Speed Test can more accurately measure the capability of a customer's last-mile link. (This also means it's perhaps less useful as a test for measuring access to resources outside of Kansas City.)
  2. The server is presumably provisioned well enough to handle tests from gigabit customers.
  3. Google's test opens a large number of simultaneous connections -- as many as 64 from my tcpdump observations. This may help with issues related to TCP window size, and possibly mitigate the negative effects of TCP congestion control should one of the connections miss a packet.

Network topology

Google Fiber has considerably increased their peering arrangements since my last visit. They seem to have good peering with the following networks that I noticed:

  • Level 3 - Chicago
  • XO - Chicago
  • Facebook (Rackspace ORD1) - Chicago
  • Inteliquent - Chicago
  • Kansas Research and Education Network (KanREN) - Kansas City
  • Level 3 - Dallas
  • Level 3 - Denver
  • Comcast (Equinix Great Oaks) - San Jose
  • Level 3 - San Jose
  • Amazon - San Francisco
  • Google - various

(Who knew that Facebook even had their own nationwide network? If you see addresses in your traceroutes, the tfbnw stands for "the facebook network".)

IPv6 seems to be functioning properly, according to various online testers. (I did have some issues reaching my 6to4-connected home network via IPv6, for some reason.)


The file transfer tests -- old-fashioned "move this big file from one hard drive on the network to some other hard drive" -- are probably not the best tests of a next-generation gigabit service such as Google Fiber. Nor are most other "download" applications. (What's the point of being able to download four seasons of Breaking Bad in 3 minutes, when it takes 30 hours to watch?) Ultimately, unlocking the true potential of home gigabit connections will rely on the development of new and interesting applications. I predict a lot of live media, immersive telepresence, and rich collaboration applications will arise from this experiment.

Thanks to Ben Barreth and the residents of the Hacker House for having me over!

1 2 3 4 5 Next page »