
Reinventing software for security; or: The woodpeckers are coming.

The past couple of years have been tough for digital security. A few disasters and near-disasters include:

  • Heartbleed, a buffer over-read vulnerability in OpenSSL allowing unauthorized remote access to data which may contain private keys.
  • Shellshock, an issue with Bash allowing remote code execution in many varied scenarios.
  • A bug in Microsoft's SSL/TLS library (Schannel) allowing remote code execution.
  • POODLE, a flaw in the SSLv3 protocol that an attacker can leverage on many connections by forcing a protocol downgrade, or relying on certain flaws in TLS implementations.
  • Attackers' increasing boldness in targeting networks for financial gain (Target, Home Depot) or cybervandalism (Sony Pictures), resulting in hundreds of millions — or perhaps even billions — of dollars in damages.
  • A rising awareness of state-sponsored attacks, from actors such as the NSA (Regin malware), the UK's GCHQ (Belgacom attack), and North Korea (alleged perpetrator of the Sony Pictures attack).

How did our infrastructure become so fragile? How did the miracles of technology turn against us? Who is responsible for this? Regrettably, my fellow software engineers and I are largely responsible. Together, we have created this frightening new world where people's property, finances, and privacy are at risk.

“If builders built buildings the way programmers wrote programs, then the first woodpecker that came along would destroy civilization.” — Gerald Weinberg

Weinberg's famous quote about software quality points to a lack of rigor that has been evident in the software industry for decades. In the 1970's and 1980's, the nascent Internet was a more civilized place, similar to a small town where people felt comfortable leaving their front doors unlocked. Accordingly, we built software with little consideration of security. Unencrypted communication protocols like telnet would happily share your passwords with any eavesdropper, and lax security in other network services would eventually expose unexpected attack modes that were perhaps obvious only in hindsight. In the 1990's and 2000's, we wised up with better encryption, authentication, authorization, and recognition of security as an explicit engineering goal. (As far as I can tell, the first RFC with a dedicated “Security Considerations” section was RFC 1060 from March 1990.)

However, although we managed to lock the front door, we left our systems vulnerable in many other ways. Memory safety errors, unexpected consequences emerging from complexity, and numerous mundane code correctness issues provided attackers with a seemingly endless toolkit for compromising systems.

Many other engineering disciplines have the benefit of hundreds or thousands of years of accumulated wisdom that have resulted in highly refined tools and methods. Designing bridges or buildings, for example, is a well-understood process. We've only been developing software for about 60 years, and only been developing software at a large scale for maybe 30-40 years. Our field is very much still in its infancy: our tools are sorely lacking, our methods tend to be ad-hoc, and lack of experience leads us to be overconfident in our ability to produce correct code. Our products often fail to provide the basic functions expected by the user, much less withstand attacks by a thinking, creative adversary. It pains me that we've let down our employers, customers, and users by producing such flawed products.

Software development must be reinvented. We need better tools and methods to build more reliable software, and an environment that values security and rewards engineers and companies for producing such software. These things are easier said than done, and I don't have all the solutions. I do know that it's time to start working on them. The threat level is not going down any time soon. In fact, I expect it to rise with our increased reliance on software systems and as recent high-profile attacks show the world's miscreants just how vulnerable we are.

The woodpeckers are coming.

Limitations of defensive technology

The industry's solution is to double down on defensive technology — malware scanners, firewalls, intrusion detection appliances, and similar systems. While these play an important role, it is increasingly difficult for defensive systems to shoulder the entire burden of security while an army of software engineers continues to supply a never-ending fountain of vulnerabilities. Firewalls become less effective as more software integrates firewall-bypassing communication channels with cloud services, attackers seek to exploit flaws in such software, and malware is distributed out-of-band. Malware scanners face especially tough challenges: fully metamorphic viruses are already extremely difficult to detect, and their authors likely have far more room to improve evasion than the scanners have to improve detection.

Ultimately, software engineers are able to create security problems much faster than producers of defensive products can figure out ways to contain them. We must stop thinking of security in terms of band-aids, and address the source of the problem by developing software that is secure by design.

Attacking attack vectors with better tools and methods

We can broadly divide the attack universe into two categories:

  • Software engineering attack vectors. This includes programming issues such as memory safety and code correctness, and system design issues dealing with authentication schemes, cryptosystems, protocols, complexity management, and the user experience.
  • Other attack vectors found in system administration, configuration, networking, wiring, physical side channel emissions, passwords, social engineering, operational security, and physical security.

As a software engineer interested in improving software engineering, I'm focused on the former category. Examining a few of the recent high-profile vulnerabilities is useful for thinking about how we can approach certain attack vector categories.

Heartbleed and memory safety

“Whenever I go to debian.org and look at the latest security fixes, the vast majority of them involve memory safety issues, which only appear in unsafe languages such as C and C++.”
user54609, Information Security Stack Exchange

Memory safety issues are behind a huge chunk of vulnerabilities, such as OpenSSL's Heartbleed. Much security-sensitive code is written in low-level languages because we seek performance, minimal memory footprint, minimal dependencies, interoperability, and sometimes fine-grain control over execution. This is especially true for cryptography, where we'd like the CPU overhead to be as close to zero as possible, and avoid potential timing attacks that could arise from high-level language execution. However, developing complex systems in C and C++ can require a superhuman level of attention to detail to avoid memory errors, and even the most capable programmers seem to let such errors slip through on occasion. Although techniques exist to help minimize such errors (e.g. C++ smart pointers), it may not be possible to develop a large, complex C/C++ program with a high assurance of correct memory usage.

Fortunately, there has been much interest lately in developing new low-level languages with memory safety assurances. My favorite of these is currently Rust, which promises zero-cost memory safety by requiring that the programmer adhere to a certain memory management discipline. Rust is the most promising step toward reinventing software that I see today. If our critical low-level infrastructure were written in Rust instead of C/C++, we would be far more secure. Heartbleed would not have happened if OpenSSL had been written in Rust.
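
To make that claim a little more concrete, here is a deliberately simplified sketch of the Heartbleed pattern (hypothetical code, not OpenSSL's): the peer claims a payload length, and the response echoes that many bytes back. A C implementation that trusts the claimed length reads past the end of the buffer and leaks adjacent heap memory; in safe Rust, the equivalent slice operation is bounds-checked, so the malicious request is simply rejected.

```rust
// Simplified sketch of the Heartbleed pattern, not OpenSSL's actual code.
// The attacker controls `claimed_len`; a C implementation that trusts it can
// copy memory beyond the real payload. Safe Rust bounds-checks the slice, so
// the same mistake becomes a rejected request rather than an information leak.
fn build_heartbeat_response(payload: &[u8], claimed_len: usize) -> Option<Vec<u8>> {
    // `get` returns None instead of reading out of bounds.
    let echoed = payload.get(..claimed_len)?;
    Some(echoed.to_vec())
}

fn main() {
    let payload = b"hello";

    // Honest request: the claimed length matches the actual payload.
    assert_eq!(build_heartbeat_response(payload, 5), Some(b"hello".to_vec()));

    // Malicious request: claims 64 KiB but supplies only 5 bytes.
    // No out-of-bounds read occurs; the request simply fails.
    assert_eq!(build_heartbeat_response(payload, 65535), None);
}
```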

Rust is still a work in progress, can be difficult to use, and even a fully mature Rust may not be the final solution. Other new languages also have merit. The Go programming language looks promising and is quite a bit more mature than Rust. However, Go's mandatory garbage collection may exclude it from certain applications, such as operating system kernels, real-time tasks, or possibly cryptography. (It's not clear to me if garbage collection can contribute to timing side channels in cipher implementations. I'd love to see some research on this.)

When it comes to memory safety bugs, the path ahead is refreshingly clear: new high-performance, low-level programming languages that prevent these bugs from happening. Unfortunately, general solutions for other classes of bugs remain murky.

Shellshock and emergent vulnerabilities

"So who's to blame? Everybody and nobody. The system is so complex that unwanted behaviours like these emerge by themselves, as a result of the way the components are connected and interact together. There is no single master architect that could've anticipated and guarded against this."
Senko Rasic on Shellshock

The Shellshock vulnerability in Bash is a great example for reminding us that some threats can be created even with the most logically consistent and memory-safe code. Writing Bash in a rigorous language such as Rust would not have prevented Shellshock from happening, nor would any amount of static analysis have revealed the problem. Shellshock arises from a feature added to Bash in 1992 for passing shell functions to child Bash processes using environment variables. The feature seems to be implemented by passing the environment variable's value directly to Bash's interpreter, as commands provided after the close of the function definition will be parsed and executed immediately. This probably seemed like a reasonable feature in 1992, but it became a devastating vulnerability when Bash became the glue tying network services to scripts (e.g. web servers to CGI scripts, or DHCP clients to hook scripts), and environment variables could suddenly contain hostile payloads, thus providing remote code execution to external parties.
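
To see how ordinary-looking glue code exposed this, here is a rough sketch (hypothetical, using an arbitrary CGI-style variable name, not any particular server's code) of the pattern: a service copies untrusted input into an environment variable and then invokes something through Bash. The payload below is the widely circulated Shellshock test string; an unpatched Bash executes the command smuggled after the function definition while importing the variable, even though the child command never references it.

```rust
use std::process::Command;

fn main() {
    // Attacker-controlled value, e.g. an HTTP header; this is the well-known
    // Shellshock test payload: a function definition followed by a command.
    let untrusted_input = "() { :;}; echo vulnerable";

    let output = Command::new("bash")
        .arg("-c")
        .arg("echo handling request") // the child never looks at the variable
        .env("HTTP_USER_AGENT", untrusted_input)
        .output()
        .expect("failed to run bash");

    // A patched Bash prints only "handling request"; an unpatched one also
    // prints "vulnerable", demonstrating code execution via the environment.
    print!("{}", String::from_utf8_lossy(&output.stdout));
}
```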

It would have been nice if the troublesome feature halted interpretation at the end of the function definition, but even provisioning functions from environment variables was something that network service developers could not have anticipated. Indeed, they probably didn't anticipate the use of Bash at all — they were merely passing data to a child process in a generic fashion, and the use of Bash was often simply a result of how the system administrator or the distribution maintainer connected the pieces. Thus, Shellshock falls into an elusive category of emergent vulnerabilities that can arise in complex systems.

This class of vulnerability is particularly disturbing since most software is built around the idea of reusable modules of code, many of which may be supplied by external vendors, and connected in a vast number of combinations. We need engineering methods for dealing with this complexity, but I'm not sure exactly what these would be. Perhaps interface definitions between software components could make formal guarantees about how the passed data will be used.

Apple's “goto fail” bug and code correctness

Apple's goto fail bug, revealed in February 2014, prevented signature verification from happening properly in TLS handshakes, thus allowing man-in-the-middle attacks. The cause was a single duplicated “goto fail;” line that made the code jump past the final verification step while still reporting success. The program was incorrect — its behavior did not match its specification. Incorrect code can be produced by even the very best programmers, since these programmers are human beings and will occasionally make human mistakes.

Mike Bland believes that the “goto fail” bug could have been avoided by promoting a unit test culture, and Adam Langley suggests code reviews. These are both great ideas, especially for such critical code. However, I wonder if there are ways we can avoid creating these errors to begin with, instead of hoping to catch them later in a mop-up phase. Would use of functional languages like Haskell help us better express our intentions? Could formal methods and formal specifications be useful for catching such implementation errors?
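
As a hypothetical illustration (my own sketch, not Apple's code, runnable with cargo test), here is how the same shape of logic might look when each verification step is a Result chained with the ? operator, together with the kind of unit test Mike Bland has in mind. The point is not that such bugs become impossible, but that control flow and the success value can no longer drift apart the way an error flag and a goto label did, and that the “must reject a bad signature” case is trivially testable.

```rust
// Hypothetical sketch, not Apple's code. In the C version, a duplicated
// `goto fail;` jumped past the last check while the error flag still read
// "success". With Result and `?`, a step either continues or returns an
// error; there is no separate flag to fall out of sync with control flow.

#[derive(Debug, PartialEq)]
struct VerifyError;

// Stand-ins for the individual handshake verification steps.
fn hash_signed_params(_data: &[u8]) -> Result<[u8; 32], VerifyError> {
    Ok([0u8; 32])
}

fn check_signature(_hash: &[u8; 32], sig: &[u8]) -> Result<(), VerifyError> {
    if sig.is_empty() { Err(VerifyError) } else { Ok(()) }
}

fn verify_server_key_exchange(data: &[u8], sig: &[u8]) -> Result<(), VerifyError> {
    let hash = hash_signed_params(data)?; // continue only if hashing succeeded
    check_signature(&hash, sig)?;         // duplicating a line here re-runs a check,
    Ok(())                                // it cannot silently skip one
}

// The unit test that would have flagged the original bug immediately:
// a bad signature must be rejected.
#[test]
fn rejects_missing_signature() {
    assert_eq!(verify_server_key_exchange(b"params", b""), Err(VerifyError));
}
```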

POODLE and the trouble with cryptographic protocols and implementations

The POODLE attack revealed in October 2014 allows attackers to target secure connections protected with correct SSL 3.0 implementations, or TLS implementations with certain coding errors. (Although SSL 3.0 is 18 years old and seldom used in normal operation, this is still quite concerning as an attacker can use a forced downgrade attack to cause an SSL 3.0 session to be negotiated.) This reminds us that bugs can exist in protocols themselves, and cryptography can be enormously difficult to implement correctly. It's not good enough for cryptography implementations to properly encode and decode — to be secure, they must be mindful of a long list of small details involving parsing, padding, execution time (to avoid timing side channels), proper use of random number generators, and many more.
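
One of those small details is easy to show. A naive byte-by-byte comparison of, say, a received MAC returns at the first mismatch, so the response time reveals how many leading bytes were correct. Here is a minimal sketch of a constant-time comparison (illustrative only; production code should use an audited primitive, since a sufficiently clever compiler can undo a hand-rolled version):

```rust
// Compare two byte strings in time independent of where they differ.
// Illustrative only; real code should use an audited constant-time primitive.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    // Accumulate differences with OR and check once at the end, instead of
    // returning early at the first mismatched byte.
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= *x ^ *y;
    }
    diff == 0
}

fn main() {
    assert!(constant_time_eq(b"expected mac", b"expected mac"));
    assert!(!constant_time_eq(b"expected mac", b"attacker mac"));
}
```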

The best bits of advice I've heard about implementing cryptography are:

  • Practice extreme humility — overconfidence is the enemy of security. Know that no matter how good you are, your fresh cryptographic code is likely to have subtle problems.
  • Reuse existing cryptographic code modules whenever possible, preferably modules that have been audited, rigorously tested, and battle-hardened through their production use. As full of holes as OpenSSL is thought to be, it is probably more secure than whatever you would write to replace it. Better yet, consider opinionated toolkits such as the Sodium crypto library (a short example follows this list).
  • Seek expert assistance from professional cryptographers and security experts, when possible. There are people out there who have made it their life's work to study cryptography and its practical use, although they are probably not cheap.
  • Commission third-party security audits. When we programmers look at the same body of code for weeks at a time, we often lose the ability to view it critically. Fresh eyes can be invaluable.
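
As a small illustration of the second point, here is a minimal sketch of authenticated encryption using sodiumoxide, one Rust binding to the Sodium library mentioned above (my choice of crate, and the exact API names, are assumptions that may vary by version). The appeal is that key generation, nonce handling, encryption, and authentication come bundled in a few hard-to-misuse calls rather than being assembled from raw primitives.

```rust
// Minimal sketch using the sodiumoxide crate (a Rust binding to Sodium);
// add it as a dependency to build this. API details may differ by version.
use sodiumoxide::crypto::secretbox;

fn main() {
    sodiumoxide::init().expect("failed to initialize libsodium");

    let key = secretbox::gen_key();     // random 256-bit key
    let nonce = secretbox::gen_nonce(); // never reuse a nonce with the same key

    // seal() encrypts and authenticates; open() verifies before decrypting
    // and fails if the ciphertext was tampered with.
    let ciphertext = secretbox::seal(b"attack at dawn", &nonce, &key);
    let plaintext = secretbox::open(&ciphertext, &nonce, &key)
        .expect("ciphertext failed authentication");

    assert_eq!(plaintext, b"attack at dawn".to_vec());
}
```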

The best engineering improvement I can think of is the use of domain-specific languages to specify protocols and algorithms, as this may help avoid the pitfalls of implementing cryptography in general purpose languages. I'm encouraged by projects such as Nick Mathewson's Trunnel, a binary parser generator for protocols.

Economics of secure software

“It's a valid business decision to accept the risk [of a security breach]... I will not invest $10 million to avoid a possible $1 million loss.”
— Jason Spaltro, senior vice president of information security, Sony Pictures, in a 2007 interview with CIO.

From individual consumers to the largest companies, security often seems to be valued rather low. Mr. Spaltro's unfortunate cost-benefit analysis has been mentioned often in the days since the devastating Sony Pictures attack was made public. However, I doubt his thinking was too far out of line with others at the time. In most organizations, information technology is a cost center that does not directly contribute to the bottom line, so it's understandable that companies would seek to minimize its expense. There is probably considerable temptation to underestimate the cost of breaches. This is regrettable, as even with improved engineering tools and methods, the financial investment needed to develop, audit, and deploy improved software may be quite large. I suspect companies such as Sony, Target, and Home Depot now have a better understanding of risks and may be willing to invest more money into security. Hopefully some of their security budget will include software better engineered for security, whether supplied by external vendors or developed in-house. In the end, it may take hundreds of billions or even trillions of dollars to rebuild our software foundations.

One great puzzle is figuring out how to fund the development and auditing of open-source software. Much of the technology we use every day relies on various open-source software modules under the hood, and our security depends on those modules being sound. Additionally, the inherent auditability of open-source software makes it important for resisting attempts by governments to weaken security by coercing companies to include intentional flaws in their software. Of course, simply being open-source does not automatically make software more trustworthy. Being open-source is necessary but not sufficient. There is not an army of bored software engineers browsing through GitHub projects looking for flaws because they think it's a fun way to spend a Saturday night. With the right funding, though, we can pay qualified experts to conduct thorough audits.

I'm highly encouraged by the efforts of several groups to help fund audits and other security investigations, whether their motivations arise from their reliance on the security of the targeted software, positive public relations, self-promotion, or something else entirely. For example, the Open Crypto Audit Project is funding the necessary auditing of critical open-source projects. Although their visible efforts to date have been limited to a crowdfunded audit of TrueCrypt, Kenneth White spoke at last summer's DEFCON about their intention to begin an audit of OpenSSL funded by the Linux Foundation's Core Infrastructure Initiative, which itself is funded by a long list of big names such as Google, Intel, Microsoft, and Amazon. Such investment from stakeholders to fund security audits seems like a very reasonable approach. Likewise, Google's Project Zero is a team of security researchers tasked with improving the security of all commonly used software. Even some security consultancies are finding the time for pro bono investigations, such as with the Cryptography Services effort.

I'm optimistic about the improvement of many classes of software being driven by increased demand from businesses. Selling end users on the idea of paying for security may be a much tougher challenge in a market dominated by free advertiser-sponsored software and services (e.g. mobile apps, popular web sites, etc.). We have much more work ahead of us to construct a workable value proposition for this market.

Conclusion

Looking at the current state of software security and the harm of recent attacks can be a bit of a downer, but I remain optimistic that we can fix many of the problems with better engineering and better funding. What can we do to push the state of software engineering forward and create a more secure world?

  • Study new programming languages built for better memory safety without sacrificing high performance. Think about which critical software modules might be best suited for implementation in these languages, and which can be implemented in high-level languages. If you must use C++, learn the latest techniques for helping improve memory safety.
  • Develop new abstractions that may improve software reliability, such as generators for protocol handlers and cryptography algorithms.
  • Think about engineering methods that may improve code correctness, and how they can be applied to existing software development processes.
  • Develop funding mechanisms and more compelling end-user value propositions so that software engineers working on better security can be rewarded by those who value it.

I'd love to hear about any ideas you may have about making the world's software infrastructure more resilient to attack.