Memory Safety is a Broken Idea

Penned on the 2nd day of January, 2021. It was a Saturday.

The concept of memory safety is fundamentally broken and is doing untold damage to the efficacy of software engineering. This short essay breaks down exactly what went wrong with it and where, and ties the argument back to the real world using C and Rust code as evidence.

To understand why memory safety is a broken idea, it must first be set out what it is. According to Mozilla, memory safety basically means that accessing invalid memory is impossible.

When we talk about building secure applications, we often focus on memory safety. Informally, this means that in all possible executions of a program, there is no access to invalid memory.
Memory Safety, Mozilla Hacks

The website also references a formal definition outlined in a lengthy paper written by Arthur Azevedo de Amorim, Cătălin Hrițcu, and Benjamin C. Pierce. To summarise, it uses separation logic to create a “non-interference property […] ensuring that code cannot affect or be affected by unreachable memory” (page 2).

The two variants have complementary strengths and weaknesses: while the original rule applies to unsafe settings like C, but requires comprehensively verifying individual memory accesses, our variant does not require proving that every access is correct, but demands a stronger notion of separation between memory regions.
The Meaning of Memory Safety

The problem with memory safety in Rust began with the “first principles” detailed in this paper. It is conceptually erroneous insofar as it ideates memory as something animate or acting, when as a matter of fact it most certainly is not. R. Fabian said it perfectly in his book Data-Oriented Design: “Data is all we have.”

On the more practical front of implementation, this conceptual error rears its ugly head in the form of the unsafe keyword. This is a fact about Rust that its proponents often neglect to mention: Rust is memory-safe, until it isn’t. With a feature like an opt-out switch, the claim that Rust is a memory-safe language is easily falsified, and the gap has been exploited in the wild. Even Rust’s own standard library was vulnerable to remote code execution attacks for two years.

Every language library is going to have bugs, including exploitable ones. Still, it is absurd to suggest that such bugs are somehow impossible with a certain library because the author claims it is written in Rust, a supposedly “memory-safe language”. It is abundantly clear from the public body of Rust code available that Rust is no safer than anything else. This is a catastrophic failure of the language designers’ core intention, and to the extent it has been adopted, a failure of the field of software engineering.

One situation commonly cited in comparisons of Rust with C or C++ is the case where C or C++ code takes advantage of its lack of bounds checking or other runtime validation in order to be performant. This is then followed by the counterexample of a Rust program relying on its borrow checker and on the default safe subset of the language, which automatically inserts the appropriate runtime checks, to ensure that it is impossible for the program to access memory out of bounds, incurring a performance penalty. To solve this conundrum of performance and correctness, Rust provides an unsafe keyword that permits operations which skip such checks, and the reader is told to be super careful with such code. (Of course, they aren’t.) In other words, Rust hasn’t solved the core problem of memory validation; instead it provides an abstraction functionally equivalent to a garbage collector, with all compile-time guarantees confined to metaprogramming abstractions and the rest relegated to runtime.
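The trade-off described above can be sketched in a few lines of Rust. This is a minimal illustration, not a benchmark: the function names are hypothetical, while `get`, slice indexing and `get_unchecked` are the standard slice operations. The first two functions pay for a runtime bounds check; the third skips it and leaves the proof obligation with the caller.

```rust
/// Checked access: returns `None` instead of reading out of bounds.
fn checked_get(data: &[u32], i: usize) -> Option<u32> {
    data.get(i).copied()
}

/// Plain indexing is also bounds-checked at runtime; it panics on failure.
fn indexed_get(data: &[u32], i: usize) -> u32 {
    data[i]
}

/// Unchecked access: no bounds check is performed.
///
/// # Safety
/// The caller must guarantee `i < data.len()`; otherwise the behaviour
/// is undefined. Nothing in the compiler verifies this promise.
unsafe fn unchecked_get(data: &[u32], i: usize) -> u32 {
    unsafe { *data.get_unchecked(i) }
}
```

Calling `unchecked_get` requires an `unsafe` block at the call site, which is the opt-out switch in question: whether the index is actually in range is left entirely to the programmer.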

To solve this problem, a much closer look must be taken at what exactly is done in such C or C++ programs to be so performant yet well-behaved. In short, they are correct! How could they be? Well, the programmer did their best to ensure, as thoroughly as possible, that the program state remained valid. If a program always has valid state, it is correct. In languages like C, this is achieved by sheer raw hygiene, and in C++, more by plain luck. The correctness of the program is the key property here.
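As a rough sketch of what "always has valid state" can mean in practice, consider a cursor into a fixed-size ring. It is written in Rust only to keep the examples in one language; the same discipline applies to a C struct and its functions. The `Cursor` type and its methods are hypothetical. The invariant `pos < len` is established once, where the state is created, and every later operation preserves it, so no operation needs to re-check it.

```rust
/// A position in a ring of `len` slots. Invariant: `len > 0` and `pos < len`.
struct Cursor {
    pos: usize,
    len: usize,
}

impl Cursor {
    /// The only place where outside input enters, so the only place
    /// where it is validated.
    fn new(len: usize) -> Option<Cursor> {
        if len == 0 {
            None
        } else {
            Some(Cursor { pos: 0, len })
        }
    }

    /// Moves one slot forward, wrapping at the end. Because `pos < len`
    /// holds on entry, `pos + 1 <= len`, and the result is again `< len`.
    fn advance(&mut self) {
        self.pos = if self.pos + 1 == self.len { 0 } else { self.pos + 1 };
    }

    /// Current position, always a valid index into a ring of `len` slots.
    fn pos(&self) -> usize {
        self.pos
    }
}
```

The point of the design is that correctness is argued once per operation, from the invariant, rather than defended at every access.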

Notice how the C program never needed to be safe to be correct and therefore bug-free. The notion did it no favours at all, being written in a language to which the concept of safety is alien. Programs need to be correct in order to be well-behaved, and the way to achieve that is to check for correctness as intelligently and as rigorously as possible. “Safety” didn’t get people to the moon, nor did it win any wars. If software engineering is ever to be taken seriously outside of entertainment and personal computing (e.g. Tesla, the military), it needs to start taking correctness a lot more seriously.

This essay outlined the problem with memory safety in programming. For the solution involving correctness, please read my other essay titled Law & Order in Software Engineering with C*.

Until next time,
Άλέξανδερ Νιχολί