Rust verification and backwards compatibility
Over the past few years, a lot of eyes in the software verification community have turned towards Rust. That’s hardly a surprise: programs written in Rust are easier to verify, owing to the language’s strong ownership discipline; the absence of undefined behaviors; and its strong notion of value. As a result, we now have a plethora of tools for Rust verification: Creusot, Prusti, Verus, and of course our very own Aeneas… not to mention tools like Kani, Gillian-Rust, and many more. In short, 2024, I believe, shall be the year of Rust verification! (And the year of Linux on the desktop, too, of course.)
This is all great and exciting, but in practice, the transition to an all-Rust ecosystem will take time and needs to happen in a gradual fashion. Neither practitioners nor researchers are going to switch entire systems, codebases and toolchains to Rust overnight. Specifically, two transitions need to happen: existing verified codebases need to be ported to Rust, and new verified Rust code needs a backwards-compatibility story for those users who can’t unconditionally adopt Rust just yet.
This blog post introduces two new projects that address the transition challenges above. The first project allows compiling the HACL* verified crypto library to safe Rust (instead of C), thus bringing old code into the new ecosystem. The second project, named Eurydice, compiles Rust to C (yes, you read that right), in order to bring new code into old ecosystems. The two projects are complementary, and connect in both directions our old, legacy toolchains with the new world in which code is authored in Rust directly.
Modernizing old codebases
HACL*, everyone’s favorite crypto library (or so I’m told), currently amounts to 160k lines of verified code. As Churchill famously said, that’s a lot of code, toil, tears and sweat. Simplifying a bit, HACL* is written in a subset of F* called Low*, for low-level F*. Low* models C memory, C concepts (machine integers, loops), and programs written in the Low* subset can be compiled to C via a dedicated compiler called KaRaMeL (“K&R meets ML”).
In spite of all the success of HACL* (parts of which have been integrated into Linux, Python, Firefox, and many more), there are two fundamental limitations to the way HACL* is currently authored.
- First, reasoning about C memory is hard, and a lot of time is wasted on mundane, boring memory reasoning, such as “these pointers are not aliased”, or “the footprint of this data structure does not overlap with that other piece of state on the heap”.
- Second, generated code remains, ultimately, unpalatable to downstream consumers, no matter how much effort you put in the quality of your parentheses.
Perhaps you, the ever optimistic reader, are thinking that it’s fine, and that these issues will be addressed for the next big project, such as verified MLS. I wish I could share your enthusiasm: if there’s one message that emerged loud and clear of the HACS workshop series, it’s that users want a pure Rust crypto library – no bindings, just safe Rust!
Perhaps you, still the optimistic reader, might think that we just designed a new toolchain, Aeneas, that aims to address these issues once and for all, and not just for crypto. Indeed, by taking Rust code written by programmers, Aeneas avoids the unsavory code-generation step; by doing a pure translation, Aeneas relieves the programmer of un-necessary memory reasoning.
But sadly, there is no world in which we have the resources, manpower and justification to rewrite all of HACL* in Rust/Aeneas. This leaves us with one option: tweak the code-generation to produce Rust code instead of C. This is exactly what Aymeric and I have been up to. The new project is called hacl-rs, and encompasses changes to both HACL* and KaRaMeL.
The translation, for the time being, can be described as follows:
- it happens after all of the compilation passes of KaRaMeL, such as monomorphization, data type elimination, and so on – eventually, we want to leverage Rust’s support for these features, but for now, the priority is to get working code
- Low*’s arrays (a.k.a. the
buffer
type) compile to mutable borrowed slices; Low*’s const arrays (a.k.a. theconst_buffer
type) compile to shared borrowed slices - as an optimization, stack allocations are initially given a Rust array type, and a type-directed translation inserts a borrow to turn it into a slice when the context expects it
- the Rust backend of KaRaMeL does a better job at choosing names and laying out files in a way that is amenable to easy cargo compilation.
Naturally, generating Rust out of Low* is not as trivial as adding a new backend to the existing KaRaMeL compiler.
The first series of problems arises on the HACL* side, which has a slightly irritating (but totally justified) pattern of “maybe in place” functions. Those functions, once compiled to Rust, are certain to trip the borrow-checker. To make things concrete, consider a simplified, albeit representative example: the addition function for fixed-size, 256-bit bignums, as expressed in Low*.
(* 256-bit bignums are represented as an array of 8 32-bit unsigned integers *)
let bignum256_t = buffer uint32_t 8
(* addition takes a pre-allocated destination bignum, and returns the carry as a
* uint32_t; the Stack annotation means that the function does not
* heap-allocate *)
val bn_add: dst:bignum256_t -> x:bignum256_t -> y:bignum256_t -> Stack uint32_t
(fun h0 ->
disjoint_or_equal dst x /\ (* <-- IMPORTANT BIT *)
disjoint x y /\
disjoint dst y)
(fun h0 r h1 ->
modifies dst h0 h1 /\
let dst_nat = as_nat h1 dst in
let x_nat = as_nat h0 x in
let y_nat = as_nat h0 y in
dst_nat == (x_nat + y_nat) % 2^256 /\
r == (x_nat + y_nat) >> 256)
let ladder ... =
...
bn_add dst foo bar;
bn_add dst dst baz;
...
This signature allows callers to potentially pass the same argument for both
x
and dst
. The reasoning behind this signature is that it allows for an
efficient, in-place series of operations where the same bignum is modified
through a sequence of operations. Alas, this is exactly the sort of pattern that
Rust disallows! The function would compile and type-check per our translation
scheme, but would implicity require its arguments to be disjoint. This means
that any call-site that passes aliased arguments will generate a borrow error.
There are ways to work around this, such as using interior mutability, but this
would be very inefficient.
Instead, there is a systematic rewriting pattern that works quite well, leveraging F*’s compile-time reduction facilities:
(* New wrapper around bn_add, for when we know dst and x are aliased *)
inline_for_extraction (* <-- IMPORTANT BIT *)
let bn_add_aliased (dst: ...) x y: ... = (* same signature as before *)
let x_copy = copy x in
bn_add dst x_copy y
let ladder ... =
...
bn_add dst foo bar;
bn_add_aliased dst dst baz;
...
The bn_add_aliased
function sports the exact same signature as its non-aliased
counterpart, meaning that call-sites remain unchanged, something we are adamant
about, since proofs at those call-sites might be fragile and not easily fixable.
Superficially, the bn_add_aliased
function still violates, once translated,
the laws of Rust’s borrow-checker. Fortunately, the inline_for_extraction
keyword means that F* reduces this definition, leaving at call-site:
let ladder ... =
...
bn_add dst foo bar;
let dst_copy = copy dst in
bn_add dst_copy dst baz
...
As far as the Rust borrow-checker is concerned, this is perfectly fine.
The second series of problems concerns code-generation, that is, KaRaMeL’s
compilation scheme. Consider the following program, which is fine in Low* (c
comes before b
, intentionally):
let ladder ... (abcd: array uint32 32) = (* four bignums side-by-side *)
let a = abcd + 0 in
let c = abcd + 16 in
let b = abcd + 8 in
let d = abcd + 24 in
...
One can’t perform arbitrary pointer arithmetic in Rust! The only primitive
available is split_at
(or split_at_mut
, depending). For these cases, we
perform a little bit of static analysis on the fly and record the history of
pointer arithmetic operations with base abcd
in a tree.
After the first operation, we record that abcd
was split at index 0.
LOW* (SOURCE) RUST (DESTINATION) TREE
let a = abcd+0 let r_a = abcd.split_at_mut(0) a @ 0
At this stage, Rust’s r_a
is a pair of slices, meaning a reference to a
in
the source code compiles to r_a.1
in Rust.
LOW* (SOURCE) RUST (DESTINATION) TREE
let a = abcd+0 let r_a = abcd.split_at_mut(0) a @ 0
let c = abcd+16 let r_c = r_a.1.split_at_mut(16) \
c @ 16
We keep going, extending the tree to keep track of the relationships between
variables. Above, in the generated Rust, r_c
is found in the right component
of the pair r_a
, which it splits at index 16
. At that program point, a
reference to a
becomes r_c.0
, while a reference to c
becomes r_c.1
. In
other words, we assume that the intent of the programmer is that the slices be
non-overlapping. This static analysis continues, and eventually yields:
LOW* (SOURCE) RUST (DESTINATION) TREE
let a = abcd+0 let r_a = abcd.split_at_mut(0) a @ 0
let c = abcd+16 let r_c = r_a.1.split_at_mut(16) \
let b = abcd+8 let r_b = r_c.0.split_at_mut(8) c @ 16
let d = abcd+24 let r_d = r_c.1.split_at_mut(8) / \
b @ 8 d @ 8
Eventually, a
, b
, c
and d
compile to r_b.0
, r_b.1
, r_d.0
and
r_d.1
respectively.
This optimistic compilation scheme, as it turns out, can handle a very large amount of cases in HACL* without requiring any modifications to the source, which is a huge win for us! Of course, this only works because the crypto code that lives in HACL* has a very specific shape and doesn’t perform general-purpose pointer arithmetic.
This scheme has several drawbacks.
- It cannot detect the case of actually overlapping slices, because pointer arithmetic operations do not come with the length of the sub-array (technically it’s erased by the time it reaches KaRaMeL). This means that there might be out-of-bounds runtime errors in the generated Rust. This, naturally, is an unpleasant property, and we plan to address it by changing the Low* extraction scheme to retain the length of subarrays. In the meanwhile, we seems to be lucky, as no such cases appear to happen in the source HACL*.
- The scheme needs to be extended in case indices are not statically-known
integers; in practice, our code can account for symbolic terms, which may be
compared (e.g.
e < e + 8
) and thus suitably related to one another in the tree. If two indices cannot be compared, the code makes the assumption that the user is splitting the array into chunks going left to right. Again, this is an unpleasant source of imprecision and we would like to rewrite the source code to get rid of all these cases. - In addition, our scheme needs to deal with the (frequent) case where the code
restarts pointer arithmetic off of the base, with difference indices. For
instance, our earlier example might be followed by
let ab = abcd + 0; let cd = abcd + 16
, meaning we need to discard the previous tree and restart the static analysis. Support for this has been recently implemented. - Finally, we need to detect the intractable case where two different uses are
interleaved (e.g. use
a
andb
, useab
, thena
andb
again). Sadly, this happens in HACL*, but should be easy to flag as an error in the code-gen before we produce Rust code.
What about Vale?
Some readers might remember that the HACL* repository also hosts the Vale
algorithms, written in a deeply-embedded Intel x64 assembly DSL. Those have
their own compiler (more of a printer, really), which generates either assembly
files (.S
or .asm
), or inline assembly headers, i.e., C headers with static
inline
functions containing __asm__
blocks, a GCC-ism that allows writing
assembly directly within C
code.
Our plan for those, currently implemented by Aymeric, is to retarget the printer
to generate Rust inline assembly syntax. This means that algorithms like
Curve25519’s 64-bit Intel ADX version, instead of generating a mixture of C and
inline ASM (as is currently done in HACL*), will generate a mixture of Rust and
Rust inline ASM. Rust and C inline assembly share many similarities; however,
Rust imposes a few additional restrictions, such as forbidding the use of the
rbx
register for input and output operands which will require small tweaks to
the verified Vale assembly.
Catering to legacy environments
As I mentioned above, Aeneas is the future! Crypto algorithms will be written in Rust by the programmer, and verification elves will confirm that the code exhibits all the required properties. But for all the excitement around Rust, a large variety of contexts still require the use of a legacy toolchain. Your project might be catering to vintage Unices from the 80s; or perhaps targets an embedded environment for which only proprietary C toolchains are available; or your management simply hasn’t reached enlightenment yet.
In spite of all of that, we still want to verify Rust, because it’s so much easier to work on a functional model rather than stateful pointer-wielding code. In order to get the best of both worlds, the Aeneas family is getting a new Greek-named tool: Eurydice. Eurydice connects to Charon, the same Rust compiler plug-in and frontend that Aeneas uses. Leveraging KaRaMeL as a library, Eurydice compiles MIR down to C.
The first challenge is to compile MIR into KaRaMeL’s internal AST. MIR is very
regular, and has a clear distinction between arrays, which represent storage
space, and pointers, which represent only a single word. The C language, on the
other hand, is much murkier when it comes to semantics: there is no way to talk
about the contents of an array, and there are no rvalues of array types –
even though x
may have type int[4]
, x
as an rvalue
always contains an
implicit address-of operator, something which is explicit in MIR. Resolving that
discrepancy requires a type-directed translation step, which happens first in
the compilation pipeline.
Then, there are other semantic discrepancies – as I mentioned above, C arrays always decay, meaning they cannot be returned by value: one uses an “outparam” instead. Conversely, in Rust, one can very naturally return an array. This means that array-returning Rust functions need to be rewritten into outparam-taking C functions, unless the array is contained within a struct, in which case C changes its mind and lets you pass the whole thing by value.
If anything, Eurydice has given me much greater appreciation for Rust’s clean semantics, and a lot more bitterness about the state of the software industry, seeing that C has been the de facto solution for so long. But I digress.
The rest of the 30 or so nano compilation passes in Eurydice are about code quality and compiling away Rust features (function and data type polymorphism, assigning arrays by value) that C does not support. Of those, half were around found in KaRaMeL (or could be reused with minimal tweaks), and half are written specifically for Eurydice.
One interesting tidbit was the compilation of slices: they compile to a C struct
(defined by hand) containing base pointer (of type void *
) and length. Every
slice-using function is naturally polymorphic in Rust; I added support in
KaRaMeL for polymorphic opaque functions, which compile in C as a function call
receiving the type argument, intended to be implemented by hand as a macro. This
means that a Rust function like copy_from_slice
is compiled generically by
Eurydice, then implemented by hand as follows:
#define core_slice___Slice_T___copy_from_slice(dst, src, t) memcpy(dst.ptr, src.ptr, dst.len * sizeof(t))
Similarly, indexing a slice with a range is compiled generically and can be
hand-implemented as a macro that receives a slice s
, a range r
and a type
t
:
#define Eurydice_slice_subslice(s, r, t) \
((Eurydice_slice){ .ptr = (void*)((t*)s.ptr + r.start), .len = r.end - r.start)
Note that we rely on C11 struct literals, so that we can have a struct as a value, and that the type is essential here in order to perform correct pointer arithmetic on a well-typed pointer.
Status
Currently, about a dozen modules from HACL-rs compile to Rust, and some like Curve25519 or Chacha-Poly have successfully been run and tested. This remains preliminary work and we hope to have a non-trivial amount of algorithms running and passing test vectors very soon.
For the other direction, we have successfully extracted and compiled a complete implementation of Kyber from Rust to C, with only minimal rewrites related to ongoing support for traits.
Performance
To everyone’s great surprise, performance has not been a concern in either direction. When compiling HACL* to Rust, the performance is within 2% of the original C, and merging those modifications that rewrite alias-taking functions sometimes even improves performance of the C code by a little bit! When compiling Kyber to C via Eurydice, there is no measurable performance difference – intuitively, it probably all looks the same to LLVM.
Wrap up
The future is bright for Rust verification, and it looks like we have a plan for transitioning our ecosystems to Rust. Join the effort, and hit us up if you’d like to help out!