Aeneas is a toolchain for verifying Rust programs, relying on a functional translation to Lean. Aeneas is now nearly five years old, and many excellent improvements have landed recently, thanks to the work of Son Ho, Aymeric Fromherz, Guillaume Boisseau and many other collaborators. With this blog post, I’m hoping to showcase how proofs in Aeneas look like these days, and highlight recent improvements, by relying on a sample problem given to me by my colleague Luca Versari. I think this blog post can also serve as a nice example / tutorial on how to work with Aeneas on a real-world piece of code.

Before going any further: more info about Aeneas is available on GitHub, and most communication happens on our Zulip – subscribe to the newsletter channel there if you’d like to stay updated. Finally, the science is in our ICFP’22 and ICFP’24 papers.

Background: jxl-rs

jxl-rs is a rewrite of the jpeg-xl codec, in Rust. For performance reasons, parts of the rewrite do rely on unsafe code. In this blog post, our goal is to show that this particular unchecked_get is safe, that is, that the index is always within bounds.

Aeneas does not have much support for unsafe code; but unchecked accesses such as the one above are modeled like regular accesses: one must show that the index is within bounds in order to show that the function terminates with a proper result (the .ok case in Aeneas). In other words, if we can prove that read produces an .ok, then we have shown that that the array access is always within bounds.

Running Aeneas

I packaged the proofs here as a toy project. To run Aeneas, we must first produce an LLBC file, that is, a cleaned up version of the MIR representation of the Rust compiler. This is done by invoking the Charon tool, passing it a --start-from flag to extract read (our target function), its transitive dependencies, and nothing else. See our paper for more info on Charon and LLBC.

The next step is to run Aeneas itself, to produce a Lean representation of the Rust code out of the LLBC. We pick the lean backend and generate multiple files, thus separating functions, types, and proofs.

Aeneas is tied to a particular version of Lean: we adopt the lean-toolchain provided by the Aeneas repository.

Missing functions

The first thing we notice is that the project does not compile (i.e. no lake build), because some Lean definitions are missing. This is because the corresponding Rust functions are missing from the LLBC. This happens if e.g. the functions live in a third-party crate, or if there is no corresponding --include flag passed to Charon, which would force extraction to traverse crate boundaries. The signatures of these functions are emitted by Aeneas as axioms in a Lean (file FunsExternal_template.lean), but no body is provided: Aeneas simply does not know the expected behavior of an external function.

The user can assert facts about an external definition; one easy way to do this is by providing a function body, in other words, by writing a trusted model of the external function. Here, we rename FunsExternal_template.lean into FunsExternal.lean, and fill out the function definition. We are now asserting that the external function read_u64 behaves like its model. The Aeneas-specific scalar_tac comes in handy to prove that the result is within bounds.

With that, and a little bit of boilerplate, the project builds. We create a Proofs.lean file and add it to the main JxlProofs.lean to make sure it is picked up by the build.

Background on how proofs work in Aeneas

Proofs in Aeneas/Lean rely on two key features.

The first feature is the monadic encoding of Aeneas, which translates a Rust function to a monadic Lean function, i.e. one that relies on the do-notation. See Funs.lean for examples of this. In short, monadic operations (denoted by a left-arrow ←) encode the possibility of failure, e.g. if an index is outside of bounds (line 47), or if a left-shift exceeds the bitwidth (line 49).

The second feature is the step tactic and corresponding step lemmas. To reason about what it takes for such a function to successfully execute (i.e., reach the .ok case and never the .panic case), the step tactic goes over each monadic operation (a shift, an index, etc.) and collects the required proof obligations along the way (that the bitwidth does not exceed the width, or that the index be within bounds). After running the step tactic, the proof engineer is left with a series of subgoals – successfully discharging all of these subgoals establishes that the function terminates without panics.

Invariants

Before we actually look at a full example of stepping and proving panic-freedom, we start in a bottom-up fashion and present our data structure invariants.

The Proofs.lean file (presented here in its final state) starts with invariants on our data structures. Some of those were present in English (as Rust comments), and some others were found by asking the author of the code for additional properties once I realized that my hypotheses were not strong enough to conduct my proofs.

For the sake of this blog post, we are only doing partial verification: in a real project, we would make sure the invariants are properly established at data structure-creation time, and properly maintained throughout all the code-paths that lead to the read function we are currently studying.

Helper lemmas

The next section contains a bunch of helper lemmas that I either could not find in Lean’s standard library, or that are simply missing. We decorate them with:

  • simp attributes, so that they are handled by simp without passing them explicitly
  • scalar_tac, to make them trigger automatically in calls to scalar_tac with a suitable pattern
  • grind_patterns, to make them available to grind, also with a suitable pattern – note that we add them to the Aeneas grindset (agrind) so that they participate in automatically discharging subgoals while calling step* – more on that below.

Again, those were found throughout the course of writing the proofs and this section grew incrementally – I am simply presenting the final state of the Proofs.lean file. One interesting tidbit is the case_usize helper, a particularly annoying lemma that states that a property true both in the 32- and the 64-bit cases is true for any usize. With the help of Claude (thanks Son!), we now have a quite concise and effective proof.

The theorems showcase some of the Aeneas tactics: bvify, bv_tac, scalar_tac, all fine-tuned to work well on the kind of goals produced by dealing with machine integers.

This is typically the kind of boring, mundane and tedious lemmas that one would do well to prove with AI. Personally, I find those supremely boring.

Proving the absence of panics

Let’s now move on to something slightly more interesting, and write our first specification.

Again, for the sake of example, we focus on the absence of panics. Our specification is thus going to be quite simple: we set out to prove Hoare triples of the form f ⦃ r => True ⦄, i.e. function f terminates with successful result r, no further properties stated (True).

A first fun example is read_u64_spec, presented here. It simply states that, provided that there are at least 8 bytes in the slice passed as an argument (the hypothesis h over the argument bytes), read_u64 terminates without panics. The proof script itself is quite simple: we unfold the definition, step over all of the monadic steps of read_u64 itself, and use the <;> combinator to apply the grind tactic on all the remaining subgoals. I highly recommended following along in your favorite editor to see the effects!

One key thing to note: read_u64_spec is itself decorated with @[step]. It means that callers of read_u64, when they use step themselves, will be able to leverage this theorem and generate a subgoal (“there are at least eight bytes in the slice passed to read_u64”) to show termination.

A second fun example is the refill_slow_loop_does_not_panic theorem, which proves that the function always terminates, no panics, and no further hypotheses required. This proof showcases a recent addition to Aeneas: reasoning about loops via combinators, rather than recursive functions. Here, to show loop termination, it suffices to find a decreasing measure, and because we only focus on the absence of panics, no further loop invariant is needed.

Proof of termination for other functions is remarkably mundane, and step* successfully chomps through the monadics steps of the Aeneas encoding. One thing to note is that step* will automatically try to discharge subgoals automatically: the subgoals you see after step* are those that could not be discharged automatically. This is where annotating our earlier lemmas with the correct attributes (agrind, etc.) pays off. Another interesting tidbit: the treatment of global const declarations in Rust was recently improved in Aeneas, and now with the proper attributes, one no longer needs dedicated simp [global_simps] calls to inline away constants: this now happens automatically.

Trick: introducing a helper on the fly

Because Aeneas works on Rust code as written by the programmer, it is sometimes the case that it would be beneficial to introduce a helper, to factor out commonality in the code that we are looking at.

Here, bucket_index is a conceptual helper that our target function (read) does not use; however, we wish to reason about it independently, for modularity of proofs. We write a step lemma for it, this time with an interesting post-condition! Then, we introduce a rewrite lemma that allows substituting a series of monadic operations that correspond to bucket_index with a call to bucket_index itself.

This is very helpful to make proofs more modular.

Final proof

The final proof is relatively concise. Of note: we duplicate the main invariant so that both its folded form and its unfolded form are available; and we rely on bucket_index_eq to materialize the helper in our goal. Many of the steps rely on grind, which is now officially supported by Aeneas.

Most of the subgoals are discharged automatically: the remaining ones rely on additional facts about bounds that can be proven using bv_tac.

The final two subgoals share a lot of commonality: the same variables are in scope and the same facts about those are needed in both cases. To share the reasoning, I used all_goals.

A partial proof

There are many more interesting things that could be proven about this example. Notably, the read function does not actually preserve the invariant: it only does so as long as we have not reached the end of the input stream. This would have to be reflected in its post-condition (omitted here), and we would have to check that all callers handle the end-of-stream state properly.

Conclusion

I hope this brief example provides a glimpse of how proofs typically work in Aeneas. With suitable annotations and judicious usage of grind, I believe those proofs occupy a sweet spot: they enjoy a high degree of automation, relying on SMT-like tactics (grind), or domain-specific tactics (bv_tac), while still reaping the benefits of interactivity, namely, being able to see the goal and debug why a proof is not going through.