If I understand this correctly, it translates Rocq to C++? Took me several minutes to even understand what this is. Why is it called an extraction system? Who is this for?
I'm confused.
edit: I had to dig into the author's publication list and found the paper: https://joomy.korkutblech.com/papers/crane-rocqpl26.pdf
Testing remains a fundamental practice for building confidence in software, but it can only establish correctness over a finite set of inputs. It cannot rule out bugs across all possible executions. To obtain stronger guarantees, we turn to formal verification, and in particular to certified programming techniques that allow us to develop programs alongside mathematical proofs of their correctness. However, there is a significant gap between the languages used to write certified programs and those relied upon in production systems. Bridging this gap is crucial for bringing the benefits of formal verification into real-world software systems.
That's essentially correct. Extraction is a term of art in Rocq. A Rocq program contains both a computational part and proofs about that computation, all mixed together in the type system. Extraction is the automated process of discarding the proofs and writing out the computational component to a more conventional (and probably more efficient) programming language.
The original extractor targeted OCaml, and this is a new extractor to C++.
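For intuition, here is a toy example of my own (not from the Crane paper): a function whose type mixes computation and proof, fed through Coq's stock OCaml extractor. The OCaml shown in the final comment is roughly what the standard extractor emits for a definition of this shape.

Require Import Extraction.

(* A value packaged with a proof: the result m carries evidence that S m = n. *)
Definition safe_pred (n : nat) (H : n <> 0) : {m : nat | S m = n} :=
  match n return n <> 0 -> {m : nat | S m = n} with
  | 0 => fun H0 => False_rect _ (H0 eq_refl)
  | S m => fun _ => exist _ m eq_refl
  end H.

Extraction safe_pred.
(* The proof argument H and the equality evidence are erased; the OCaml
   output is roughly:
     let safe_pred = function O -> assert false | S m -> m *)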
Just like JavaScript folks like calling their compilers "transpilers", proof assistant folks like calling their compilers "extraction". Essentially it's a compiler from a high-level language to a slightly lower-level, but still reasonably high-level, language.
Simplifying a bit, a compiler tr(.) translates from a source language L1 to a target language L2 such that
semantics(P) == semantics(tr(P))
for all programs P in L1. In contrast, and again simplifying a bit, extraction extr(.) assumes not only languages L1 and L2 as above but, at least conceptually, also corresponding specification languages S1 and S2 (aka logics). Whenever P |= phi and extr(P, phi) = (P', phi'), then not just
semantics(P) == semantics(P')
as in compilation, but also
semantics(phi) == semantics(phi'),
hence P' |= phi'.
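As a concrete (and entirely standard, not Crane-specific) instance of that picture in Coq/Rocq: P is a plain function, phi is a lemma about it proved inside the prover, and extraction emits only P, on the understanding that the corresponding property holds of the emitted code.

Require Import List Lia Extraction.
Import ListNotations.

(* P: the program. *)
Fixpoint my_rev {A} (l : list A) : list A :=
  match l with
  | [] => []
  | x :: xs => my_rev xs ++ [x]
  end.

(* phi: a specification of P, proved once and for all. *)
Lemma my_rev_length : forall (A : Type) (l : list A),
  length (my_rev l) = length l.
Proof.
  induction l as [| x xs IH]; simpl; [reflexivity |].
  rewrite app_length, IH; simpl; lia.
Qed.

(* extr(P, phi): only the computational part is written out (here to OCaml);
   the lemma is discarded rather than translated into a target-side logic. *)
Extraction my_rev.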
I say "at least conceptually" above because the specification is often not lowered into a different logical formalism. Instead it is implied / assumed that if the extraction mechanism is correct, then the specification could also be lowered ...
I'm not entirely sure I fully agree with this definition; it seems somewhat arbitrary to me. Where is this definition from?
My usual intuition is to ask whether the generated code at the end needs a complicated runtime to replicate the source language's semantics. In Crane, we avoid that requirement with smart pointers, for example.
This definition is my potentially flawed attempt at summarising the essence of what program extraction is intended to do (however imperfect in practice).
I think extraction goes beyond 'mere' compilation; otherwise we would not need to program inside an ITP. I do agree that the state of the art does not really fully reach this platonic ideal.
I have another question: the abstract of your paper says that you "provide concurrency primitives in Rocq", but this is not really explained in the text.
What are those "concurrency primitives"?
We mean Haskell-style software transactional memory (STM). We call it a primitive because it is not defined in Rocq itself; instead, it is only exposed to the Rocq programmer through an interface.
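Roughly, think of something shaped like Haskell's STM API, declared on the Rocq side as parameters (axioms) whose meaning comes from the extracted C++ implementation. A simplified sketch of that shape (not our exact signatures):

(* Hypothetical sketch of an STM interface, modelled on Haskell's; the
   actual Crane signatures may differ. *)
Parameter IO   : Type -> Type.  (* effectful computations, also primitive *)
Parameter STM  : Type -> Type.  (* transactional computations *)
Parameter TVar : Type -> Type.  (* transactional variables *)

Parameter stmRet  : forall {A : Type}, A -> STM A.
Parameter stmBind : forall {A B : Type}, STM A -> (A -> STM B) -> STM B.

Parameter newTVar   : forall {A : Type}, A -> STM (TVar A).
Parameter readTVar  : forall {A : Type}, TVar A -> STM A.
Parameter writeTVar : forall {A : Type}, TVar A -> A -> STM unit.
Parameter retry     : forall {A : Type}, STM A.

(* Run a transaction atomically; implemented only on the C++ side. *)
Parameter atomically : forall {A : Type}, STM A -> IO A.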
I'm the other dev of Crane. Our current plan is to use BRiCk (https://skylabsai.github.io/BRiCk/index.html) to directly verify that the C++ implementation our STM primitives are extracted to matches the functional specification of STM. Having done that, we can then axiomatize the functional specification over our monadic interaction-tree interface and reason directly over the functional code in Rocq, without needing to worry about the gritty details of the C++ interpretation.
I'm not an expert in this field, but the way I understand it is that Choice Trees extend the ITree signature by adding a choice operator. Some variant of this:
ITrees:
CoInductive itree (E : Type -> Type) (R : Type) : Type :=
| Ret (r : R)
| Tau (t : itree E R)
| Vis {T : Type} (e : E T) (k : T -> itree E R).
ChoiceTrees:
CoInductive ctree (E : Type -> Type) (C : Type -> Type) (R : Type) : Type :=
| Ret (r : R)
| Tau (t : ctree E C R)
| Vis {T : Type} (e : E T) (k : T -> ctree E C R)
| Choice {T : Type} (c : C T) (k : T -> ctree E C R).
One can see the "Choice" constructor as modelling internal non-determinism, complementing the external non-determinism that ITrees already allow with "Vis", which arises from interaction with the environment. (Process calculi like CCS, CSP and the pi-calculus, as well as session types and linear logic, also make this distinction.)
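To make the distinction concrete, here is a small made-up example on top of the ctree sketch above (assuming just that definition is in scope; the event and choice signatures are invented, and I first make the constructor parameters implicit so the terms read cleanly):

Arguments Ret {E C R} r.
Arguments Vis {E C R T} e k.
Arguments Choice {E C R T} c k.

(* An external event: the environment must supply a bool. *)
Inductive inputE : Type -> Type := Input : inputE bool.

(* An internal choice: the system itself picks a bool. *)
Inductive flipC : Type -> Type := Flip : flipC bool.

(* External non-determinism: the outcome depends on the environment. *)
Definition echo (C : Type -> Type) : ctree inputE C bool :=
  Vis Input (fun b => Ret b).

(* Internal non-determinism: the system may resolve to either value. *)
Definition coin (E : Type -> Type) : ctree E flipC bool :=
  Choice Flip (fun b => Ret b).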
There are some issues arising from size inconsistencies (AKA Cantor's paradox) if/when you try to fit the representation of all internal choices (which could be infinite) into a small universe of a theorem prover's inductive types. The Choice Trees paper solves this with a specific encoding. I'm currently wondering how to port this trick from Coq/Rocq to Lean4.