#2 Extraction in practice : div

In this section, we illustrate the use of Coq extraction on a small yet revealing example: Euclidean division amongst natural numbers.
 For sake of simplicity, we will use the unary nat datatype for representing these natural numbers:
 every number is stored as either zero or the successor S of another number.
 Even if this representation is inherently inefficient, the discussion that follows would be quite similar with more clever coding of numbers.
 Coq's Standard Library provides basic operations and relations on nat, such as +, *, <, $\leq$.
 In Coq, logical relations do not necessarily have corresponding boolean test functions, but here a result named le_lt_dec, noted $\leq ?$ afterwards, can be used as an effective comparison function for determining whether nm or m<n for any numbers n and m.

Each Coq snippet proposed below is taken verbatim from a valid session1 with Coq 8.2, including unicode notations and other syntactic improvements.

##2.1 A division that fulfills the structural constraint

One usual algorithm for division on natural numbers is to proceed by successive subtractions: div x y = 0 when x<y and div x y = S (div (x-y) y) otherwise.
But this cannot be written directly in Coq.
 Due to the intimate relationship between proofs and programs in Coq, no Coq objects may be allowed to trigger infinite computations.
 A rather drastic constraint is hence required on recursive functions in order to ensure their termination:
 they should have at least one inductive parameter such that recursive calls are done on an immediate subterm of this parameter.
 Here, our recursive call fails this criterion, since (x-y) is not an immediate subterm of x, and second parameter y has not changed.
 Even worse, trying this algorithm with y=0 leads to an infinite computation:
 Coq's rejection is here quite legitimate.

For defining nonetheless our division in Coq, a first solution is to try to live with this structural constraint, and adapt our algorithm accordingly.
 For instance:

~~~
Fixpoint div x y := match x with
| 0 => 0
| S x' =>
  let z := div x' y in
  if (S z)*y ? x then S z else z
end.
~~~

Knowing the quotient for the predecessor of x can indeed be used to infer the quotient for x.
 But proceeding this way leads to a costly test repeated x times.
This is a common situation with Coq:
 intended algorithms can be adapted to be "structural", but this may result in an awkward and/or less efficient algorithm.

Command Extraction div can then be used to convert this division to Ocaml code:

~~~
let rec div x y =
match x with
| O -> O
| S x' ->
  let z = div x' y in
  if le_lt_dec (mult (S z) y) x then S z else z
~~~

This first extracted div highlights the fact that on basic Coq functions, extraction is mainly performing a straightforward syntactic translation.
 But even on such a simple function, some proof elimination occurs during extraction.
 In fact, comparison le_lt_dec a b is not producing a mere boolean, but rather a proof-carrying boolean type {a<=b}+{b<a}, which is an inductive type internally named sumbool, with two constructors left and right both having a proof as parameter, here respectively a proof of ab and a proof of b<a.
 Extraction removes these proofs, hence obtaining an extracted sumbool datatype with two constant constructors, isomorphic to bool.
 In order to get precisely the extracted code shown above, one could then teach Coq to take advantage of this isomorphism, via : Extract Inductive sumbool => bool [ true false ].

One should note that the proof elimination done during extraction is based on earlier declarations by the user (or by the library designer).
 Here, proof-carrying boolean {a<=b}+{b<a} is exactly isomorphic to logical disjunction ab _ b<a (instead of left and right, constructors are named or introl and or intror).
Simply, the former is declared in the logical world named Prop and is pruned during extraction whereas the latter is declared in Set, the world of Coq programs, and simply loses at extraction the logical parameters of its constructors.
Similarly, two existential types coexist in Coq: the logical one 9x:A,P x and the informative one { x:A | P x }.

##2.2 A division with an explicit counter

Let's now try to implement a function closer to our intended division algorithm, instead of the ad-hoc structural version of the last section.
 A solution is to artificially add a new structurally decreasing parameter that will control the number of allowed recursive calls. Here for instance, if y6=0, it is clear that at most x successive subtractions can occur before the algorithm stops.
 A common presentation is to separate the function to iterate div_F from the actual counterbased recursive iterator div loop. The main function div is then a simple call to div loop with the right initial counter value.

~~~
Definition div_F div x y := if y <=? x then S (div (x-y) y) else 0.

Fixpoint div_loop (n:nat) :=
  match n with
  | 0 => fun _ _ => 0
  | S n => div_F (div_loop n)
  end.

Definition div x y := div_loop x x y.
~~~

One more time, extraction is straightforward and mostly amounts to replacing Coq keywords with Ocaml ones.
 The counter, whose type is nat, is kept by the extraction, even though it is morally useless for the computation.
 At the same time, removing it and replacing div loop by an unbounded loop would change the semantics of the program at least for y=0:
 with the above definition, div 5 0 computes to 5, while a Ocaml version without counter would loop forever.
As a consequence, the extraction cannot be expected to detect and remove automatically such a "useless" parameter.

Using such an explicit counter is often an interesting compromise:
 the written Coq code is not exactly what we intended in the first place, but is close to it, there is no complex internal Coq object as with the methods we will study in the next sections, computations can be done both in Coq and after extraction, and the additional cost induced by the presence of the counter is often modest.
 Here for instance the x value would have been computed anyway.
 Another example of this technique can be found in module Numtheory of the Standard Library, where a gcd function is defined on binary numbers thanks to a counter that can be the depth (i.e. the logarithm) of these binary numbers.

##2.3 A division by general recursion, historical approach

We can in fact build a Coq div function that will produce exactly the intended algorithm after extraction.
 Before presenting the modern ways of writing such a function with two frameworks recently added to Coq, let us first mention the historical approach. 
For a long time, the only possibility has been to play with accessibility predicates and induction principles such as well founded induction.
 In this case, recursive functions do satisfy the structural constraint of Coq, not via their regular arguments, but rather via an additional logical argument expressing that some quantity is accessible.
 Recursive calls can then be done on quantities that are more easily accessible than before. 
This extra logical parameter is then meant to disappear during extraction.
 In practice, non-trivial functions are impossible to write as a whole with this approach, due to the numerous logical details to provide.
 Such functions are hence built piece by piece using Coq interactive tactics, as for proofs.
 Reasoning a posteriori on the body of such functions is also next to impossible, so key properties of these functions are to be attached to their output, via post-conditions { a:A | P a }.
Pre-conditions can also be added to restrict functions on a certain domain:
 for instance, div will be defined only for y6=0.
 Here comes the complete specification of our div and its implementation in a proof-like style:

~~~
Definition div : 8x y, y <> 0 ! { z | z*y  x < (S z)*y }.
Proof.
 induction x as [x Hrec] using (well_founded_induction lt_wf).
 intros y Hy.
 destruct (y ? x) as [Hyx|Hyx]. (* do we have yx or x<y ? *)
 (* first case: yx *)
 assert (Hxy : x-y < x) by omega.
 destruct (Hrec (x-y) Hxy y Hy) as [z Hz]. (* ie: let z = div (x-y) y *)
 exists (S z); simpl in *; omega. (* ie: z+1 fits as (div x y) *)
 (* second case: x<y *)
 exists 0; omega.
Defined.
~~~

We use lt_wf, which states that < is well-founded on natural numbers.
 When combined with well founded induction, this allows us to perform recursive calls at will on any strictly smaller numbers.
 Doing such a recursive call can be quite cumbersome:
 for calling Hrec on x-y, we need to have already built a proof Hxy stating that x-y < x.
 Without additional help such as comments, it is also very tedious to keep track on the algorithm used in such a proof.
 Fortunately, extraction can do it for us:

~~~
let rec div x y =
  if le_lt_dec y x then S (div (minus x y) y) else O
~~~

##2.4 A division by general recursion with the Russell framework

The function-as-proof paradigm of the last section can be used on a relatively large scale, see for instance union and the few other non-structural operations on well-balanced trees in early versions of module FSetAVL in the Standard Library.
But such Coq functions are hardly readable and maintainable, consume lots of resources during their definitions and in practice almost always fail to compute in Coq.

Recent versions of Coq include Russell, a framework due to M. Sozeau [10] that greatly eases the design of general recursive and/or dependently-typed functions.
With this framework, bodies of functions can be written without being bothered by proof parts or by structural constraints.
 Simply, such definitions are fully accepted by Coq only when some corresponding proof obligations have been proved later on.
 For instance:

~~~
Definition id (n:nat) := n.

Program Fixpoint div (x:nat)(y:nat | y <> 0) { measure id x }
  : { z | z*y  x < (S z)*y }
  := if y <=? x then S (div (x-y) y) else 0.

Next Obligation. (* Measure decreases on recursive call : x-y < x *)
 unfold id; simpl; omega.
Qed.

Next Obligation. (* Post-condition enforcement : z*y  x < (S z)*y *)
 destruct_call div; simpl in *; omega.
Qed.
~~~

After this definition and the proofs of corresponding obligations, a Coq object div is added to the environment, mixing the pure algorithm and the logical obligations.
 This object is similar to the dependently-typed div of the previous section, and its extraction produces the very same Ocaml code.

Russell framework can be seen as a sort of anti-extraction, in the spirit of C. Parent's earlier works [9].
 Even if it is still considered as experimental, it is already quite usable.
 For instance, we have a version of FSetAVL where the aforementioned non-structural operations on well-balanced trees are written and proved using Russell.

##2.5 A division by general recursion with the Function framework
An alternative framework can also be used to define our div function:
 Function, due to J. Forest and alii [4].
 It is similar to Russell to some extent:
 algorithms can be written in a natural way, while proof obligations may have to be solved afterwards.
 Here, as for Russell, these proof obligations are trivial:

~~~
Function div (x y:nat)(Hy: y <> 0) { measure id x } : nat :=
   if y <=? x then S (div (x-y) y Hy) else 0.
Proof.
 intros; unfold id; omega.
Defined.
~~~

Moreover, as for Russell, this framework builds complex internal Coq objects, and extraction of these objects produces back precisely the expected code.
 But unlike Russell, Function is not meant to manipulate dependent types:
 in particular the y6=0 pre-condition is possible here only since it is passed unmodified to the recursive call.
 On the contrary, Function focuses on the ease of reasoning upon functions defined with it, see for instance the functional induction tactics, allowing to prove separately properties of div that would have been postconditions with Russell.
 Once again, the sensitive operations on well-balanced trees have be successfully tried and defined using Function.