A certificate-based conversion checking algorithm for Rocq?
Rocq uses an algorithm with heuristics to test convertibility of terms. While these heuristics work fine for many problems, some simple convertibility problems can take forever. Usually, these problems are between convertible terms obtained using a reduction tactic. We propose to add a convertibility checker that takes a convertibility certificate to speed up convertibility checks in such cases.
A proof whose tactics all succeed can still take forever at Qed time because of convertibility tests. Our proposed solution to this issue is to replay a computation trace instead of relying on heuristics.
Fixpoint boom n :=
  match n with
  | 0 => 0
  | S k => (if Nat.eqb n 24 then 1 else 0) + boom k + boom k
  end.

Goal 1 + (boom 50 + 1) = 1 + boom 50 + 1.
  (* 1 + (boom 50 + 1) = 1 + boom 50 + 1 *)
  (* after reduction: S (boom 50 + 1) = S (boom 50 + 1) *)
Abort.
Rocq has a notion of computation using sequences of simple term rewritings called reductions.
All intermediate terms along a sequence of rewrites are called "convertible".
The intuition behind convertibility is that, because of strong normalisation, convertible terms eventually reduce to the same normal form no matter in which order reductions are performed, and these terms can all be considered equal to that normal form.
This definition doesn't convey clearly which terms are not convertible, so I prefer the characterisation of convertible terms as terms that can be transformed into one another by a sequence of conversions (reductions or expansions).
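As a small illustration (this toy example is mine), both sides of the equation below reduce to 0, so transforming one into the other takes one reduction followed by one expansion; reflexivity succeeds because the terms are convertible:

Goal (fun x : nat => x) 0 = (fun _ : nat => 0) 1.
Proof. reflexivity. Qed.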
All convertible terms are definitionally equal, which means that for all intents and purposes they are considered the same.
For instance the following example uses 1 + 1 and 2 indistinguishably:
Definition fin_two : Fin.t 2 := Fin.F1.
Definition fin_one_plus_one : Fin.t (1 + 1) := Fin.F1.
Checking for convertibility between terms is decidable thanks to strong normalisation: reducing both terms to their normal forms and checking for equality is sufficient. This works fine for small problems:
Goal boom 3 = boom 2 + boom 2.
  (* boom 3 = boom 2 + boom 2 *)
  reflexivity.
  (* both sides normalise: 0 = 0 *)
But it breaks down in more complex cases:
Goal boom 26 = boom 25 + boom 25.
  (* boom 26 = boom 25 + boom 25 *)
  reflexivity.
  (* slow; both sides normalise: 4 = 4 *)
The example above gets slow even though the proof should be simple: unfold boom once, reduce the if, and reduce the 0 + term.
In general to validate convertibility, reducing to a common term is more efficient than reducing to a normal form.
However to validate inconvertibility, reducing to different terms is not enough as two different terms might just not be reduced to normal form.
Validating that two terms (not necessarily in normal form) are not convertible requires reducing them until the head of each term cannot be reduced anymore (is stuck); they are then not convertible if their heads don't match, or if any direct subterm of one term is not convertible with the corresponding direct subterm of the other term.
So an algorithm that checks for convertibility between two terms not necessarily in normal form, without reducing them, will either answer with the convertibility of the terms, or answer that the terms are not reduced enough to decide.
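In code, such a check could have the following three-valued shape (a hypothetical type of mine, not something from Rocq's kernel):

Inductive immediate_result :=
| Convertible       (* decided: the terms are convertible *)
| NotConvertible    (* decided: the terms are not convertible *)
| NeedsReduction.   (* the terms are not reduced enough to decide *)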
Below are examples of convertibility problems on not fully reduced terms which showcase all the possible situations (below, ~?~ indicates a convertibility problem):
Decidable without further reduction, and convertible:
fun x => fib (2 + x) ~?~ fun x => fib (2 + x)
Not decidable without further reduction (the head is stuck, but the next head is not). Who knows what the definition of fib is without reducing it?
fun x => fib (2 + x) ~?~ fun x => fib x + fib (1 + x)
Decidable without further reduction, and not convertible (a fun is definitely not a cons):
fun x => fib (2 + x) ~?~ cons (1 + 1 =? 2) nil
Let's call this convertibility check without reduction "immediate convertibility". A convertibility checking algorithm can therefore solve convertibility problems as follows: Take two terms, apply reductions until immediate convertibility is decidable, then return the result of the immediate convertibility check.
The algorithm used by Rocq does this with heuristics to decide which term to reduce, and which reduction to apply on which subterm:
Both terms are reduced to beta-zeta-iota weak head normal form (βζι-whnf, not to be confused with the normal form mentioned previously). That is, the heads of the terms are reduced using beta, zeta, and iota reductions (i.e. all except delta) until they are stuck, which means they can either be delta reduced or are truly stuck.
Then, the reduced terms are tested for immediate convertibility.
If immediate convertibility did not decide, then either:
- The heads of both terms are stuck: the algorithm is applied recursively on all the subterms that prevented the immediate convertibility check from deciding. For instance in eq termA termB ~?~ eq termC (termD termE), eq is a stuck constructor, so both termA ~?~ termC and termB ~?~ termD termE are checked.
- The head of only one of the terms is stuck: the other term is put in whnf (with delta) and the algorithm is reapplied on the resulting terms. For instance in fun x => S x ~?~ plus 1, the left term is stuck and plus isn't, so plus 1 is put in whnf.
- Neither head is stuck and they are different global names: an oracle decides which head to delta reduce, then the algorithm is reapplied on the obtained terms. For instance in fact 10 ~?~ fib 20, fact and fib are not the same, so the oracle chooses one of them to delta reduce.
- Neither head is stuck and they are the same global name: proceed as if both heads were stuck, and if any one of the subterm pairs is not convertible, proceed as if the heads weren't the same global name. For instance in fact 10 ~?~ fact 11, the heads are the same, so the convertibility of 10 ~?~ 11 is checked; they aren't convertible, so the oracle decides to reduce one of the fact. (A code sketch of this loop follows the list.)
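Here is a minimal sketch of this loop, with fuel to sidestep the termination argument and with whnf_bzi, delta_head, oracle_reduce_left and immediate_subproblems as hypothetical stand-ins for kernel operations; it illustrates the case analysis above rather than Rocq's actual code (the same-global-name fallback is omitted for brevity):

Require Import List Bool.

Section ConvSketch.
  Variable term : Type.
  Variable whnf_bzi : term -> term.            (* βζι weak head normal form *)
  Variable delta_head : term -> option term.   (* unfold the head constant, if any *)
  Variable oracle_reduce_left : term -> term -> bool.
  (* None: immediately inconvertible; Some ps: heads match, pairs ps remain to check. *)
  Variable immediate_subproblems : term -> term -> option (list (term * term)).

  Fixpoint conv (fuel : nat) (l r : term) : bool :=
    match fuel with
    | 0 => false (* out of fuel: give up *)
    | S fuel =>
        let l := whnf_bzi l in
        let r := whnf_bzi r in
        match immediate_subproblems l r with
        | Some pairs =>
            (* heads match: recurse on the blocking subterm pairs *)
            forallb (fun p => conv fuel (fst p) (snd p)) pairs
        | None =>
            (* heads differ: delta reduce a head when possible *)
            match delta_head l, delta_head r with
            | Some l', Some r' => if oracle_reduce_left l r
                                  then conv fuel l' r
                                  else conv fuel l r'
            | Some l', None => conv fuel l' r
            | None, Some r' => conv fuel l r'
            | None, None => false (* both stuck with different heads *)
            end
        end
    end.
End ConvSketch.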
This algorithm uses several tricks to try to be efficient.
The algorithm is merged with the immediate convertibility check. Without that the term would have to be traversed down again to find where the immediate convertibility check failed and to start reducing.
Both terms are reduced upfront by putting them in βζι-whnf in order to detect inconvertibility quicker.
Delta reductions are not performed upfront; if the heads are the same, the algorithm assumes it is more likely that the rest of the term prevents the immediate convertibility check from deciding.
The second and third tricks are the main heuristics the algorithm employs. They can be really effective at cutting down computations compared to fully reducing terms. Let's showcase the convertibility test algorithm by simulating it on the two examples below.
First example:
length (1 :: 2 :: 3 :: 4 :: nil) =
1 + length (2 :: 3 :: 4 :: nil)
Both terms are already in βζι-whnf. Their heads do not match, so the oracle decides to delta reduce the left head.
to_reduce := length (1 :: 2 :: 3 :: 4 :: nil) : nat
============================
to_reduce = 1 + length (2 :: 3 :: 4 :: nil)

to_reduce := (fun A : Type =>
                fix length (l : list A) : nat :=
                  match l with nil => 0 | (_ :: l')%list => S (length l') end)
               nat (1 :: 2 :: 3 :: 4 :: nil)%list : nat
============================
to_reduce = 1 + length (2 :: 3 :: 4 :: nil)
The left term is put in βζι-whnf.
to_reduce := S ((fix length (l : list nat) : nat :=
                   match l with nil => 0 | (_ :: l')%list => S (length l') end)
                  (2 :: 3 :: 4 :: nil)%list) : nat
============================
S ((fix length (l : list nat) : nat :=
      match l with nil => 0 | (_ :: l')%list => S (length l') end)
     (2 :: 3 :: 4 :: nil)%list) = 1 + length (2 :: 3 :: 4 :: nil)
The heads of the terms do not match. The left head is fully stuck, so the right term is put in whnf.
to_reduce := 1 + length (2 :: 3 :: 4 :: nil) : nat
============================
S ((fix length (l : list nat) : nat :=
      match l with nil => 0 | (_ :: l')%list => S (length l') end)
     (2 :: 3 :: 4 :: nil)%list) = to_reduce

to_reduce := S (length (2 :: 3 :: 4 :: nil)) : nat
Both heads are now the constructor S (exposed by the match), but there is no immediate convertibility between the next heads: the delta-expanded length on the left and length on the right.
The left head is fully stuck, so the right term is put in whnf.
to_reduce := S ((fix length (l : list nat) : nat :=
                   match l with nil => 0 | (_ :: l')%list => S (length l') end)
                  (2 :: 3 :: 4 :: nil)%list) : nat
============================
S ((fix length (l : list nat) : nat :=
      match l with nil => 0 | (_ :: l')%list => S (length l') end)
     (2 :: 3 :: 4 :: nil)%list) = to_reduce

S ((fix length (l : list nat) : nat :=
      match l with nil => 0 | (_ :: l')%list => S (length l') end)
     (2 :: 3 :: 4 :: nil)%list)
= S ((fix length (l : list nat) : nat :=
        match l with nil => 0 | (_ :: l')%list => S (length l') end)
       (2 :: 3 :: 4 :: nil)%list)

Abort.
The terms are immediately convertible, therefore convertible.
Second example, this time a perfect oracle is also simulated:
boom 26 = boom 25 + boom 25
Both terms are already in βζι-whnf, but their heads do not match, so the oracle decides to delta reduce left.
to_reduce := boom 26 : nat
============================
to_reduce = boom 25 + boom 25

to_reduce := (fix boom (n : nat) : nat :=
                match n with 0 => 0 | S k => (if Nat.eqb n 24 then 1 else 0) + boom k + boom k end)
               26 : nat
============================
to_reduce = boom 25 + boom 25
Then the left term is put in βζι-whnf.
to_reduce := (if Nat.eqb 26 24 then 1 else 0) +
             (fix boom (n : nat) : nat :=
                match n with 0 => 0 | S k => (if Nat.eqb n 24 then 1 else 0) + boom k + boom k end) 25 +
             (fix boom (n : nat) : nat :=
                match n with 0 => 0 | S k => (if Nat.eqb n 24 then 1 else 0) + boom k + boom k end) 25 : nat
============================
(if Nat.eqb 26 24 then 1 else 0) +
(fix boom (n : nat) : nat :=
   match n with 0 => 0 | S k => (if Nat.eqb n 24 then 1 else 0) + boom k + boom k end) 25 +
(fix boom (n : nat) : nat :=
   match n with 0 => 0 | S k => (if Nat.eqb n 24 then 1 else 0) + boom k + boom k end) 25
= boom 25 + boom 25
The head + matches, so its arguments are checked for convertibility. The implementation checks the last arguments for convertibility first, so we do the same, though the order usually doesn't matter.
There is no immediate convertibility between (delta-expanded boom) 25 and boom 25, so the algorithm checks for convertibility recursively. The first step is to put both terms in βζι-whnf (already done for boom 25).
to_reduce := (fix boom (n : nat) : nat :=
                match n with 0 => 0 | S k => (if Nat.eqb n 24 then 1 else 0) + boom k + boom k end) 25 : nat
============================
(if Nat.eqb 26 24 then 1 else 0) +
(fix boom (n : nat) : nat :=
   match n with 0 => 0 | S k => (if Nat.eqb n 24 then 1 else 0) + boom k + boom k end) 25 +
to_reduce = boom 25 + boom 25

to_reduce := (if Nat.eqb 25 24 then 1 else 0) +
             (fix boom (n : nat) : nat :=
                match n with 0 => 0 | S k => (if Nat.eqb n 24 then 1 else 0) + boom k + boom k end) 24 +
             (fix boom (n : nat) : nat :=
                match n with 0 => 0 | S k => (if Nat.eqb n 24 then 1 else 0) + boom k + boom k end) 24 : nat
============================
(if Nat.eqb 26 24 then 1 else 0) +
(fix boom (n : nat) : nat :=
   match n with 0 => 0 | S k => (if Nat.eqb n 24 then 1 else 0) + boom k + boom k end) 25 +
((if Nat.eqb 25 24 then 1 else 0) +
 (fix boom (n : nat) : nat :=
    match n with 0 => 0 | S k => (if Nat.eqb n 24 then 1 else 0) + boom k + boom k end) 24 +
 (fix boom (n : nat) : nat :=
    match n with 0 => 0 | S k => (if Nat.eqb n 24 then 1 else 0) + boom k + boom k end) 24)
= boom 25 + boom 25
The term heads do not match (+ is not boom), so the oracle decides to delta reduce right.
to_reduce := boom 25 : nat
============================
(if Nat.eqb 26 24 then 1 else 0) +
(fix boom (n : nat) : nat :=
   match n with 0 => 0 | S k => (if Nat.eqb n 24 then 1 else 0) + boom k + boom k end) 25 +
((if Nat.eqb 25 24 then 1 else 0) +
 (fix boom (n : nat) : nat :=
    match n with 0 => 0 | S k => (if Nat.eqb n 24 then 1 else 0) + boom k + boom k end) 24 +
 (fix boom (n : nat) : nat :=
    match n with 0 => 0 | S k => (if Nat.eqb n 24 then 1 else 0) + boom k + boom k end) 24)
= boom 25 + to_reduce

to_reduce := (fix boom (n : nat) : nat :=
                match n with 0 => 0 | S k => (if Nat.eqb n 24 then 1 else 0) + boom k + boom k end) 25 : nat
============================
(goal unchanged)
Then the right term is put in βζι-whnf.
to_reduce := (if Nat.eqb 25 24 then 1 else 0) +
             (fix boom (n : nat) : nat :=
                match n with 0 => 0 | S k => (if Nat.eqb n 24 then 1 else 0) + boom k + boom k end) 24 +
             (fix boom (n : nat) : nat :=
                match n with 0 => 0 | S k => (if Nat.eqb n 24 then 1 else 0) + boom k + boom k end) 24 : nat
============================
(if Nat.eqb 26 24 then 1 else 0) +
(fix boom (n : nat) : nat :=
   match n with 0 => 0 | S k => (if Nat.eqb n 24 then 1 else 0) + boom k + boom k end) 25 +
((if Nat.eqb 25 24 then 1 else 0) +
 (fix boom (n : nat) : nat :=
    match n with 0 => 0 | S k => (if Nat.eqb n 24 then 1 else 0) + boom k + boom k end) 24 +
 (fix boom (n : nat) : nat :=
    match n with 0 => 0 | S k => (if Nat.eqb n 24 then 1 else 0) + boom k + boom k end) 24)
= boom 25 +
  ((if Nat.eqb 25 24 then 1 else 0) +
   (fix boom (n : nat) : nat :=
      match n with 0 => 0 | S k => (if Nat.eqb n 24 then 1 else 0) + boom k + boom k end) 24 +
   (fix boom (n : nat) : nat :=
      match n with 0 => 0 | S k => (if Nat.eqb n 24 then 1 else 0) + boom k + boom k end) 24)
The last arguments of + are now immediately convertible, so that result is returned.
Please note that in this process of exiting the recursive call, the reductions performed on the arguments are forgotten. We will simulate it by abstracting both subterms:
(if Nat.eqb 26 24 then 1 else 0) +
(fix boom (n : nat) : nat :=
   match n with 0 => 0 | S k => (if Nat.eqb n 24 then 1 else 0) + boom k + boom k end) 25 +
convertible = boom 25 + convertible
We are back in a context where the head (+) matches, and we now know that the last arguments are convertible. The first arguments are checked next. They are not immediately convertible, so the convertibility check is called recursively here too.
Both terms are already in βζι-whnf, their heads do not match, so the oracle decides to delta reduce left.
(fix add (n m : nat) {struct n} : nat :=
   match n with 0 => m | S p => S (add p m) end)
  (if Nat.eqb 26 24 then 1 else 0)
  ((fix boom (n : nat) : nat :=
      match n with 0 => 0 | S k => (if Nat.eqb n 24 then 1 else 0) + boom k + boom k end) 25) +
convertible = boom 25 + convertible
The left term is already in βζι-whnf, its new head being Nat.eqb. This new head still doesn't match +, so the oracle decides to delta reduce left again.
to_reduce := if Nat.eqb 26 24 then 1 else 0 : nat
============================
(fix add (n m : nat) {struct n} : nat :=
   match n with 0 => m | S p => S (add p m) end) to_reduce
  ((fix boom (n : nat) : nat :=
      match n with 0 => 0 | S k => (if Nat.eqb n 24 then 1 else 0) + boom k + boom k end) 25) +
convertible = boom 25 + convertible

to_reduce := if (fix eqb (n m : nat) {struct n} : bool :=
                   match n with
                   | 0 => match m with 0 => true | S _ => false end
                   | S n' => match m with 0 => false | S m' => eqb n' m' end
                   end) 26 24
             then 1 else 0 : nat
============================
(goal unchanged)
The left subterm is put in βζι-whnf.
to_reduce := 0 : nat
============================
(fix add (n m : nat) {struct n} : nat :=
   match n with 0 => m | S p => S (add p m) end) to_reduce
  ((fix boom (n : nat) : nat :=
      match n with 0 => 0 | S k => (if Nat.eqb n 24 then 1 else 0) + boom k + boom k end) 25) +
convertible = boom 25 + convertible

(* still not in βζι-whnf, I wish I could control reduction without set/subst ;) *)

(fix add (n m : nat) {struct n} : nat :=
   match n with 0 => m | S p => S (add p m) end) 0
  ((fix boom (n : nat) : nat :=
      match n with 0 => 0 | S k => (if Nat.eqb n 24 then 1 else 0) + boom k + boom k end) 25) +
convertible = boom 25 + convertible

(if Nat.eqb 25 24 then 1 else 0) +
(fix boom (n : nat) : nat :=
   match n with 0 => 0 | S k => (if Nat.eqb n 24 then 1 else 0) + boom k + boom k end) 24 +
(fix boom (n : nat) : nat :=
   match n with 0 => 0 | S k => (if Nat.eqb n 24 then 1 else 0) + boom k + boom k end) 24 +
convertible = boom 25 + convertible
The heads do not match, but we already know that the subterms are convertible because we checked it before. The algorithm doesn't know that though, so it has to redo the work (we omit this part here). Also, in reality the oracle isn't smart enough to make the choices I presented:
Abort.

Goal boom 26 = boom 25 + boom 25.
Abort.
Here are some notes about this algorithm:
I do not know how the oracle works and did not investigate. For the purpose of this post it does not matter, so let's assume we're in the best case (i.e. the oracle makes optimal choices).
The whnf is mainly justified by the significantly reduced amount of computation needed to expose a stuck term at the head compared to other reduction strategies. The reductions included in the whnf are justified by the third heuristic. Because iota reduction is the main driving force behind recursion, one would intuitively expect it to be avoided in the whnf. Zeta and beta reductions can also make the term explode in size, so their inclusion in the whnf also mandates a justification. Well, it turns out this choice is not as arbitrary as it may seem. The heads of both terms have to be equal for the third heuristic to apply, requiring them to be compared every time the whnf is stuck. The result of this check cannot be cached as the head is immediately reduced if the test fails. So the check has to be fast. Only names can be compared quickly; removing iota from the whnf would leave a fix, removing beta would leave a lambda, removing zeta would leave a let binding, and all of these would require an expensive term comparison.
When subterms are tested for convertibility as part of the third trick and one of the tests fails, none of the intermediate convertibility results or reduced terms are kept. This leads to significant duplicate computations. In my opinion, fixing this could be a worthwhile improvement to the current algorithm.
The third heuristic can trigger many convertibility checks of inconvertible terms, which require a significant amount of computation to get to a result (even if all the computations were cached), leading to more computations than fully reducing both terms using an efficient strategy. This, together with the potentially wrong choices of the oracle, is one of the main sources of Qed taking forever.
Here is an example exploiting this flaw of the algorithm to force it to spend time in checking convertibility of inconvertible terms while the overall terms are convertible using a few reductions:
The condition to trigger the third trick of the algorithm is that both heads are equal; in this example it is apply.
Definition apply [T U] f (x: T): U := f x.
The arguments to apply will need to be inconvertible, but this check needs to take as much time as possible. For that we use iter, which uses an accumulator x to prevent exposing any constructor until n is fully computed. This function makes it possible to force the convertibility algorithm to compute anything before realising non-convertibility.
Fixpoint iter [T] x n f: T :=
match n with
| O => x
| S n => iter (f x) n f
end.
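As a small sanity check (mine, not from the original development), iter starts from the accumulator and applies f n times, so no constructor shows up before n is consumed:

Eval compute in iter 0 3 S.
(* = 3 : nat *)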
Finally, the actual convertibility problem. We trick the convertibility algorithm into deciding the convertibility of either boom 26 with boom 25, or fun _ => apply (fun n => iter O n) (boom 25) with fun n => iter O n. The first convertibility check cannot be decided before fully computing boom 25; the second one cannot be decided before fully computing apply (fun n => iter O n) (boom 25), which uses iter to hide the head constructor behind the full computation of boom 25. So in both cases it takes forever, "forever" being defined as the time it takes to compute boom 25.
apply
(fun _ : nat =>
apply (fun n : nat => iter 0 n) (boom 25))
(boom 26) =
apply (fun n : nat => iter 0 n) (boom 25)
Before showing that the algorithm takes forever, we will first show that these terms are actually convertible by unfolding the first apply and beta reducing:
p := apply : forall T U : Type, (T -> U) -> T -> U
============================
p nat ((nat -> nat) -> nat)
  (fun _ : nat => apply (fun n : nat => iter 0 n) (boom 25)) (boom 26)
= apply (fun n : nat => iter 0 n) (boom 25)

p := fun (T U : Type) (f : T -> U) (x : T) => f x
     : forall T U : Type, (T -> U) -> T -> U
============================
(goal unchanged)

(fun (T U : Type) (f : T -> U) (x : T) => f x) nat ((nat -> nat) -> nat)
  (fun _ : nat => apply (fun n : nat => iter 0 n) (boom 25)) (boom 26)
= apply (fun n : nat => iter 0 n) (boom 25)

reflexivity. (* and that's it *)

apply (fun n : nat => iter 0 n) (boom 25) = apply (fun n : nat => iter 0 n) (boom 25)
Can the convertibility algorithm figure this out? The answer is obviously no: we built this example problem for this exact purpose.
Abort.
Before proposing an alternative, I would like to re-clarify a point. There are instances of convertibility problems for which the convertibility of the two terms is unknown. For these problems we need an algorithm able to make reduction guesses to efficiently check convertibility. However, in most cases convertibility is already known, so there is no reason to guess from scratch how to reduce to a common term. The only thing that should have to be done is checking that the untrusted thing which produced the convertibility problem was indeed correct in its result. So we only need a certificate checker. Then we can remove the convertibility checking algorithm from the kernel and make it produce a certificate, leaving only the certificate checker in the kernel. This in turn would allow experimenting with potentially better but untrusted convertibility checkers with different heuristics. This is our proposal to solve almost all slow and never-ending Qeds.
Proposal
Because two convertible terms can be transformed into one another by a sequence of conversions, a certificate of convertibility could (as one option) be that sequence of conversions. A certificate checker would only have to replay the transformation on the first term and verify that the obtained term is the second one. The representation of the conversions is important for the checker to be able to replay them. One could imagine, for instance, representing the certificate as a sequence of terms. I raise this possibility as it would not require modifying Rocq's core, only adding cast nodes in the proof term when running reduction tactics. Each term would have to be checked for convertibility with the next one, so we'd have to be careful when picking the cast nodes to avoid slow convertibility problems. The usual reduction tactics are unlikely to go through such a sequence of terms, and finding this sequence would be complicated. (In fact, I highly doubt that it is possible at all.) Another way to represent conversions is to make explicit which conversion to apply and where in the term. The certificate checker would then only need to go to the specified place and apply the conversion, not relying on anything untrusted or potentially taking forever.
Conversions are made of reductions and their inverse, expansions.
Given a term and a location, a given kind of reduction usually has a single resulting term, while expansions have many.
This means that representing an expansion requires more than a location and the name of the kind of expansion.
This information is usually as big as the term being expanded, meaning that the size of a sequence of n expansions on a term of size m is on the order of n × m. In contrast, specifying the resulting term and the inverse of the sequence of expansions gives a certificate of size on the order of n + m (for instance, 100 expansions on a term of size 1000 cost about 100 000 units one way, and about 1 100 the other).
So instead of a sequence of conversions, a certificate of convertibility between t[0] and t[n] is better represented as a sequence of triplets (t[1], rl[1], rr[1]), ..., (t[n], rl[n], rr[n]) where for each k, t[k] is a term, and rl[k] and rr[k] are sequences of reductions. Starting from term t[0], getting to term t[n] is done by successively reducing t[k-1] and t[k] to a common term using reduction sequences rl[k] and rr[k] respectively at step k.
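As a sketch, such a certificate and its checker could look as follows, with term, reduction, apply_red and term_eqb as hypothetical placeholders (in reality the final comparison at each step would be the immediate convertibility check rather than a syntactic equality):

Require Import List Bool.
Import ListNotations.

Section CertSketch.
  Variables term reduction : Type.
  Variable apply_red : reduction -> term -> option term. (* replay one reduction *)
  Variable term_eqb : term -> term -> bool.

  (* One certificate step (t[k], rl[k], rr[k]). *)
  Record conv_step := {
    target : term;        (* t[k] *)
    rl : list reduction;  (* reduces the previous term t[k-1] *)
    rr : list reduction   (* reduces t[k] to the same common term *)
  }.

  Definition replay (rs : list reduction) (t : term) : option term :=
    fold_left (fun acc r => match acc with
                            | Some u => apply_red r u
                            | None => None
                            end) rs (Some t).

  Definition check_step (prev : term) (s : conv_step) : bool :=
    match replay (rl s) prev, replay (rr s) (target s) with
    | Some u, Some v => term_eqb u v
    | _, _ => false
    end.

  (* Checking t[0] ~ t[n]: replay the steps in order. *)
  Fixpoint check_cert (t : term) (steps : list conv_step) (goal : term) : bool :=
    match steps with
    | [] => term_eqb t goal
    | s :: rest => check_step t s && check_cert (target s) rest goal
    end.
End CertSketch.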
Instead of storing pairs of reduction sequences, it is possible to use a single sequence of reductions and a mark indicating which term to reduce to the other.
However, it requires writing more terms, including the intermediate terms to which each t[k] reduces in the triplet representation. These intermediate terms tend to be bigger because of delta reduction, so we favour the triplet representation.
This way of representing convertibility problems basically replaces one arbitrary sequence of conversions with a sequence of convertibility problems where only reductions are allowed.
Rocq can already represent a sequence of convertibility problems with nested casts.
The only part left to figure out is how to represent a sequence of reductions.
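As a quick illustration of those nested casts (a toy example of mine, not the actual certificate format), each type ascription below triggers one convertibility check between 1 + 1 and 2:

Definition nested_casts : 2 = 2 := ((eq_refl : 1 + 1 = 2) : 2 = 2).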
It was mentioned above that, given a term and a location, a given kind of reduction "usually" has a single resulting term. "Usually" has one exception in Rocq: in match patterns, variables can bind to the value set by let bindings in a type definition, as shown by the example below.
Because of this, zeta reduction, which usually eliminates let bindings, can also be applied to a match. This zeta reduction on a match does not eliminate the let binding, and does not modify the term at all if the bound variable is not used.
This variant of the zeta reduction needs to be provided with a way to identify the target let binding.
Even though one could expect this to be purely syntactic (it could be), it is actually part of kernel terms and cannot be ignored. [1]
The zeta rule is also an exception (although not the only one) to a nice property which could have further simplified the representation of reductions. In the calculus of constructions, for any given subterm there is at most one reduction possible at the root of the subterm. The exception to that property is eta reduction, which doesn't matter because the immediate convertibility check can be modified to infer eta expansion. If that property held in Rocq, a sequence of locations would have been enough to describe a sequence of reductions. Unfortunately, Rocq also features rewrite rules, an experimental extension allowing arbitrary rewrites to be expressed, without much constraint on the rewrite itself or on its impact on reduction and typing. So a reduction is a pair of a location and a reduction kind, with additional data for the zeta reduction. The final missing part of the picture is how to represent a location in a term.
Inductive VLB := vlb: let vlbI := I in VLB.
Record RLB := rlb { rlbI := I }.

Definition match_vlbI := match vlb with | vlb x => x end.
Definition match_rlbI := match rlb with | rlb x => x end.
(* more usual syntax *)
Definition match_rlbI' := match rlb with | {| rlbI := x |} => x end.
Terms are trees, so a location in a term can be expressed as a location in a tree. However, there is a more efficient representation relying on how common reduction strategies operate. Common reduction strategies don't move randomly in a term; quite the opposite, they tend to focus on a subterm, do several reductions at the same place, and once they unfocus the subterm, they do not focus on it again. So overall they change location less often than they reduce, and these changes in location are very local, which makes full tree locations wasteful compared to relative movements. A sequence of reductions is then a sequence of either movements toward the root, movements toward a leaf, or reduction kinds with additional data for the zeta reduction. This sequence is easy to replay with a zipper, and recording it is a matter of recording a trace.
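In code, such a trace could look like the following sketch (illustrative types of mine, not Rocq's kernel ones):

Inductive red_kind :=
| RBeta
| RDelta
| RIota
| REta
| RZeta (target_let : nat)  (* which let binding, cf. the match case above *)
| RRule (rule_id : nat).    (* the experimental rewrite rules *)

Inductive trace_item :=
| Up                        (* move toward the root *)
| Down (child : nat)        (* move toward a leaf, into the given child *)
| Reduce (k : red_kind).    (* apply a reduction at the current focus *)

Definition reduction_trace := list trace_item.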
To record a reduction trace, the zipper needs to be unalterable except through its API. One way to achieve this is inversion of control: the outer reducing function that calls zipper functions becomes a callback that returns a value telling the zipper which functions to call on itself. The reduction function can no longer arbitrarily (and potentially incorrectly) transform the zipper, so the reduction algorithm is necessarily correct. This loss of freedom means the set of reductions has to be carefully crafted to allow reductions similar to the ones performed by optimised reduction algorithms, for instance substituting a single variable whose value is available from the context instead of performing a full beta or zeta reduction. Producing the certificate is then a matter of defining a callback which takes an arbitrary reduction callback and records its return values before passing them unchanged to the zipper. The sequence of reductions may not even be needed if the reduction strategy is provided as a callback: the zipper cannot perform unsound actions, so as long as immediate convertibility is able to decide once the reduction is done, how the reduction is controlled doesn't matter.
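Here is a minimal sketch of this inversion of control, with view, zipper, focus_view and step as hypothetical placeholders for the real zipper API:

Section IoCSketch.
  Variables view zipper : Type.
  Variable focus_view : zipper -> view. (* read-only view of the focused subterm *)

  Inductive action :=
  | AUp
  | ADown (child : nat)
  | AReduce (kind : nat)  (* stand-in for a reduction kind *)
  | AStop.

  (* The zipper applies actions on itself and rejects unsound ones. *)
  Variable step : action -> zipper -> option zipper.

  (* Driving a strategy callback reduces the term and yields the trace. *)
  Fixpoint run (fuel : nat) (strat : view -> action) (z : zipper)
    : list action * zipper :=
    match fuel with
    | 0 => (nil, z)
    | S fuel =>
        match strat (focus_view z) with
        | AStop => (nil, z)
        | a => match step a z with
               | Some z' => let '(tr, zf) := run fuel strat z' in (a :: tr, zf)
               | None => (nil, z) (* unsound action refused by the zipper *)
               end
        end
    end.
End IoCSketch.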
We hope that we have convinced you of the feasibility and advantages of a certificate-based conversion checking algorithm over a guess-based one.