Automatically generated goal names

Adding automatically generated goal names to the Rocq prover.

Display all goals and responses

If you're ever written small to moderately large Rocq developments, it's quite likely that you have experienced the following problem:

You write your proofs using induction or case analysis, carefully making sure to list all (important) cases one by one.
Later, as your project naturally grows, you decide to add a new case (a new typing rule, a new kind of expression, etc.).
Suddenly, all proofs that you've written break, usually with no clear indication as to where and why the proof broke.

If this sounds familiar, you've bumped into one of the current pitfalls of goal selection: goals selectors such as bullets and numbers do not match constructors one-to-one (they lack a semantic connection to the constructor they're supposed to apply to).

I'll show in this post how we can improve the experience by automatically generating goal names, yielding a robust and future-proof method of selecting goals.

The problem

Let us build a small example to show that the problem arises even in minimalistic conditions. Consider the following inductive data type that represents a choice between 2 propositions:

Inductive Choice (A B: Prop): Prop :=
| ChoiceA : A -> Choice A B
| ChoiceB : B -> Choice A B.

This inductive looks a lot like the usual disjunction, and indeed it's possible to prove that Choice A B implies A \/ B:

Theorem choice_or : forall A B, Choice A B -> A \/ B.forall A B : Prop, Choice A B -> A \/ B
Proof.forall A B : Prop, Choice A B -> A \/ B
  intros A B HChoice.A, B: Prop
HChoice: Choice A B
A \/ B
  destruct HChoice.A, B: Prop
H: A
A \/ B
A, B: Prop
H: B
A \/ B
  -A, B: Prop
H: A
A \/ B apply or_introl.A, B: Prop
H: A
A assumption.
  -A, B: Prop
H: B
A \/ B apply or_intror.A, B: Prop
H: B
B assumption.
Qed.

So far, this proof looks reasonnable. But what if we decided to re-order the cases of Choice?

Inductive Choice (A B: Prop): Prop :=
| ChoiceB  : B -> Choice A B (* now ChoiceB comes first! *)
| ChoiceA  : A -> Choice A B.

Let's run the same proof again, and see what happens:

Theorem choice_or : forall A B, Choice A B -> A \/ B.forall A B : Prop, Choice A B -> A \/ B
Proof.forall A B : Prop, Choice A B -> A \/ B
  intros A B HChoice.A, B: Prop
HChoice: Choice A B
A \/ B
  destruct HChoice.A, B: Prop
H: B
A \/ B
A, B: Prop
H: A
A \/ B
  -A, B: Prop
H: B
A \/ B apply or_introl.A, B: Prop
H: B
A assumption.No such assumption.
A, B: Prop
H: B
A

Uh oh, our proof broke. The prover attempts to run the same proof as before, but this time the "first" constructor is ChoiceB (where "first constructor" means the constructor appearing first in the definition of the inductive), so that's the case the first bullet point applies to, and the proof fails. Fixing the problem (assuming that we quickly understand what happened, which can be quite tricky!) requires flipping the two bullets. This is frustrating, because the content of the proof did not change at all! And even worse: this problem repeats for every proof that destructed a Choice hypothesis, requiring manual edits across the whole code base despite no real meaningful change.

This problem happens all the time: when reordering constructors, but also when adding new constructors (everything gets shifted by one), or when changing automation so that old goals are removed or new goals added.

Fundamentally, the problem is that bullets follow the order in which constructors were defined, and have no semantic meaning by themselves. This is counter-intuitive: we want to be able to state what we prove in each case without being restricted to treating each constructor in a specific order. It also makes proofs hard to read: without running the proof, you can't tell what case a bullet is about.

Some developments such as Software Foundations annotate their proofs with comments to show which case the bullet is supposed to match:

Theorem choice_or : forall A B, Choice A B -> A \/ B.forall A B : Prop, Choice A B -> A \/ B
Proof.forall A B : Prop, Choice A B -> A \/ B
  intros A B HChoice.A, B: Prop
HChoice: Choice A B
A \/ B
  destruct HChoice.A, B: Prop
H: B
A \/ B
A, B: Prop
H: A
A \/ B
  - (* ChoiceB *)A, B: Prop
H: B
A \/ B apply or_intror.A, B: Prop
H: B
B assumption.
  - (* ChoiceA *)A, B: Prop
H: A
A \/ B apply or_introl.A, B: Prop
H: A
A assumption.
Qed.

This helps in reading the proof script, but not so much in maintaining it: if the proof breaks, the comments act as best as a hint to what happened, but the fix remains error-prone and time-consuming. By any metric, this is not a satisfactory solution.

Named goals

There's a lesser known but more robust alternative to selecting goals, which is to select them through their name. Indeed, it's possible in Rocq to a give a name to a goal, and refer to it later in the proof script by using explicitly named existential variables:

Theorem choice_or : forall A B, Choice A B -> A \/ B.forall A B : Prop, Choice A B -> A \/ B
Proof.forall A B : Prop, Choice A B -> A \/ B
  intros A B HChoice.A, B: Prop
HChoice: Choice A B
A \/ B
  destruct HChoice.A, B: Prop
H: B
A \/ B
A, B: Prop
H: A
A \/ B
  (* The syntax `?[name]` introduces an explicit name for the goal. *)
  1: refine ?[ChoiceB].A, B: Prop
H: B
ChoiceBA \/ B
A, B: Prop
H: A
A \/ B
  2: refine ?[ChoiceA].A, B: Prop
H: B
ChoiceBA \/ B
A, B: Prop
H: A
ChoiceAA \/ B
  (* Now, we can select them by their name. *)
  [ChoiceB]: {A, B: Prop
H: B
ChoiceBA \/ B right.A, B: Prop
H: B
B assumption. }A, B: Prop
H: A
ChoiceAA \/ B
  [ChoiceA]: {A, B: Prop
H: A
ChoiceAA \/ B left.A, B: Prop
H: A
A assumption. }
Qed.

refine ?[ChoiceB] gives the name ChoiceB to the first goal, which can then be referenced by the goal selector [ChoiceB]. Ignoring the clunky syntax used to name the goals, this is quite a bit better than our previous attempt! With named goals, we connect inductive constructors with their corresponding cases in the proof.

Named goals also come with several other advantages:

Labels are integrated in the goal selector, and thus comments are no longer necessary.
The proof structure is the same as in the version using bullets, but is more readable.
Cases can be treated in any order.
Adding a new constructor will not silently break proofs, but instead Rocq will predictably inform us that there is a missing subgoal in the proof.

More precisely: named goals would have all these advantages… if there was an easy, order-independent way to name goals.

Unfortunately, in the current state of Rocq, we are doubly stuck. First, we still have to assign the ?[ChoiceB] and ?[ChoiceA] names explicitly using goal numbers (1: and 2: above). We could use a semicolon, as in destruct H; [refine ?[ChoiceB] | refine ?[ChoiceA]], but it wouldn't help either: the branches of the [] are also ordered. Second, named goals suffer from poor usability, have a clunky syntax, and are not easy to manipulate, making them essentially never used in practice [1]. This calls for a better alternative.

Automatically generated goal names

The issue in the version of the proof using named goals is related to the fact that the user has to provide names themselves, making them prone to breakages when constructors are re-ordered. However, the names we chose could just as well have been generated by Rocq itself, since they matched constructor names.

This is the main idea behind automatically generated goal names. Generating goal names eases proof maintenance and improves readability of proof scripts, at very little cost.

Generated names

There are two main places where we can find names to use as goal names:

When doing case analysis/induction: we use constructor names as goal names. For instance, our previous proof now looks like this:

Theorem choice_or : forall A B, Choice A B -> A \/ B.forall A B : Prop, Choice A B -> A \/ B
Proof.forall A B : Prop, Choice A B -> A \/ B
  intros A B HChoice.A, B: Prop
HChoice: Choice A B
A \/ B
  destruct HChoice.A, B: Prop
H: B
ChoiceBA \/ B
A, B: Prop
H: A
ChoiceAA \/ B
  [ChoiceB]: {A, B: Prop
H: B
ChoiceBA \/ B apply or_intror.A, B: Prop
H: B
B assumption. }A, B: Prop
H: A
ChoiceAA \/ B
  [ChoiceA]: {A, B: Prop
H: A
ChoiceAA \/ B apply or_introl.A, B: Prop
H: A
A assumption. }
Qed.

From named hypotheses: when applying a theorem or constructor that has named hypotheses, it is natural to use the hypothesis name as the goal name. These can be usual variables, as in the following example:

Theorem even_or_odd :
  forall n : nat, even n \/ odd n.forall n : nat, even n \/ odd n
Proof.forall n : nat, even n \/ odd n
  eapply well_founded_ind.
Rwfwell_founded ?R
forall x : nat,
(forall y : nat, ?R y x -> even y \/ odd y) ->
even x \/ odd x
  (* We must fill in `R` *)
  [R]: exact lt.
Rwfwell_founded lt
forall x : nat,
(forall y : nat, y < x -> even y \/ odd y) ->
even x \/ odd x
  (* ... *)

They can also be named hypotheses: examples of such hypotheses abound, be it in theorem statements (taken from ZMicromega.v; notice the AGREE hypothesis):

Lemma agree_env_eval_nformula :
  forall env env' e
    (AGREE : agree_env (max_var xH (fst e))  env env'),
    eval_nformula env e  <->  eval_nformula env' e.Pol, PolEnv: Type -> Type
Op1: Type
max_var: positive -> Pol Z -> positive
agree_env: positive -> PolEnv Z -> PolEnv Z -> Prop
NFormula:= fun C : Type => (Pol C * Op1)%type: Type -> Type
eval_nformula: PolEnv Z -> NFormula Z -> Prop
forall (env env' : PolEnv Z) (e : Pol Z * Op1),
agree_env (max_var 1 (fst e)) env env' ->
eval_nformula env e <-> eval_nformula env' e

… or in sections with Context or Hypothesis commands (this one is from Permut.v, all the way back in 1995!):

Section Axiomatisation.
  Variable U : Type.
  Variable op : U -> U -> U.
  Variable cong : U -> U -> Prop.

  Hypothesis op_comm : forall x y : U,
    cong (op x y) (op y x).
  Hypothesis op_ass : forall x y z : U,
    cong (op (op x y) z) (op x (op y z)).
  Hypothesis cong_left : forall x y z : U,
    cong x y -> cong (op x z) (op y z).
  Hypothesis cong_right : forall x y z : U,
    cong x y -> cong (op z x) (op z y).
  Hypothesis cong_trans : forall x y z : U,
    cong x y -> cong y z -> cong x z.
  Hypothesis cong_sym : forall x y : U,
    cong x y -> cong y x.

… or in constructors, where it's common practice to use all-caps (this one is from Compcert's backend/Mach.v):

Inductive wf_frame: stackframe -> Prop :=
| wf_stackframe_intro: forall fb sp ra c f
    (CODE: Genv.find_funct_ptr ge fb = Some (Internal f))
    (LEAF: is_leaf_function f = false)
    (TAIL: is_tail c f.(fn_code)),
  wf_frame (Stackframe fb sp ra c).

In fact, support for the induction tactic comes from named hypotheses, since induction applies the induction scheme generated by Rocq for the inductive. Therefore, we only have to change the hypothesis names in the induction scheme to get proper goal names:

(* Notice the `ChoiceB` and `ChoiceA` hypotheses, which were
   previously named `f` and `f0`, and the use of `Intro.` to
   qualify the constructors in the match. *)
Print Choice_ind.Choice_ind =
fun (A B P : Prop) (ChoiceB : B -> P)
  (ChoiceA : A -> P) (c : Choice A B) =>
match c with
| Swapped.ChoiceB _ _ b => ChoiceB b
| Swapped.ChoiceA _ _ a => ChoiceA a
end
     : forall A B P : Prop,
       (B -> P) -> (A -> P) -> Choice A B -> P

Arguments Choice_ind (A B P)%_type_scope
  (ChoiceB ChoiceA)%_function_scope 
  c

With those changes, all proofs that rely on induction, destruct, eapply (or other e-based tactics), or lemmas with named hypotheses can immediately be rewritten in a more natural and pleasant style!

Isn't this a breaking change?

As we are introducing new names in all proofs and changing existing ones in induction schemes, it is natural to ask whether this breaks existing proofs, and whether people were relying on the default premise names in induction schemes.

It turns out that, as mentionned before, named goals are so rare in practice that generated names do not break existing proofs, but rather augment them with additional ways to select goals. Across the 100 projects that are present in Rocq's CI, two projects required updates to their test suites, and generating goal names triggered only one bug in VST which is unrelated to our implementation.

Dealing with ambiguity: qualified names

When doing nested case analysis or induction, it is often the case that we're destructing the same inductive multiple times, generating identical names for our subgoals. For example, destructing two Choice values yields conflicting names:

Theorem two_choices_or : forall A B C,
  Choice (Choice A B) C -> (A \/ B) \/ C.forall A B C : Prop,
Choice (Choice A B) C -> (A \/ B) \/ C
Proof.forall A B C : Prop,
Choice (Choice A B) C -> (A \/ B) \/ C
  intros A B C HChoice.A, B, C: Prop
HChoice: Choice (Choice A B) C
(A \/ B) \/ C
  destruct HChoice.A, B, C: Prop
H: C
ChoiceB(A \/ B) \/ C
A, B, C: Prop
H: Choice A B
ChoiceA(A \/ B) \/ C
  [ChoiceA]: destruct H.A, B, C: Prop
H: C
ChoiceB(A \/ B) \/ C
A, B, C: Prop
H: B
ChoiceA.ChoiceB(A \/ B) \/ C
A, B, C: Prop
H: A
ChoiceA.ChoiceA(A \/ B) \/ C
  (* Oops, we now have two ChoiceB cases... *)

In informal mathematics, one usually resolves the ambiguity by stating the "full name" of the case, saying "the right case of the left case" to differentiate it from "the right case".

Our idea here is the same: goal names are now qualified by their context, similarly to how definitions have a fully qualified name representing their surrounding context (module, file, etc.). However, contrary to other qualified names in Rocq, goals only have a single name, which is the shortest name by construction.

To represent qualified names, we use a (reversed) trie that uses the dot character to separate different qualification levels. For example, the proof above yields the following trie:

The trie representation of qualified goal names

When resolving ChoiceA.ChoiceB to the second subgoal in the trie, one reads the qualified name right-to-left, following first the ChoiceB arrow, and then the ChoiceA arrow. This representation gives us shortest names for free, since one can stop insertion at the first empty node.

Trying it out

All of the details described above have already been merged in Rocq, and will be available in Rocq 9.2. If you'd like to try out this feature today, you can install the latest dev version of Rocq using opam by pinning the rocq repository, like so:

opam switch create rocq-master 4.14.0
eval $(opam env --switch=rocq-master)
opam pin --yes git+https://github.com/rocq-prover/rocq.git#master

You can then freely experiment with this feature by adding Set Generate Goal Names to the top of your files.

Future work

One natural extension to named goals is to consider wildcards, i.e., selectors of the form [A.*] or [*.B]. These extensions should be easy to implement given the current implementation, but it is not clear however what should be the semantics of wildcards. In particular, the semantics are non-obvious if we allow several wildcards: depending on semantics, [*.*] could either select all goals, or all goals with exactly 2 levels of qualification.

Another line of work could be to change generated hypothesis names to improve readability of proof scripts. It is very common to rely on provided names such as H or H0 in proof scripts, because these names are more convenient to use than other options. Some proofs however could benefit from better hypothesis names: for instance, our proof of choice_or generated an hypothesis H in both cases, but it would be more natural to get HChoiceA and HChoiceB instead. Some interesting work in this area includes Pierre Courtieu's LibHyps library, which allows to methodically manipulate and rename introduced hypotheses. (Note, however, that unlike the work in this post any change that affects hypothesis names is bound to break a lot of proofs.)

Conclusion

Named goals provide a robust method to select goal names, but become much more powerful and usable when names are automatically generated. They yield readable proofs that do not break under re-ordering or addition of inductive constructors, improving on current methods of goal selection and making proof scripts more reliable and readable.

Systems and
Formalisms Lab

Automatically generated goal names

The problem

Named goals

Automatically generated goal names

Generated names

Isn't this a breaking change?

Dealing with ambiguity: qualified names

Other Quality-of-Life improvements

Trying it out

Future work

Conclusion