How to write a type-safe unwrap (aka fromJust)

Tips and tricks for writing functions that take proofs as arguments.


Let's say you've written a programming language in Coq. You have nice inductives for your ASTs; one for untyped terms (UntypedAST) and one for typed terms (TypedAST). You wrote a simple typechecker, and maybe an interpreter, too!

typecheck
     : UntypedAST -> option TypedAST
interp
     : TypedAST -> Value
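
The post never shows the definitions behind these two signatures. The following is one possible minimal reconstruction, guessed from the constructor and field names that appear in the outputs printed later on. The helper names denote, as_nat, as_bool and interp' are hypothetical, and the real development may well be organized differently.

```coq
(* Hypothetical reconstruction: denote, as_nat, as_bool and interp'
   are assumed names, not taken from the post. *)
Inductive Tau := Nat | Bool.

Inductive UntypedAST : Type :=
| UNat (n: nat)
| UBool (b: bool)
| UAdd (e1 e2: UntypedAST)
| UAnd (e1 e2: UntypedAST).

(* Typed terms are indexed by their type, so ill-typed terms cannot be built. *)
Inductive TypedAST' : Tau -> Type :=
| TNat (n: nat) : TypedAST' Nat
| TBool (b: bool) : TypedAST' Bool
| TAdd (e1 e2: TypedAST' Nat) : TypedAST' Nat
| TAnd (e1 e2: TypedAST' Bool) : TypedAST' Bool.

Record TypedAST := { tau: Tau; ast: TypedAST' tau }.

(* Downcasts from a packed term to a term of a known type. *)
Definition as_nat (t: TypedAST) : option (TypedAST' Nat) :=
  match tau t as ty return TypedAST' ty -> option (TypedAST' Nat) with
  | Nat => fun a => Some a
  | Bool => fun _ => None
  end (ast t).

Definition as_bool (t: TypedAST) : option (TypedAST' Bool) :=
  match tau t as ty return TypedAST' ty -> option (TypedAST' Bool) with
  | Nat => fun _ => None
  | Bool => fun a => Some a
  end (ast t).

Fixpoint typecheck (u: UntypedAST) : option TypedAST :=
  match u with
  | UNat n => Some {| tau := Nat; ast := TNat n |}
  | UBool b => Some {| tau := Bool; ast := TBool b |}
  | UAdd e1 e2 =>
    match typecheck e1, typecheck e2 with
    | Some t1, Some t2 =>
      match as_nat t1, as_nat t2 with
      | Some a1, Some a2 => Some {| tau := Nat; ast := TAdd a1 a2 |}
      | _, _ => None
      end
    | _, _ => None
    end
  | UAnd e1 e2 =>
    match typecheck e1, typecheck e2 with
    | Some t1, Some t2 =>
      match as_bool t1, as_bool t2 with
      | Some a1, Some a2 => Some {| tau := Bool; ast := TAnd a1 a2 |}
      | _, _ => None
      end
    | _, _ => None
    end
  end.

(* Maps each object-language type to the Coq type of its values. *)
Definition denote (ty: Tau) : Type :=
  match ty with Nat => nat | Bool => bool end.

Record Value := { vtau: Tau; val: denote vtau }.

Fixpoint interp' {ty} (a: TypedAST' ty) : denote ty :=
  match a in TypedAST' ty' return denote ty' with
  | TNat n => n
  | TBool b => b
  | TAdd e1 e2 => interp' e1 + interp' e2
  | TAnd e1 e2 => andb (interp' e1) (interp' e2)
  end.

Definition interp (t: TypedAST) : Value :=
  {| vtau := tau t; val := interp' (ast t) |}.
```

With definitions along these lines, typecheck returns None as soon as a subterm has the wrong type, which is the behavior the rest of the post relies on.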

You write a few programs…

Example well_typed := UAdd (UNat 1) (UNat 1).
Example ill_typed := UAdd (UNat 1) (UBool true).

… typecheck them:

Definition tc_good := typecheck well_typed.
Compute tc_good. (* Accepted: So far so good. *)
= Some
    {|
      tau := Nat; ast := TAdd (TNat 1) (TNat 1)
    |}
: option TypedAST

Definition tc_bad := typecheck ill_typed.
Compute tc_bad. (* Rejected: all good as well. *)
= None
: option TypedAST

… and attempt to run them:

Compute interp tc_good.
The term "tc_good" has type "option TypedAST"
while it is expected to have type "TypedAST".

D'oh! interp takes a TypedAST, but typecheck returns an option. What do we do?

We can write a simple wrapper though, with a default value for the None case:

Definition unwrap_default (o: option TypedAST) : TypedAST :=
  match o with
  | Some t => t
  | None => {| ast := TNat 1 |}
  end.

Compute interp (unwrap_default tc_good).
= {| vtau := Nat; val := 2 |}
: Value

But now we silently swallow type errors, which isn't ideal:

Compute interp (unwrap_default tc_bad).
= {| vtau := Nat; val := 1 |}
: Value

Let's see how we can get a safe but convenient version of unwrap (aka fromJust in the Haskell world and Option.get in OCaml).

Take 1: Pass a proof as an extra argument

The most straightforward way is to generalize unwrap by adding a proof that its argument is not None:

 Definition unwrap {A} (o: option A)
            (not_none: o <> None) : A :=
   match o return _ = o -> A with
   | Some a => fun _ => a
   | None => fun is_none => False_rect _ (not_none is_none)
   end eq_refl.

… it works, but it's not much fun for callers:

Compute interp (unwrap tc_good
  (fun some_eq_none =>
    @eq_rect_r (option TypedAST) None
               (fun o: option TypedAST =>
                  if o then False else True)
               I tc_good some_eq_none)).
= {| vtau := Nat; val := 2 |}
: Value

We can improve things slightly with tactics in terms:

Compute interp (unwrap tc_good ltac:(discriminate)).
= {| vtau := Nat; val := 2 |}
: Value
Compute interp (unwrap tc_bad ltac:(discriminate)).
Not a discriminable equality.

… but the generated terms are not pretty, so if you ever store them unreduced anywhere, you're in for all sorts of unpleasantness:

Check (unwrap tc_good ltac:(discriminate)).
unwrap tc_good
  ((fun H : tc_good = None =>
    let H0 : False :=
      eq_ind tc_good
        (fun e : option TypedAST =>
         match e with
         | Some _ => True
         | None => False
         end) I None H in
    False_ind False H0)
   :
   tc_good <> None)
     : TypedAST

Not great. Still, here is another example for comparison, this time using known-good indices into a list:

Definition nth_in_bounds {A} (l: list A) (n: nat)
           (in_bounds: n < List.length l) :=
  unwrap (List.nth_error l n)
         (proj2 (List.nth_error_Some l n) in_bounds).

Compute nth_in_bounds [1; 2; 3] 2
           ltac:(repeat constructor).
= 3
: nat

Note that (maybe surprisingly) the computation doesn't block, even though the definition of nth_in_bounds uses an opaque proof, List.nth_error_Some. The reason is that unwrap never actually inspects the proof when its argument reduces to a Some. More generally, proofs rarely block computation, because Coq disallows eliminating informative Props into Type: programs that return non-Prop results can't inspect proofs, except non-informative ones such as equality proofs (eq_refl).
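
To see this in isolation, here is a small sketch that repeats the definition of unwrap from above and feeds it proofs that cannot be unfolded. The lemma name some_not_none and the section name AbstractProof are made up for the illustration.

```coq
Definition unwrap {A} (o: option A)
           (not_none: o <> None) : A :=
  match o return _ = o -> A with
  | Some a => fun _ => a
  | None => fun is_none => False_rect _ (not_none is_none)
  end eq_refl.

(* Ending the proof with Qed makes it opaque: reduction cannot unfold it. *)
Lemma some_not_none {A} (a: A) : Some a <> None.
Proof. discriminate. Qed.

(* Reduces to 42: the Some branch never touches the proof argument. *)
Compute unwrap (Some 42) (some_not_none 42).

Section AbstractProof.
  (* A completely abstract proof (a section variable) works just as well. *)
  Variable pf : Some 42 <> None.
  Compute unwrap (Some 42) pf.
End AbstractProof.
```

Both Compute commands reduce all the way to 42, because the only branch that mentions not_none is the None branch, which is never taken here.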

What happens if we just skip the proof completely?

Compute interp (unwrap tc_good _).
= {| vtau := Nat; val := 2 |}
: Value

Huh?

The reason this works is that the definition of unwrap never really uses the proof — it only refers to it when deriving a contradiction, in the None branch — but the proof guarantees that this branch is unreachable! Writing (unwrap tc_good _) produces an open term (a term with holes), but Compute knows how to reduce those, so it proceeds without complaining.

Of course, trying to unwrap None doesn't go as smoothly: instead, Compute blocks on the unspecified proof that None <> None:

Compute (unwrap None _).
     = match ?not_none eq_refl return ?A with
       end
     : ?A
where
?A : [ |- Type]
?not_none : [ |- None <> None]

And in recent Coq versions, this _ trick can even be used in definitions:

Definition good :=
  Eval compute in interp (unwrap tc_good _).

Oh well. Still a nice party trick.

Take 2: Use an equality proof

The main pain point in the previous example was the complexity of the proof terms, so let's simplify them. Instead of proving o <> None, we'll prove that is_some o = true, and the proof will always be eq_refl:

 Definition is_some {A} (o: option A) : bool :=
   if o then true else false.

 Lemma is_some_not_none {A} {o: option A} :
   is_some o = true -> o <> None.
 Proof.
   (* Goal: is_some o = true -> o <> None *)
   destruct o.
   (* Goal 1 (a: A): is_some (Some a) = true -> Some a <> None
      Goal 2: is_some None = true -> None <> None *)
   all: cbn.
   (* Goal 1 (a: A): true = true -> Some a <> None
      Goal 2: false = true -> None <> None *)
   all: congruence.
 Qed.

Now we can define a new variant of unwrap:

 Definition unwrap_dec {A} (o: option A)
            (is_some_true: is_some o = true) : A :=
   unwrap o (is_some_not_none is_some_true).

 Compute interp (unwrap_dec tc_good eq_refl).
= {| vtau := Nat; val := 2 |}
: Value

Much nicer! Now the proof is always the same, and we can even define a notation to hide it:

Notation unwrap_dec' o := (unwrap_dec o eq_refl).

Here's how it looks for list indices:

Definition nth_in_bounds_dec {A} (l: list A) (n: nat)
           (lt_true: (n <? List.length l) = true) :=
  nth_in_bounds l n (proj1 (Nat.ltb_lt _ _) lt_true).

  Compute nth_in_bounds_dec [1; 2; 3] 2 eq_refl.
= 3
: nat

One significant advantage of this strategy is that we can control the reduction strategy used to check that eq_refl has the right type (ensuring that the application of unwrap_dec is well-typed requires checking that eq_refl: is_some _ = true, which requires reducing is_some _ to unify it with true). Concretely, we can write (@eq_refl bool true : is_some tc_good = true) to use normal unification, (@eq_refl bool true <: …) to call vm_compute, and (@eq_refl bool true <<: …) to call native_compute.
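
As a small sketch of these cast forms, using a literal option instead of tc_good so that the snippet stands on its own (is_some is repeated from above):

```coq
Definition is_some {A} (o: option A) : bool :=
  if o then true else false.

(* Plain cast: the type is checked with the default conversion. *)
Check (@eq_refl bool true : is_some (Some 1) = true).

(* VM cast: the conversion check is delegated to vm_compute.
   Writing <<: instead would request native_compute, where available. *)
Check (@eq_refl bool true <: is_some (Some 1) = true).

(* Whatever the strategy, a cast to a false statement is rejected. *)
Fail Check (@eq_refl bool true <: is_some (@None nat) = true).
```

On inputs this small the choice of cast makes no observable difference; it starts to matter when deciding is_some requires a lot of computation.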

As before, though, the proof term that we're passing is in fact dead code, and the error messages are not ideal:

Compute interp (unwrap_dec tc_good _).
= {| vtau := Nat; val := 2 |}
: Value
Compute interp (unwrap_dec None eq_refl).
The term "eq_refl" has type
 "is_some None = is_some None"
while it is expected to have type
 "is_some None = true".
Compute interp (unwrap_dec None _).
     = {|
         vtau :=
           match
             match
               is_some_not_none ?is_some_true eq_refl
               return TypedAST
             with
             end
           with
           | {| tau := tau |} => tau
           end;
         val :=
           (fix Ffix
              (x : Tau) (x0 : TypedAST' x) {struct x0} :
                match x with
                | Nat => nat
                | Bool => bool
                end :=
              match
                x0 in (TypedAST' H)
                return
                  match H with
                  | Nat => nat
                  | Bool => bool
                  end
              with
              | TNat n => n
              | TBool b => b
              | TAdd e1 e2 =>
                  (fix Ffix0
                     (x1 x2 : nat) {struct x1} :
                       nat :=
                     match x1 with
                     | 0 => x2
                     | S x3 => S (Ffix0 x3 x2)
                     end) (Ffix Nat e1) (Ffix Nat e2)
              | TAnd e1 e2 =>
                  if Ffix Bool e1
                  then Ffix Bool e2
                  else false
              end)
             match
               match
                 is_some_not_none ?is_some_true
                   eq_refl return TypedAST
               with
               end
             with
             | {| tau := tau |} => tau
             end
             match
               match
                 is_some_not_none ?is_some_true
                   eq_refl return TypedAST
               with
               end as c
               return
                 (TypedAST'
                    match c with
                    | {| tau := tau |} => tau
                    end)
             with
             | {| ast := ast |} => ast
             end
       |}
     : Value
where
?is_some_true : [ |- is_some None = true]

Take 3: Use a dependent return type

We know that we only intend to call unwrap with arguments that reduce to Some _. We can make this explicit in the return type, instead of changing the arguments:

Inductive error : string -> Type := Err (s: string) : error s.

Definition unwrap_dep {A} (o: option A)
  : if o then A else error _ :=
  match o with
  | Some a => a
  | None => Err "Expecting Some, got None"
  end.

Compute interp (unwrap_dep tc_good).
= {| vtau := Nat; val := 2 |}
: Value

Here we're saying that we'll return an A if given a Some, and an error otherwise. And indeed, the error messages are much nicer:

Compute interp (unwrap_dep tc_bad).
The term "unwrap_dep tc_bad" has type
 "if tc_bad
  then TypedAST
  else error "Expecting Some, got None""
while it is expected to have type "TypedAST".

Here's how it looks for list indices:

Definition nth_in_bounds_dep {A} (l: list A) (n: nat)
  : if lt_dec n (List.length l) then A else error _ :=
  match lt_dec n (List.length l) as cmp
    return (if cmp then A else error _) with
  | left in_bounds => nth_in_bounds l n in_bounds
  | right _ => Err "Index is out of bounds"
  end.

Compute nth_in_bounds_dep [1; 2; 3] 2.
= 3
: if lt_dec 2 (Datatypes.length [1; 2; 3])
  then nat
  else error "Index is out of bounds"
Compute nth_in_bounds_dep [1; 2; 3] 7.
= Err "Index is out of bounds"
: if lt_dec 7 (Datatypes.length [1; 2; 3])
  then nat
  else error "Index is out of bounds"

The commands above reduce values, but not their types: notice how the type of nth_in_bounds_dep [1; 2; 3] 2, for example, is printed as if lt_dec 2 (Datatypes.length [1; 2; 3]) then nat else error …. If you want to reduce types as well, the simplest approach is to use tactics in terms:

Notation compute_all term :=
  ltac:(let term := (eval compute in term) in
        exact_no_check term) (only parsing).

Check (compute_all (nth_in_bounds_dep [1; 2; 3] 2)).
3
     : nat

The ltac:(…) part says that we're going to derive a term using a proof script, and the call to exact_no_check supplies the term that we want, which we obtained using the eval Ltac primitive.

A puzzle for expert readers: Is it possible to write a version of this simplification tactic which reduces just the type (not the term) without adding a type annotation? In other words, can you do better than the following, which leaves a cast in the term?

Notation compute_in_type term :=
  ltac:(let type := type of term in
        let type := (eval compute in type) in
        exact_no_check (term: type)) (only parsing).

Check (compute_in_type (nth_in_bounds_dep [1; 2; 3] 2)).
nth_in_bounds_dep [1; 2; 3] 2 : nat
     : nat

(The only way that I know is using a Definition or a let binding.)

Bonus 1: Using unification

After I wrote this post, my colleague Jason Gross showed me another quite clever implementation of unwrap, leveraging inference:

Notation unwrap_refl o :=
  ((fun v (pf : o = Some v) => v) _ eq_refl) (only parsing).

Compute unwrap_refl tc_good.
= {| tau := Nat; ast := TAdd (TNat 1) (TNat 1) |}
: TypedAST

The trick here is to force unification to infer the value inside the option: Coq will unify o = Some ?v (the type of pf) with ?a = ?a (the type of eq_refl), and instantiate ?v in passing, which the function then returns. Nifty!
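
For a self-contained illustration of both outcomes, here is a sketch that applies the same notation to literal options rather than to tc_good:

```coq
Notation unwrap_refl o :=
  ((fun v (pf : o = Some v) => v) _ eq_refl) (only parsing).

(* Unifying [Some 3 = Some ?v] with the type of eq_refl instantiates ?v to 3. *)
Compute unwrap_refl (Some 3).

(* No choice of ?v makes [None = Some ?v] an instance of [?a = ?a],
   so type checking fails and the command is rejected. *)
Fail Compute unwrap_refl (@None nat).
```

The failure in the second command happens at type-checking time, before any reduction is attempted.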

Bonus 2: Using tactics in terms

Here's one final way to proceed with this, using tactics in terms:

Notation unwrap_tac opt :=
  ltac:(match (eval hnf in opt) with
        | Some ?v => exact v
        | ?other => fail "Error:" other "isn't [Some _]"
        end) (only parsing).

In practice, it works OK, but hnf is very slow (it's based on the same code as simpl). The cbv tactic and its faster cousins like vm_compute and native_compute are usually faster, but they get very costly if the terms are large and don't need to be fully normalized to determine whether we're in the Some or None case (think of a case like Some (very large term), where hnf will be free and cbv very slow).

Knowing this, it's a bit easier to understand why the unwrap_dec trick above works well: the type check that ensures that eq_refl has type is_some opt = true is essentially computing the head-normal form of opt and comparing it to Some, but it does that using Coq's fast reduction tactics. In fact, Jason has done a lot of work on exploring alternative strategies that combine reflection and fast full-reduction tactics such as vm_compute or native_compute to give fine-grained control over reduction.

Note

Jason correctly points out that this notation won't give you great error messages if you pass it terms with typos:

Compute (unwrap_tac (Som 1)).
The variable opt was not found in the current
environment.

One way around this is to tweak the notation to force it to typecheck its argument before passing it into the tactic, like this:

Notation unwrap_tac' opt :=
  (match opt with _ =>
   ltac:(match (eval hnf in opt) with
         | Some ?v => exact v
         | ?other => fail "Error:" other "isn't [Some _]"
         end) end) (only parsing).

Compute (unwrap_tac' (Som 1)).
The reference Som was not found in the current
environment.

Using a match instead of another construct like let _ := opt in … ensures that we don't pollute the term (the match will self-reduce without requiring an explicit reduction).