Posted by Jesper on July 30, 2024

As I wrote in a previous blog post, agda2hs is a tool for producing verified and readable Haskell code by extracting it from a (lightly annotated) Agda program. In this blog post, I will dig a bit deeper into how agda2hs decides which parts of an Agda program to compile to Haskell, and which parts to drop. In the second part, I also want to dream a bit about how ornaments (or something like it) could help to reason in a principled way about some of the features of agda2hs that currently feel rather ad-hoc.

Erasure annotations in `agda2hs`

agda2hs relies heavily on erasure annotations (a.k.a. the 0 quantity from Quantitative Type Theory) to determine which parts of an Agda program should be preserved during compilation, and which parts are just “for show”, i.e. enforcing some kind of invariant that is not enforced statically on the Haskell side.

As an example, we can define a type of well-scoped syntax as follows:

data Term (@0 xs : List Name) : Set where
  TVar : (@0 x : Name) → Var x xs → Term xs
  TApp : Term xs → Term xs → Term xs
  TLam : (@0 x : Name) → Term (x ∷ xs) → Term xs

This translates to the following Haskell code:

data Term = TVar In | TApp Term Term | TLam Term

The big advantage we get out of these (erased) indices is that we can define functions that are guaranteed to respect the scope of terms by Agda’s type system, which are then translated to Haskell functions operating on this bare type of terms (which would be tricky to get right by hand). For example, a substitution function might have type

substTop : ∀ {@0 x xs} → Term xs → Term (x ∷ xs) → Term xs
substTop = ⋯

making it immediately obvious which term is substituted for which variable.

Using Agda’s erasure annotations for indicating which parts of the Agda program agda2hs should remove during its translation is nice for several reasons:

Agda ensures that we never accidentally use or pattern match on an argument that doesn’t appear in the generated Haskell code.
We can be very precise in which parts of the Agda code should (not) appear in the generated Haskell code, including for higher-order functions. For example, in the definition of if_then_else_ of agda2hs, the branches are provided erased proofs that the boolean condition is respectively True or False.
Since erasure annotations are part of the Agda type system, the information about which arguments of a type or function are erased is propagated automatically throughout the program.

Still, the fit is not perfect: agda2hs implements additional checks to ensure that not too many arguments are marked as erased. For example, the following datatype is perfectly valid Agda code

data Box (@0 a : Set) : Set where
  MkBox : a → Box a

but would translate to the following Haskell code with an out-of-scope type variable:

data Box = MkBox a

To avoid this problem, agda2hs requires that every type variable that appears in the generated Haskell code is not marked as erased.

Another subtler problem pops up if we allow erased type arguments to functions:

cast : {a b : Set} → @0 a ≡ b → a → b
cast refl x = x

which translates to the ill-typed Haskell

cast : a → b
cast x = x

The way agda2hs prevents this second example is by requiring additionally that no non-erased pattern variables are forced by pattern matching. In the example, the pattern match on refl forces the (implicit) pattern variable b to be equal to a, so the two rules together rule out this definition of cast.

These restrictions seem to do their job, but are hardly satisfactory. The root of the problem seems to be a mismatch between the intented use of erasure annotations and how agda2hs actually uses them: erasure annotations assume that the code is compiled to an untyped language and hence that all types can be safely erased, but agda2hs is compiling Agda code to a typed language. In other words, agda2hs uses erasure not to erase types, but merely to erase dependencies.

An interesting question would thus be whether we can have an alternative version of erasure annotations that do not assume by default that all types can be erased. A first step would be to change the typing rule for top-level type signatures so they cannot refer to erased things by default. But perhaps further changes are needed too. Alternatively, it would be interesting to investigate whether a different modality such as shape irrelevance would be a better fit for the purposes of agda2hs.

Erasing more than just arguments

Erasure annotations in Agda allow us to annotate arguments to functions, top-level definitions, parameters and indices to data types, arguments to constructors, and even specific constructors themselves. However, there are some things that we would like to annotate in agda2hs that go beyond the current capabilities of Agda. Let’s take a look at two examples.

Sometimes when working in a dependently typed language, we like to pair a value of some type a together with a proof of a property p of this value. Since we don’t want proofs to show up in our Haskell code, we mark it as erased:

record ∃ (a : Set) (@0 p : a → Set) : Set where
  constructor _⟨_⟩
  field
    value    : a
    @0 proof : p value
open ∃ public

For example, we could define the Var type from before as a natural number together with a proof that looking up the element of that position results in the given name:

--Var : Name → List Name → Set
Var x xs = ∃ Int λ n → xs !? n ≡ Just x

This is also known as a refinement type from systems such as Liquid Haskell, though Agda does not provide the same level of automation.

In the generated Haskell code, we would like values of this type to show up not as having type ∃ a, but simply as having type a. To make this translation type-preserving, constructor applications u ⟨ p ⟩ should be translated to u, and value x should be translated to x. We can accomplish this with a pragma marking ∃ as an unboxed record type (which is probably a bad name, suggestions for better names are welcome):

{-# COMPILE AGDA2HS ∃ unboxed #-}

The presence of unboxed types introduces an interesting prospect: we can have (potentially many) different Agda types that are all translated to the same Haskell type. But this also means that we can have functions that only manipulate the “proof part” of their input but leave the value intact. For example:

mapProof : ∀ {@0 p q : a → Set}
          → @0 (∀ {x} → p x → q x)
          → ∃ a p → ∃ a q
mapProof f (x ⟨ p ⟩) = x ⟨ f p ⟩

When compiled normally, this produces the rather boring definition mapProof x = x. What’s worse, sometimes functions like this need to do some pattern matching purely to put proofs in the right places. For example, we can attach a proof to a value nested in a Maybe as follows:

refineMaybe : ∀ {@0 p : a → Set}
            → (mx : Maybe a)
            → @0 (∀ {x} → mx ≡ Just x → p x)
            → Maybe (∃ a p)
refineMaybe Nothing f = Nothing
refineMaybe (Just x) f = Just (x ⟨ f refl ⟩)

This compiles to the rather unfortunate Haskell

refineMaybe :: Maybe a → Maybe a
refineMaybe Nothing = Nothing
refineMaybe (Just x) = Just x

To avoid having these identity functions showing up in the Haskell code, agda2hs allows you to mark a function as being “transparent”:

{-# COMPILE AGDA2HS mapProof transparent #-}
{-# COMPILE AGDA2HS refineMaybe transparent #-}

When checking a transparent function, agda2hs checks that it takes exactly one (non-erased) argument and that each clause returns this argument unchanged (after erasure). It then does not generate any Haskell code for this function, but instead replaces each application of this transparent function by its argument (and each lone appearance by id).

In my biased experience, unboxed record types and transparent functions are both very useful features of agda2hs, and we use them extensively in the implementation of Agda Core. However, they also feel like ad-hoc fixes to a deeper problem. A more fundamental solution would be to introduce a notion of ornaments to the language, and provide a mechanism to identify functions that only modify the “data-logic” of an argument but preserve the “data-structure”.

I am obviously speculating here, but I believe - at least for the purposes of agda2hs - ornaments would be at their most useful if they could also be integrated at the type level. Concretely, the type system should be aware of the structure of each type, and for each function whether it modifies the structure or just the logical parts (i.e. whether or not it is the identity function after erasing all proofs). How such a type system would work concretely, I leave as an exercise to the reader (or perhaps as a topic for a future blog post).

As always, feel free to send comments on the Agda Zulip, to @jesper@agda.club via any Fediverse portal, or email me at jesper@sikanda.be.

On erasure annotations and agda2hs

Erasure annotations in agda2hs

Erasing more than just arguments

Erasure annotations in `agda2hs`