Jesper Cockx - Ten improvements to Agda's implementation

Posted by Jesper on January 1, 2024

Happy 2024! I have (foolishly) resolved to write one blog post per month in this new year, so expect more frequent and less polished posts in the coming year. To give myself a head start, here is the first of twelve blog posts.

To the surprise of no-one who has used Agda, there are many things that could be improved. However, it is easy to get lost in thinking about new features or bug fixes, and forget about more architectural changes that could improve the usability, maintainability, and efficiency of Agda. Hence here is a list of ten ways in which I believe the Agda language - as it currently exists - could be improved. I believe most of them would require a few weeks to a few months of dedicated developer time to implement (though as always take these estimations with a large grain of salt) and would be very worthwhile. Sadly, the days that I could just dedicate a few weeks to months to a refactoring seem to be over (though never say never), but if you are interested to work on any of these please reach out and I will help you to get started!

1. A modern API for IDE interaction

Agda’s interactive mode has always been one of its important and unique features, and still most other languages could learn a thing or two from how interactive case splitting works in Agda. However, the evolution of IDEs hasn’t stopped since and nowadays many things feel outdated (e.g. having to load a file manually) or are simply missing (e.g. automatic import management). Moreover, the way Agda’s interaction works conflicts with the LSP (which didn’t exist yet when Agda was released), making it very cumbersome to add support for Agda to more editors. An updated interaction model for Agda could make it easier to integrate Agda into more IDEs and to add new features that developers expect in 2024.

2. A hygienic and efficient reflection API

Agda’s reflection API is quite powerful and has a lot of potential for letting users implement their own tactics and interactive commands. However, it is being held back by its steep learning curve, its poor representation of variables, and its abysmal impact on performance. A more accessible and efficient implementation would open up many doors for libraries to provide more automation and convenience to their users.

3. Better diagnostics for unsolved metavariables

Many of the errors you get while writing Agda code have to do with unsolved metavariables and constraints. Currently it is a true art to track down these unsolved metavariables to the point in the code where they originated. Keeping track of the origins of metavariables and showing them to the user in a sensible format would go a long way towards giving concrete diagnostics for locating the source of these unsolved metavariables.

4. Discrimination trees and memoization for instance search

The current implementation of instance arguments in Agda uses a lookup table based on the head symbol of the return type of each instance candidate, but otherwise uses a linear search through all possible candidates. Worse, filtering out invalid candidates currently requires saving and rolling back the whole type-checking state of Agda, which is obviously not ideal for performance. A more rational implementation of instance search would use discrimination trees for narrowing down the valid instances, and use memoization of candidates during recursive instance search to avoid exponential or even non-terminating instance search.

5. Let bindings in the internal syntax

The internal syntax of Agda does not allow for representation of let-bindings or beta-redexes. This has some advantages for the parts of Agda working on the internal syntax, but comes with a heavy downside for efficiency, as all let-bindings and beta-redexes have to be inlined during elaboration to internal syntax. This destroys sharing and makes some error messages frustratingly difficult to read. A proper representation of let-bindings in the internal syntax would solve both problems.

6. A typeable internal representation of case trees

The internal representation of case trees in Agda has really annoyed me for a while, specifically how it deals with variables:

Arguments to the case tree are never explicitly introduced with a lambda or something similar, instead case trees are assumed to always take as many arguments as allowed by their type.
In contrast to the rest of Agda’s codebase, variables in a case tree are represented using de Bruijn levels instead of de Bruijn indices.
Dot patterns are counted as variables in the context, even though they are never used in the RHS.

Switching to a representation that fits better with the rest of Agda (using de Bruijn indices and explicitly introducing variables) would make it easier to double-check Agda’s internal syntax and to implement backends for typed languages.

7. Proper internal representation of `with` and `rewrite`

Left-hand side gadgets such as with and rewrite can be very useful to write Agda programs with fancy types without condemning yourself to writing excruciatingly complex helper functions. However, their current implementation is mere syntactic sugar for helper functions that Agda generates internally. That’s a sensible idea in principle but in practice it makes the feature very fragile and unpredictable. Having a proper notion of with in the internal representation of functions would make error messages easier to understand and the implementation easier to maintain.

8. Parametrized modules without lambda-lifting

The current implementation of parametrized modules in Agda uses lambda-lifting, i.e. internally module parameters are just extra arguments to each function call. When you open a module with specific parameters given, this means that those parameters are duplicated all through the internal syntax, causing slowdowns (sometimes exponentionally) and making it difficult to pretty-print terms in a way that the user recognizes. Having a proper notion of module parameters would make the implementation more efficient and easier to maintain.

(Yes, I realize that the last few improvements are basically “preserve more of the surface syntax in the internal syntax”, which has the disadvantage of increasing the size and complexity of the internal syntax. But I believe the tradeoff is worth it, as demonstrated by the many issues we’re having with the current desugarings.)

9. A clear separation between constraint solving and conversion checking

This might be surprising to those familiar with the implementation of the Rocq Prover (formerly known as Coq), but Agda has a single implementation of conversion checking and unification for constraint solving. This has the advantage of reducing code duplication, but the big disadvantage that the logic of constraint solving gets tangled up with the logic of conversion checking. This is bad because conversion checking needs to be part of the trusted core, but constraint solving doesn’t. A clearer separation between the two implementations would improve maintainability and trustworthiness of Agda.

10. Reduced reliance on MTL-style type classes

The Agda implementation makes heavy use of MTL-style type classes (it defines 43 in total, in addition to the existing ones from mtl). This gives a great degree of control over abstractions and even enables effect polymorphism which we actually use in some cases. Sadly however, GHC does not seem to be well equipped to optimize code in this style, especially across modules. At the moment we mitigate this by relying on two compiler flags (-fexpose-all-unfoldings and -fspecialise-aggressively) but this leads to slow compile times, large binaries, and does not actually solve the problem in all cases. After a few discussions with the Agda developer team, I’ve become convinced that it would be better to move away from MTL-style type classes and towards a more lightweight way of enforcing some abstraction barriers that does not impact compilation or runtime as much.

Ten improvements to Agda's implementation