Rewriting type theory

Posted by Jesper on October 30, 2019

This is the second in a series of three blog posts on rewrite rules in Agda. If you haven’t already, you can read the first post here. In that post, I gave several examples of how you can use rewrite rules in Agda to make your life easier and experiment with new extensions to type theory. This post goes into the nitty-gritty details of how to rewrite rules work in general, and how they interact with several other features of Agda.

Instead of starting with a full-fledged language like Agda, I first describe a small core language with the ability to declare new rewrite rules. After that, I’ll extend it with other features that you are used to from Agda, such as datatypes, record types with eta-equality, irrelevance, parametrized modules, implicit arguments and metavariables, and universe level polymorphism.

I apologize in advance if this post is a little dry compared to the previous one. However, I think it’s worth it to specify the algorithms we use in detail to give a deeper understanding of a feature. And when there’s ever some weird behaviour in the implementation of rewrite rules, the spec helps to determine whether it is a bug in the implementation or actually the intended semantics.

Rewriting type theory

Without further ado, let’s define our core rewriting type theory.

Syntax

The syntax has five constructors: variables, function symbols, lambdas, pi types, and universes.

\[ \begin{array}{lcll} u, v, A, B &::=& x\; \bar{u} & \text{(variable applied to arguments)} \\ & | & \fun{f}\; \bar{u} & \text{(function symbol applied to arguments)} \\ & | & \lambda x.\, u & \text{(lambda abstraction)} \\ & | & (x : A) \rightarrow B & \text{(dependent function type)} \\ & | & \ty{i} & \text{(\(i\)th universe)} \\ \end{array} \]

As in the internal syntax of Agda, there is no way to represent a \(\beta\)-redex in this syntax. Instead, substitution \(u\sigma\) is defined to eagerly reduce \(\beta\)-redexes on the fly.

Contexts are right-growing lists of variables annotated with their types.

\[ \begin{array}{lcll} \Gamma, \Delta &::=& \emp & \text{(empty context)} \\ & | & \Gamma (x : A) & \text{(context extension)} \\ \end{array} \]

Patterns \(p,q\) share their syntax with regular terms, but must satisfy some additional restrictions. To start with, the only allowed patterns are unapplied variables \(x\) and applications of function symbols to other patterns \(\fun{f}\; \bar{p}\). This allows us for example to declare rewrite rules like \(\fun{plus}\; x\; \con{zero} \rew x\) and \(\fun{plus}\; x\; (\con{suc}\; y) \rew \con{suc}\; (x + y)\).

Declarations

In this simple core language, there are two kinds of declarations: function symbols (corresponding to a postulate in Agda) and rewrite rules (corresponding to a postulate + a {-# REWRITE #-} pragma).

\[ \begin{array}{lcll} d &::=& \fun{f} : A & \text{(function symbol)} \\ & | & \forall \Delta. \fun{f}\; \bar{p} : A \rew v & \text{(rewrite rule)} \\ \end{array} \]

Since this is a blog post and not a full paper, I omit the typing and conversion rules. The only relevant bit are the rules for checking the validity of a rewrite rule:

The first restriction is necessary because otherwise reduction would introduce variables that are not in scope, breaking well-scopedness of expressions (if you do not agree this is reasonable, you’re reading the wrong blog). Without the second restriction, it would be easy to define rewrite rules that break type preservation (though just well-typedness of rewrite rules is not sufficient, see the next post in this series). It is possible to go without the third restriction, but in practice this would mean that the rewrite rule would never be applied.

Reduction and matching

To reduce a term \(\fun{f}\; \bar{u}\), we look all the rewrite rules of \(\fun{f}\) to see if any of them apply (we assume a global signature \(\Sigma\) containing all declarations).

\[ \ru{(\forall \Delta. \fun{f}\; \bar{p} : A \rew v) \in \Sigma \qquad [\match {\bar u} {\bar p}] \Rightarrow \sigma} {\fun{f}\; \bar u \red v\sigma} \]

Matching a term \(u\) against a pattern \(p\) produces (if it succeeds) a substitution \(\sigma\) and is written \([\match u p] \Rightarrow \sigma\). In contrast to the first-match semantics of clauses of a regular definition by pattern matching, all rewrite rules are considered in parallel, so there is no need for separate notion of a failing match.

\[ \begin{gather*} \ru{}{[\match u x] \Rightarrow [\subst u x]} \qquad % \ru{[\match {\bar u} {\bar p}] \Rightarrow \sigma}{[\match {\fun{f}\; \bar u} {\fun{f}\; \bar p}] \Rightarrow \sigma} \\[2ex] % \ru{u \red^* v \qquad [\match v p] \Rightarrow \sigma}{[\match u p] \Rightarrow \sigma} \\[2ex] % \ru{}{[\match \emp \emp] \Rightarrow []} \qquad % \ru{[\match u p] \Rightarrow \sigma_1 \qquad [\match {\bar u} {\bar p}] \Rightarrow \sigma_2}{[\match {u;\bar u} {p;\bar p}] \Rightarrow \sigma_1 \uplus \sigma_2} \end{gather*} \]

Matching a term against a pattern variable produces a substitution that assigns the given value to the variable, while matching an expression \(\fun{f}\; \bar u\) against a pattern \(\fun{f}\; \bar{p}\) recursively matches the arguments \(\bar{u}\) against the patterns \(\bar{p}\), combining the results of each match by taking the disjoint union \(\sigma_1 \uplus \sigma_2\). Matching can also reduce the term being matched, which means matching and reduction are mutually recursive.

Higher-order rewriting

With the basic set of rewrite rules introduced in the previous section, you can already declare a surprisingly large number of rewrite rules for first-order algebraic structures. From the examples of the previous post in this series, it handles all of example 1, rules map-fuse and map-++ from example 2, all of example 3, rule //-beta from example 4, rules catch-true, catch-false, and catch-exc from example 5, and the rules dealing with Bool in example 6.

Most of the examples that don’t work yet are excluded because they use λ and/or function types in the pattern of a rewrite rule. This brings us to the issue of higher-order rewriting. See also issue #1563 on the Agda bug tracker for more examples where higher-order rewrite rules are needed. Although the metatheory of higher-order rewrite systems is notoriously complex, merely implementing it is much easier.

To support higher-order rewriting, we extend the pattern syntax beyond variables and function symbols. The following patterns are supported:

During matching it is important to keep the (rigid) bound variables separate from the (flexible) pattern variables. For this purpose, the matcher keeps a list \(\Phi\) of all rigid variables. This list is not touched by any of the previous rules, but any variables bound by a \(\lambda\) or a function type are added to it.

\[ \begin{gather*} \ru{\Phi, x \vdash [\match u p] \Rightarrow \sigma}{\Phi \vdash [\match {\lambda x.\, u} {\lambda x.\, p}] \Rightarrow \sigma} \\[2ex] % \ru{\Phi \vdash [\match A p] \Rightarrow \sigma_1 \qquad \Phi, x \vdash [\match B q] \Rightarrow \sigma_2}{\Phi \vdash [\match {(x : A) \rightarrow B} {(x : p) \rightarrow q}] \Rightarrow \sigma_1 \uplus \sigma_2} \\[2ex] % \ru{x \in \Phi \qquad \Phi \vdash [\match {\bar u} {\bar p}] \Rightarrow \sigma}{\Phi \vdash [\match {x\; \bar u} {x\; \bar p}] \Rightarrow \sigma} \\[2ex] \end{gather*} \]

Note the strong similarity between the third rule and the rule for matching a function symbol \(\fun{f}\). This is not a coincidence: both function symbols and bound variables act as rigid symbols that can be matched against.

The rules above extend the pattern syntax to allow for bound variables in patterns, and allow for rules such as map-id (i.e. map (λ x → x) xs ≡ xs). However, they do not yet constitute true higher-order rewriting (such as in rules raise-fun, cong-Π, and cong-λ). For this we also consider pattern variables applied to arguments.

As is well known, allowing arbitrary patterns as arguments to pattern variables makes matching undecidable (!), so it is customary to restrict patterns to Miller’s pattern fragment, where pattern variables must be applied to distinct, bound variables. Here is the general rule for matching against a pattern variable in the Miller fragment:

\[ \ru{FV(v) \cap \Phi \subseteq \bar{y}} {\Phi \vdash [\match v {x\; \bar{y}}] \Rightarrow [\subst {(\lambda \bar{y}. v)} x]} \]

Since all the arguments of \(x\) are variables, we can construct the lambda term \(\lambda \bar{y}.\, v\). To avoid having out-of-scope variables in the resulting substitution, we check that the free variables in \(v\) are included in \(\bar{y}\).

Eta-equality and record types

If you’ve been paying close attention, you may have noticed a flaw in the matching for \(\lambda\)-patterns: it does not respect \(\eta\)-equality. With \(\eta\)-equality for functions, any term \(u : (x : A) \rightarrow B\; x\) can always safely replaced with \(\lambda x.\, u\; x\), so it should also match a pattern \(\lambda x.\, p\). In this case fixing the problem is not hard since we can \(\eta\)-expand on the fly whenever we match something against a \(\lambda\)-pattern:

\[ \ru{\Phi, x \vdash [\match {u\; x} p] \Rightarrow \sigma} {\Phi \vdash [\match u {\lambda x.\, p}] \Rightarrow \sigma} \]

Unfortunately this is not enough to deal with \(\eta\)-equality in general. It’s possible that the pattern is underapplied as well, e.g. when we match a term of type \((x : A) \rightarrow B\; x\) against a pattern \(\fun{f}\; \bar{p}\) or \(x\; \bar{p}\).

To respect eta equality for functions and record types, we need to make matching type-directed! We also need contexts with the types of the free and bound variables. Thus we extend the matching judgement to \(\Gamma;\Phi \vdash [\match {u : A} {p}] \Rightarrow \sigma\) where \(A\) is the type of \(u\) (note: not necessarily the same as the type of \(p\)!) and \(\Gamma\) and \(\Phi\) are now contexts of pattern variables and bound variables respectively.

The type information is used by the matching algorithm to do on-the-fly \(\eta\)-expansion of functions whenever the type is a function type:

\[ \ru{\Gamma;\Phi(x:A) \vdash [\match {u\; x : B} {p\; x}] \Rightarrow \sigma} {\Gamma;\Phi \vdash [\match {u : (x : A) \rightarrow B} p] \Rightarrow \sigma} \]

Here \(p\; x\) is only defined if the result is actually a pattern, otherwise the rule cannot be applied. We may also reduce the type:

\[ \ru{A \red^* A' \qquad \Gamma;\Phi \vdash [\match {u : A'} p] \Rightarrow \sigma} {\Gamma;\Phi \vdash [\match {u : A} p] \Rightarrow \sigma} \]

\(\eta\) for records

Agda has \(\eta\)-equality not just for function types, but also for record types. For example, any term \(u : A \times B\) is definitionally equal to \((\field{fst}\; u , \field{snd}\; u)\). Since \(\eta\)-equality of records is a core rule of Agda, we extend the matching algorithm to deal with it (see issue #2979 and issue #3335). Luckily, since we now have a type-directed matching algorithm, it is easy to handle this rule.

Let \(\fun{R} : \ty{i}\) be a record type with fields \(\field{\pi_1} : A_1\), \(\ldots\), \(\field{\pi_n} : A_n\). Since records can be dependent, each type \(A_i\) may depend on the previous fields \(\field{\pi_1}\), \(\ldots\),\(\field{\pi_{i-1}}\). We have the following matching rule:

\[ \ru{\Gamma;\Phi \vdash [\match {\field{\pi_i}\; u : A_i'} {\field{\pi_i}\; p}] \Rightarrow \sigma \quad (i=1\cdots n)}{\Gamma;\Phi \vdash [\match {u : \fun{R}} p] \Rightarrow \sigma} \]

In this rule, the type \(A_i'\) is equal to \(A_i[\subst {\field{\pi_1}\; u} {\field{\pi_1}}, \cdots, \subst {\field{\pi_{i-1}}\; u} {\field{\pi_{i-1}}}]\).

In the case where \(n=0\), this rule says that a term of the unit record type \(\fun{\top}\) (with no fields) matches any pattern. So the matching algorithm even handles the notorious \(\eta\)-unit types!

Non-linearity and non-patterns

Sometimes it is useful to define rewrite rules with non-linear patterns, i.e. where a pattern variable occurs more than once. As an example, this allows us to postulate an equality proof trustMe : (x y : A) → x ≡ y with a rewrite rule trustMe x x ≡ refl. This can be used in a similar way to Agda’s built-in primTrustMe. Another example where non-linearity is used is the rule transportR-refl from example 4 in my previous post (it also needs irrelevance for Prop, see issue #3525 for more details).

Non-linear matching is actually an instance of a more general pattern I call a non-pattern. A non-pattern is an arbitrary term that matches another term if the terms are definitionally equal. Non-patterns allow embedding of arbitrary terms inside a pattern, but they cannot bind any variables. So even though we allow non-patterns, each pattern variable used in the rewrite rule still has to occur at least once in a pattern position.

Non-patterns are similar to inaccessible patterns (aka dot patterns in Agda) used in dependent pattern matching, with the important difference that inaccessible patterns are assumed to match whenever the rest of the pattern does, while non-patterns have to be checked.

For both non-linear patterns and non-patterns, the matching algorithm needs to decide whether two given terms are definitionally equal. This means reduction and matching are now mutually recursive with conversion checking.

Describing the full conversion checking algorithm of Agda would take us too far for this blog post, so let’s assume we have a (type-directed!) conversion judgement \(\Gamma \vdash u = v : A\). We extend the matching algorithm to also output a set of constraints \(\Psi = \{ \Phi_i \vdash u_i \ceq p_i\; |\; i = 1\cdots n \}\). The new judgement form of matching is now \(\Gamma;\Phi \vdash [\match {v : A} p] \Rightarrow \sigma ; \Psi\) (somehow I always end up with these crazy judgements with way too many arguments). We extend the matching algorithm with the ability to postpone a matching problem:

\[ \ru{} {\Gamma;\Phi \vdash [\match {v : A} p] \Rightarrow []; \{ \Phi \vdash v \ceq p : A \}} \]

All other rules just gather the set of constraints, taking the union whenever matching produces multiple sub-problems. When matching concludes, we check that all constraints actually hold before applying the rewrite rule:

\[ \ru{\begin{array}{c}\fun{f} : \Gamma \rightarrow A \in \Sigma \qquad (\forall\Delta. \fun{f}\; \bar{p} : B \rew v) \in \Sigma \\ [\match {\bar u : \Gamma[\bar u]} {\bar p}] \Rightarrow \sigma;\Psi \qquad \forall (\Phi \vdash v \ceq p : A) \in \Psi.\; \Phi \vdash v = p\sigma : A \end{array} } {\fun{f}\; \bar u \red v\sigma} \]

When checking a constraint we apply the final substitution \(\sigma\) to the pattern \(p\) but not to the term \(v\) or the type \(A\). This makes sense because the term being matched does not contain any pattern variables in the first place (and neither does its type).

A quick note on the implementation side: to actually change Agda to make the matching depend on conversion checking took quite some effort. The reason for this difficulty was that reduction and matching are running in one monad ReduceM, while conversion was running in another monad TCM (short for ‘type-checking monad’). In the new version after my refactoring, the conversion checker is now polymorphic in the monad it runs in. This means the same piece of code implements at the same time a pure, declarative conversion checker and a stateful constraint solver. I think this usage of effect polymorphism is pretty cool, and I don’t know many other production-ready languages besides Haskell where this would be possible.

Rewriting datatypes and constructors

Another important question is how rewrite rules interact with datatypes such as Bool, Nat, and _≡_. Can we simply add rewrite rules to (type and/or term) constructors? It turns out the answer is actually a bit more complicated.

If we allow rewriting of datatype constructors, you could (for example) postulate an equality proof of type Nat ≡ Bool and register it as a rewrite rule. However, this would mean \(\con{zero} : \data{Bool}\), violating an important internal invariant of Agda that any time we have \(\con{c}\; \bar{u} : \data{D}\) for a constructor \(\con{c}\) and a datatype \(\data{D}\), \(\con{c}\) is actually a constructor of \(\data{D}\) (see issue #3846). This is very unsafe even for the low standards of rewrite rules. For this reason, it’s not allowed to have rewrite rules on datatypes or record types. Sorry!

For constructors of datatypes there is no a priori reason why they cannot have rewrite rules attached to them. This would actually be useful to define a ‘definitional quotient type’ where some of the constructors may compute.

Unfortunately, there is another problem: internally, Agda does not store the constructor arguments corresponding to the parameters of the datatype. For example, the constructors [] and _∷_ of the List A type do not store the type A as an argument. This is actually really important for efficiency! However it means that rewrite rules on constructors cannot match against arguments in those positions, or bind pattern variables in them.

Currently, Agda enforces this restriction by requiring that for a rewrite rule on a constructor, the parameters need to be fully general, i.e. they must be distinct variables (see issue #3211). But this is not quite satisfactory yet, as shown by issue #3538, so if anyone knows a better criterion it would be welcome!

Irrelevance and Prop

An important feature of Agda that is (in my opinion) underused is definitional irrelevance, which comes in the two flavours of irrelevant function types .A → B and the universe Prop of definitionally proof-irrelevant propositions.

For rewrite rules with irrelevant parts in their patterns, we do not want matching to ever fail because this would mean a supposedly irrelevant term is not actually irrelevant. However, it should still be allowed to bind a variable in an irrelevant position, since we might want to use that variable in (irrelevant positions of) the right-hand side (see issue #2300). This means in irrelevant positions we allow:

  1. pattern variables \(x\; \bar y\) where \(\bar y\) are all the bound variables in scope, and

  2. non-patterns \(u\) that do not bind any variables.

Because of irrelevance, neither of these will ever produce a mismatch.

Together with the ability to have non-linear patterns, this allows us to have cool rewrite rules such as transportR-refl : transportR P refl x ≡ x where transportR : (P : A → Set ℓ) → x ≐ y → P x → P y and x ≐ y is the equality type in Prop. The constructor refl here is irrelevant, so this rule does not actually match against the constructor refl! Instead, Agda checks that the two arguments x and y are definitionally equal, and apply the rewrite rule if this is the case.

Universe level polymorphism

Universe level polymorphism allows Agda programmers to write definitions that are polymorphic in the universe level of a type parameter. Since the type Level of universe levels is a first-class type in Agda, it interacts natively with rewrite rules: patterns can bind variables of type Level just as any other type. This allows us for example to define rewrite rules such as map-id that work on level-polymorphic lists.

The type Level also supports two operations lsuc : Level → Level and _⊔_ : Level → Level → Level. These operations have a complex equational structure: _⊔_ is associative, commutative, and idempotent, and lsuc distributes over _⊔_, just to name a few of the laws (the actual implementation of the normalization of levels is worth a look). This causes trouble for the matching used for rewrite rules: how would one even determine whether a given level matches a ⊔ b when _⊔_ is commutative? Issue #2090 and issue #2299 show some of the things that would go wrong. For this reason it is not allowed to have rewrite rules that match against lsuc or _⊔_ (i.e. expressions containing these symbols are treated as non-patterns).

This restriction on patterns of type Levels is seems reasonable enough, but it causes trouble for certain rewrite rules. In particular, it is often not satisfied by rewrite rules that match on function types — like the cong-Π rule we used in the encoding of observational type theory last time, or like the problem described in issue #3971. The problem is that if A : Set ℓ₁ and B : Set ℓ₂, then the function type (x : A) → B has type Set (ℓ₁ ⊔ ℓ₂), so there is no sensible position to bind the variables ℓ₁ and ℓ₂.

To allow rewrite rules such as cong-Π, we need to think a little out of the box. It turns out that in the internal syntax of Agda, function types (x : A) → B are annotated with the sorts of A and B. So the ‘real’ function type of Agda looks more like (x : A : Set ℓ₁) → (B : Set ℓ₂). This means that if we allow rewrite rules to bind pattern variables in these hidden annotations, we are saved! The matching rule for function types now becomes:

\[ \ru{\begin{array}{c} \Gamma;\Phi \vdash [\match {A : \ty{\ell_1}} p] \Rightarrow \sigma_1; \Psi_1 \qquad \Gamma;\Phi \vdash [\match {\ell_1 : \level} q] \Rightarrow \sigma_2; \Psi_2 \\ \Gamma;\Phi(x:A) \vdash [\match {B : \ty{\ell_2}} r] \Rightarrow \sigma_3; \Psi_3 \qquad \Gamma;\Phi [\match {\ell_2 : \level} s] \Rightarrow \sigma_4; \Psi_4 \end{array} } {\begin{array}{l} \Gamma;\Phi \vdash [\match {(x : A : \ty{\ell_1}) \rightarrow (B : \ty{\ell_2})} {(x : p : \ty{q}) \rightarrow (r : \ty{s})}] \\ \qquad \Rightarrow (\sigma_1 \uplus \sigma_2 \uplus \sigma_3 \uplus \sigma_4);(\Psi_1 \cup \Psi_2 \cup \Psi_3 \cup \Psi_4) \end{array} } \]

That’s quite a mouthful, but sometimes that’s the price you have to pay to talk about real systems rather than toy examples!

Metavariables and constraint solving

To automatically fill in the values of implicit arguments, Agda inserts metavariables as their placeholders. These metavariables are then solved during typechecking by the constraint solver. As a rule, whenever you add a new feature to Agda, eventually it will cause something to go wrong in the constraint solver. This is certainly the case for rewrite rules. I’m in no position to give a full account of Agda’s constraint solver here, but let me discuss the most important ways it is impacted by rewrite rules.

Blocking tags

To do proper constraint solving, we need to know when a reduction is blocked on a particular metavariable. Usually it is possible to point out a single metavariable, but this is no longer the case when rewrite rules are involved:

Currently, the Agda implementation uses only an approximation of the set of metavariables it encounters (i.e. only the first metavariable encountered). This is not too harmful because the current implementation of Agda is anyway quite generous in how often it retries to solve sleeping constraints. If in the future Agda would be changed to be more careful in letting sleeping constraints lie (which would be a good thing to do!), a more precise tracking of blocking metavariables would also be desirable.

Pruning and constructor-like symbols

When adding new rewrite rules, we also keep track of what symbols are constructor-like. This is important for the pruning phase of the constraint solver. For example, let’s say the constraint solver is trying to solve an equation \(X \ceq Y\; (\fun{f}\; x)\). Since the metavariable \(X\) does not depend on the variable \(x\), the constraint solver attempts to prune the dependency of \(Y\) on \(x\). If \(\fun{f}\) is a regular postulate without any rewrite rules, there is no way that \(Y\) could depend on \(\fun{f}\; x\) without depending on \(x\), so the dependency of \(Y\) on its first argument is pruned away. However, if there is a rewrite rule where \(f\) plays the role of a constructor — say a rule \(\fun{g}\; (\fun{f}\; y) \rew \con{true}\) — then the assignment \(X := \con{true}\) and \(Y := \lambda y.\, \fun{g}\; y\) is a valid solution to the constraint where \(Y\) does depend on its argument, so it should not be pruned away.

Parametrized modules and where blocks

In one sense, parametrized modules in Agda can be thought of as \(\lambda\)-lifting all the definitions inside the module: if a module with parameters \(\Gamma\) contains a definition of \(\fun{f} : A\), then the real type of \(\fun{f}\) is \(\Gamma \rightarrow A\). But this does not quite capture the intuition that definitions inside a parametrized module should be parametric in the parameters. So in a different sense module parameters are rigid symbols more alike to postulates than variables. For this reason, module parameters play a double role on the left-hand side of a rewrite rule:

This intuition of module parameters as rigid symbols also applies to Agda’s treatment of where blocks, which are nothing more than modules parametrized over the pattern variables of the clause (you can even give a name to the where module using the module M where syntax!) Here a ‘local’ rewrite rule in a where block should only apply for the specific arguments to the function that are used in the clause, not those of a recursive call (see issue #1652 for an example).

Conclusion

This post is my attempt at documenting all the weird interactions that come up when you try to integrate user-defined rewrite rules into a general-purpose dependently typed language. I also wanted to document the solutions that I came up with along the way. Of course, this is by no means the only way to do things, nor the best one. But at least it is a way that I found to work, so I thought it would be worth writing it down. I know at least some people who are using rewrite rules in exciting ways I didn’t anticipate, which makes me very glad!

This post was about the technology of how to add rewrite rules to Agda or similar languages. Using these rewrite rules is essentially building your own type theory, which means you have to do your own meta-theory to make sure everything is safe (for whatever notion of safe you like). But there are many use cases of rewrite rules that feel like they shouldn’t break any of the usual properties we expect from an Agda program. Can we carve out a usable fragment of rewrite rules that is guaranteed to do no harm? Stay tuned for the next and final post in this series!