Back in 2008, Swierstra's Functional Pearl Data Types a la Carte showed how to construct the following data structure:
data Expr = Val Int | Add Expr Expr
from simple and re-usable individual components:
newtype Val e = Val Int deriving Functor
data Add e = Add e e deriving Functor
type Expr = Fix (Val :+: Add)
(Fix
is available in the recursion-schemes
package and :+:
is available from GHC.Generics
)
The Good
This construction allows to write clean and re-usable modular code. For example we can implement evaluation of expressions like this:
class Functor f => Eval f where
evalAlgebra :: f Int -> Int
Fix expr) = evalAlgebra (fmap eval expr)
eval (
instance (Eval f, Eval g) => Eval (f :+: g) where
L1 x) = evalAlgebra x
evalAlgebra (R1 y) = evalAlgebra y
evalAlgebra (
instance Eval Val where
Val x) = x
evalAlgebra (
instance Eval Add where
Add x y) = x + y evalAlgebra (
The beautiful part, which makes this a functional pearl, is that the
Eval
instances of Val
and Add
are
usable not just for the Expr
type defined above, but also
for any other expression language which re-uses them, such as:
type Expr2 = Fix (Val :+: Add :+: Mul :+: Pow :+: Etc)
The Ugly
How would we represent an expression, such as 1 + 2
in
the type defined above?
The simple way to do it is
Fix (R1 (Fix (L1 (Val 1)) `Add` Fix (L1 (Val 2))))
.
The usages of Fix
, R1
and L1
are cumbersome, so to make things easier Swiestra showed how to write
the expression as val 1 `add` val 2
using an additional
type-class and lifting functions per constructor. This makes writing
terms convinient, but a few problems remain unsolved:
- For expression types with many possible constructors, we pay a performance penalty for this representation, when compared to a single "flat expression algebra"
Expr
'sShow
is very cumbersome
Bringing the benefits of Data Types a la Carte to simpler representations
A few years after the paper, in 2011, DefaultSignatures
were
added in GHC 7.2. These enable a more direct construction of
Expr
:
data Expr a = EVal (Val a) | EAdd (Add a)
deriving (Generic1, Functor, Eval)
Of note here is the derivation of Eval
(using
DeriveAnyClass
). Making Eval
derivable is a
simple matter of adding default method implementations in the
Eval
class along with two trivial instances for types from
GHC.Generics
:
class Functor f => Eval f where
evalAlgebra :: f Int -> Int
evalAlgebra :: (Generic1 f, Eval (Rep1 f)) => f Int -> Int
default= evalAlgebra . from1
evalAlgebra
deriving newtype instance Eval f => Eval (M1 i c f)
deriving newtype instance Eval f => Eval (Rec1 f)
This gives us a few benefits:
- No performance penalty when we have many constructors
- Chains of
L1
andR1
are replaced with a single constructor with a suitable name, soShow
is slightly more sensible and we can write a term by hand without checking for the order of the constructors
The Bad
While Data Types a la Carte allows us to re-use individual components
such as Val
and Add
in expression types, those
are limited to be simple recursive types.
In practice, programming language ASTs tend to consist of multiple mutually-recursive types, and Data Types a la Carte's approach can't help us express those. I'll expand on how to extend its approach for more complicated ASTs in a future post.
Discussion: