I came across Structure and Interpretation of Computer Programs (SICP) a couple of years ago.
By then I had been working as a coder for some years - most of it as a not too knowledgeable but self-confident Java enterprise developer, mastering various enterprise frameworks. Although I spent the last two years of that time learning Scala, and trying to forget where I’d come from. This was the first time I was somewhat exposed to functional programming.
I had some understanding of what functional programming means (not that I have significantly more now), and of the relevance of things like referential transparency or functions being values. But to me SICP was a kind of revelation, and for one particular reason. I mean it was a revelation in many different ways of course: it deepened my understanding of object oriented programming, mutable state, and thanks to it, these puzzle pieces all started to form one big picture to me. Just read Brian Harvey’s praise about why SICP is important and fundamental. But still the most important aspect of it was the idea of computation as data.
I have to mention here, that I didn’t actually read the book, but I watched the recording of the MIT course from 1986 on Youtube. SICP at that time was the introductory computer science course at MIT, tutored by Harold Abelson and Gerald Jay Sussman, the two authors of the book. I highly recommend the course itself, and not only because of the material and the genial presentation of it, but also because it is a hilarious epitome of the 80’s - just look at those clothes, glasses and hairstyles in the audience.
In the last couple of months I have been bumping into this idea again and again, in blog posts, conference talks (last time at this year’s flatMap conference in Oslo), so in this post I’m attempting to pick breadcrumbs dropped by other people, much smarter than me, and recite the little that I’ve managed to grasp of the concept of computation as data - mainly its how’s. And I will focus on one method in particular, that supports this concept, and that I have kept coming across recently: the free monad. I am still a not too knowledgeable coder, so beware that it is easily possible that some or all of this is misunderstood or misinterpreted. Also I haven’t even tried it yet (“in production”), it is still more like a fragile idea, just taking shape in my head, supported by matchsticks here and there. Actually writing down everything that I understand of free monads and their relation to the idea of computation as data, will help me a lot, and in case I’m getting something wrong, you can correct me in comments below.
Computation as data
The key idea here is basically the separation of the data, describing a computation (the representation of the computation), from the actual execution of that computation, when it gets interpreted. The data only captures the structure of the computation, without any meaning assigned to it yet, open to interpretation. It is the interpreter that executes it, by assigning meaning.
Let me give an example:
The elements of this computation are three values and two applications of a (binary) operation. The structure of it is:
This is the structure of the computation, independently of what meaning we assign to
+ - it could be simple addition of integers, but it could also be some other operation on objects referenced by those numbers, or whatever. So what could be a language that lets us build computations like this one? In Scala we could try an algebraic data type, like:
And we could represent the computation above as the following abstract syntax tree:
An interpreter of this language is a function from expressions written in the language, to some values:
The choice of the return type (i.e. the lower-level language that our interpreter interprets to) being
Int is somewhat arbitrary, and we can abstract over it. But in order to do that, we need additional parameters: a function that transforms
Int values to values of our return type, and another that gives meaning to the addition operation on values of the return type. With the addition of these, it becomes quite obvious how our interpreter is actually a fold:
A fold is an operation which replaces every node of an abstract syntax tree with a function application. Or, in other words, fold replaces each instruction of a program, with instructions of another language. Just think of
foldRight on a singly linked list, which takes a zero value and a function, and replaces every cons in the list with the application of that function, and nil with the zero value. This is actually a very important point, and I will get back to it very soon.
But first, we have another opportunity here for abstraction, as the type of values wrapped in
Value seems arbitrary too - it could be anything, not just
Int -, so let’s abstract over this. We need to modify our data type first, by adding a type parameter to it:
And thus our interpreter changes too:
Now we have an interpreter that not only can be used for our original example:
But we can easily reuse it for example for interpreting abstract syntax trees representing string concatenation computations too:
It happens to be very similar to the
Int case (it also uses the + sign), but the generic pattern is still visible.
It’s not a coincidence that I’ve chosen these examples: both addition with integers, and concatenation with strings form a monoid. A monoid is a structure that consists of a set (strings or integers in the above examples), an associative binary operation on the elements of the set (addition or concatenation in our examples), and a special unit or identity element, which we call here zero, governed by the identity law (x + zero = zero + x = x). The structure that has no identity element, but only the other two, is called a semigroup.
Let’s extend our data type to cover all monoid operations. In order to do that, we first need to add another case to it:
And also we need to adjust our
fold accordingly. But before we do that, let’s rearrange this data type a little bit, using the associativity and identity laws of monoids. The identity law means that zero elements can be safely inserted in or removed from a computation, without changing its meaning. And the associativity law lets us safely change the order of the steps, without changing the meaning, because:
If we append a zero at the end of every computation, and always associate everything to the right, then we can be sure that there’s always a value on the left side of an addition, and that there’s no value on the right side of it, but only another addition or a zero. So let’s turn the original example from:
and the abstract syntax tree to:
This convention just made the
Value case of the data type obsolete. Let’s change it accordingly:
And now we can fix our
And finally, we need to adjust the two interpreters we defined earlier, by providing the zero elements:
Expr is an algebraic data type that is generic enough to represent the structure of any monoid computation. It is a so-called free object, specifically a free monoid. It is free, because it can turn any type
A into a monoid, without any constraints on that type. (It is not called free because you get a monoid for free, but you do get a monoid for free.)
But wait a second,
Expr is identical to:
fold is identical to
foldRight. A singly linked list is a free monoid. Meaning that a list is basically an abstract syntax tree that can represent any monoid computation:
The cons case of the free monoid represents a step in the computation, by consisting of a current value on the left, and the rest of the computation on the right. And the nil case represents the end of the computation. And, as I already mentioned above,
foldRight of a list, that replaces every
Cons with an application of a function, and the
Nil with some zero value, is the interpreter for this abstract syntax tree. It replaces every instruction in the AST with instructions of some other language.
Before we generalise this concept further, so that we can represent a much wilder variety of computations with a data type, let’s look at the connection between
List and monoids a bit closer. We can represent the monoid structure in Scala with the following type class:
To make it clear, the meaning of
B forming a monoid with some operation, is that there exists an instance of
Monoid[B], like an evidence. But it is
B that forms (or is) the monoid. For example integers form a monoid, and the evidence is:
If there exists an
A => B function into our monoid
B, then - as we discussed -
List[A] can represent any expression in
given the function:
That is what basically our
fold function stands for, and we can write another version of it, that makes it even more explicit (I changed the language from our type
Applying it to our earlier examples:
results in 10, that is the result of
3 + 2 + 5.
foldMap is an interpreter for the program
List(“the”, “id”, “monad”) in the language of
List[String], that interprets to the lower-level language of
Not only can be
foldMap implemented in terms of
fold, but it’s possible the other way round too, using the fact that
B => B functions (endomorphisms - functions that map from a type to the same type) form a monoid with function composition:
We can implement
foldRight like this:
curried method of the
add function turns it from
(A, B) => B to
A => B => B, where
B => B is an endomorphism, as is the return value of
foldMap in this case. If we pass zero to the return value, it returns the actual result.
Another important thing to note is that if we have a way to get from
A, then we also have a way to get there from
List[List[List[A]]] and any arbitrarily deep nesting of
Lists. We can collapse these lists starting from the inside. This is obvious from how a monoid is basically a way to collapse a list. We could also write this as we have a way to get from
Cons(a, Nil) provides a method for getting from
This can also be expressed by rewriting
foldRight without zero:
and passing our
Cons constructor to it. To make it more obvious that it just has the proper type, let me rewrite it a bit:
so after that:
results in a function
f that has the type
List[A] => List[A] => List[A], and feeding that function back to
results in a function
g of type
List[List[A]] => List[A] => List[A]. The pattern is clear: we can turn any arbitrarily deep nesting of
Lists into an endomorphism of
List[A]’s, and if we pass the empty list to the endomorphism, we get our flattened list.
This almost feels like another monoid, a “metamonoid”, where instead of having
A ’s and
B’s, we have (nested)
List[_]’s. The “metamonoid’s” associative operation could be the method of getting from
List[A], and zero could be the method of getting from
List[A]. Let’s call these
unit is the identity element for the
join operation, because it satisfies the identity law(s). And
join also obeys the associativity law, as it doesn’t matter in which order we collapse nested
This “metamonoid” structure can be generalised by abstracting over
List, and using a type constructor
F[_] (or more precisely functor): it is called a monad. (Functors are type constructors that have a map function to enable application of functions inside. In category theory, functors are structure preserving mappings from one category to another.) This way our objects are now functors, and our arrows are natural transformations, instead of simple functions.
Monads can be represented in Scala with a simple type class. (I must note here, that the name of
unit is unfortunate, as “unit” is a heavily overloaded term, and we are only going to use it when we focus on monad laws, and then it will get replaced with something more unique.):
and instances must obey these laws:
We can construct another monad combinator from these three, that makes these laws more obvious and resemblant of the monoid laws:
Using compose we can restate the laws:
There are many different monads, all with different meanings assigned to
unit. In case of
List, the monad of monoids,
join means that we can collapse a nested list by appending the inner lists,
unit means the we can cons a value with
map is needed by
foldMap, so that we can apply a mapping from
B. But these methods of the monad structure have different meanings for
Writer etc. Still they share the generic pattern of encoding various, potentially effectful computations in
Now the question is if we can find a data type, that is the same for monad, as what
List is for monoid: generic enough to represent the structure of any monadic computation, so that we can turn any type constructor into a monad. And of course, we are also looking for the equivalent of our
foldMap function, that can collapse this structure to some value, or in other words, interpret a computation represented by this AST to some other language.
Let’s follow the same steps as for monoid. In case of monad, we not only have some type
A, but also a type constructor
F[_], that we want to turn into a monad for free. So the type could be:
In case of the monoid we just turned its combinators into cases of an ADT, so let’s try the same for monad. As I mentioned, it would be unfortunate to use the name
Unit, and although we could instead use the conventional
Zero represented the end of the computation, i.e. the return case, but that’s not very unique either, so let’s just call it
Point now, for brevity:
For the next case, join, the convention seems to be using the name
Suspend, as this case basically represents a suspended computation, so I’m going to follow it. It could be represented by:
This represents the same kind of recursion that characterised our
Cons cases, just with functors. As
F[_] must be a functor, we don’t need a
Map case, we already have a method to apply functions inside
There is a way to get around this constraint of
F[_] having to be a functor, and build in the map feature into our data type, but for that I need to mention here, that monads are usually not defined in terms of
map, but more commonly by
bind. (Unit also runs by the name of return, pure, point, etc. Bind is called
flatMap in Scala.)
bind can be defined in terms of the other three:
But most of the time it’s done the other way round, and
map are defined in terms of
bind (if at all), which are conventionally the minimal set of monad combinators. We are using the name
point instead of
unit, as earlier:
So instead of having a
Suspend case for our
Free[F[_], A] data type, we add
Bind as a second case, that represents a (continued) step in the computation:
It is similarly recursive, and it actually resembles
Cons even more, by consisting of a current value on the left, and the rest of the computation on the right. And it doesn’t require
F[_] to be a functor. The free monad with the
Bind case is equivalent to the free monad with the earlier
Suspend case, but I won’t pretend to understand the nature of this equivalence. You can read Rúnar Bjarnason’s blog post explaining all this, including the CoYoneda (that can be used for turning any type constructor into a functor).
Let’s look at the final version of our data type:
We now need to construct an interpreter for AST’s built of
Bind’s, and similarly to how in the case of the free monoid, any computation in a monoid can be represented by a free monoid, given a function
A => B, which fact is expressed in the form of the
foldMap function, we can expect a similar
foldMap function here, that interprets a computation represented by the free monad into some other monad, given a natural transformation
F ~> G:
A natural transformation is just a mapping from one type constructor to the other:
and it is a central element of this whole concept. To show this, I’m going to give an example for a computation represented by the free monad, but before that, we need to quickly implement
bind of a
Free[F[_], A], so that it can be used as a monad. Here we must use the name
flatMap instead of
bind, because that’s what for comprehensions expect in Scala, and we implement these as methods of our data type. I’m also adding
foldMap here as a method, for convenience:
Seeing all these recursions, the question arises if all this is stack-safe. Rúnar Bjarnason has a paper with the title Stackless Scala With Free Monads, on the subject of a technique called trampolining, and its close connection to free monads, which technique is for coming around the limitations of tail call elimination in Scala. It’s basically about implementing
foldMap in a tail-recursive fashion. I’m not going to do that here, to keep it simple.
Finally we need a function that can turn any type constructor into an instance of
After having all these out of the way, we can start to build an example. Let’s create a simple language that can read values from the console, or lookup a string in some database by an id, or print something to the console:
A simple program of that language, that reads an id, then looks it up in the db, and finally prints the result, could be:
One of the big deals about monads is that this pattern is so ubiquitous, that languages tend to have special syntax for monadic computations, that lets us write out the binding of steps of computations as imperative-style sequences of instructions, that is very easy to read. That’s what for comprehensions are in Scala, or is Haskell’s do notation.
In our case, we don’t yield a value here, we are only concerned about the effects (printing to the console).
Output, thanks to the implicit
lift function, are automatically lifted into the
Free[Lang, A] monad, and the return value (of type
Free[Lang, Unit]) of this function represents the structure of this effectful computation. The abstract syntax tree looks something like:
foldMap, recursively peels this AST, layer by layer, like an onion. But it needs a natural transformation from our language to some other. As I said, this is a central piece of this concept. For simplicity, we choose the simplest possible monad to interpret to, the identity monad:
It is enough to demonstrate how we fold a free monad AST to a value of a particular monad, and it makes it easy to picture how this would work similarly with an
Future or whatever. Let’s implement the natural transformation from
To run the program, we just pass this natural transformation to
Category theory design pattern
We could create as many interpreters for this one program as we wish. We could create for example a separate test interpreter for unit testing the
lookupAndPrint method, that could test all the effects (like logging for example, if there was logging added to it), against some pre-calculated values. All without using mock libraries. We could create an interpreter that accumulates all
Outputs in a list of strings, or writes them to a web service etc.
This alone doesn’t sound like a huge win, as we can achieve something similar with simple Java interfaces and various implementations of these. The huge win is in the separation of these “framework” concerns (like i/o, concurrency etc.), from the “business logic”, that this way becomes pure, referentially transparent, easy to read and reason about, compositional and modular. Let me just quote Rúnar Bjarnason on these latter two, because he gave just the perfect description of these two concepts:
Modularity is the property, that your software can be separated into its constituent parts, and those parts can be reused independently of the whole, in novel ways that you did not anticipate, when you created those parts.
Compositionality is the property, that your software can be understood as a whole, by understanding its parts, and the rules governing the composition.
These two concepts are all about managing complexity, and that is by the way what SICP is all about too. I recommend this talk of Rúnar Bjarnason’s where these quotes are from, and you can watch it only from 20:34, if you are only interested in why functional programming is awesome.
In our example,
Output can be recombined in various ways, creating new programs, that can be easily understood through our understanding of these parts and the rules of combining them. We introduced a new language for a problem domain, or, if you consider how we created objects, operations on these objects, and laws governing these, we introduced an algebra. Let me put one more quote here, this one from Don Stewart, who describes this approach to writing complex software, as the “category theory design pattern”:
In my experience, almost all designs fall into the ‘compiler’ or ‘interpreter’ pattern, using a model of the data and functions on that data. That is, problem domains are represented as algebraic structures (objects as ADTs with functions over them), and software architectures are about mapping from one algebra to another. This is the “category theory” design pattern.
Separating structure and interpretation, and creating algebras that express structure as abstract syntax trees, which can be then interpreted, is a very natural way of applying this generic category theory design pattern. And the free monad is just one particular way of doing this, a convenience. We could have come up with our own ASTs, our own interpreters, and our own laws. The benefit of thinking in terms of things like monads, or using something like a free monad, is that the laws governing these structures provide guarantees, invariants that we don’t have to test for anymore, because these abstractions have already been proven (thank you, math!).
There’s this part in SICP about managing complexity with black-box abstraction, and also about layers of abstraction and data-abstraction barriers between these - for data representing computation, these layers are algebras or languages, with interpreters between them, interpreting from one layer of abstraction, to a lower one, eventually reaching the level of primitives and side-effects. After all, in
F ~> G,
G could be another free monad, that can be then interpreted into another language, and so on.
Our model only lets us writing programs in one single language, or use one single algebra (
Lang in the above example). It doesn’t make much sense to create one algebra for a whole application (i.e. extend
Lang with new cases), we would rather want to create separate algebras for separate concerns: an algebra for our database, one for logging, another for authentication etc. So we need a way to combine these separate languages, so that we can write programs using many of these at the same time. We want to use a
Auth[A] language for example, and so we need such a + operation, to compose complex languages of simpler ones. This is the coproduct, which in Scala can be represented by the
Either data type:
Using this, we could create data types like
Either[Lang, Either[Log, Auth]], which would basically be trees:
And we could lift this into Free somehow, so that we can use it as an AST.
But in practice this would mean having instructions in our language, like
Right(Right(Login(…))), and the interpreter would look funny too.
I’m not going to go into details of how to get around this problem (it’s some smart boilerplate library code, that makes it much easier to use and interpret combined languages), because there is an excellent talk by Rúnar Bjarnason, titled Composable Application Architecture with Reasonably Priced Monads, which he gave at the Scala Days conference in Berlin last year, that explains exactly this, and does it obviously much better than me. To be honest, this whole post is actually like a lousy transcript or recitation of the first half of his enlightening presentation, so I strongly recommend watching it. In fact you should have watched it in the first place, instead of reading all this.
Some further reading (or watching) materials:
- Alissa Pajer: Free All the Functors
- Noel Welsh: Free Monads Are Simple
- Noel Welsh: Deriving the Free Monad
- Noel Welsh: Monadic IO: Laziness Makes You Free
- Wouter Swierstra: Data types a la carte
- Rúnar Bjarnason: On Monoids
- Rúnar Bjarnason: More on Monoids and Monads
- Rúnar Bjarnason: Free Monoids and Free Monads
- Rúnar Bjarnason: Free Monads and the Yoneda Lemma
- Rúnar Bjarnason: Stackless Scala With Free Monads
- Rúnar Bjarnason: The Interpreter Pattern Revisited
- Rúnar Bjarnason: Composable Application Architecture with Reasonably Priced Monads
- Rúnar Bjarnason: Functional Programming is Terrible
- Brian Harvey: Why Structure and Interpretation of Computer Programs matters
- Oleg Kiselyov, Amr Sabry, Cameron Swords: Extensible Effects
- Noel Welsh and Dave Gurnell: Essential Interpreters (pre-release)
- Paul Chiusano, Rúnar Bjarnason: Functional Programming in Scala
- Harold Abelson, Gerald Jay Sussman: Structure and Interpretation of Computer Programs (video lectures)