349 lines
14 KiB
Markdown
349 lines
14 KiB
Markdown
+++
|
||
date = "2022-10-25"
|
||
draft = false
|
||
path = "/blog/the-lambda-cube"
|
||
tags = ["haskell"]
|
||
title = "The Lambda Cube in Haskell: what construct do I need for this polymorphism?"
|
||
+++
|
||
|
||
> *Note*: I [just rewrote][zola-ts] the syntax highlighter for my site's
|
||
> generator, Zola, to use tree-sitter, and there are probably some
|
||
> deficiencies. Let me know if there's anything going terribly wrong.
|
||
|
||
[zola-ts]: https://github.com/lf-/zola/tree/tree-painter
|
||
|
||
I was reading a beta copy of [Production Haskell] by Matt Parsons again, and a
|
||
section stood out to me in the type level programming chapter (something I am
|
||
trying to get better at): "Value Associations", discussing the varieties of
|
||
functions in the language. It brought up the idea of the Lambda Cube. What
|
||
the heck is a Lambda Cube and why haven't I heard of one before?
|
||
|
||
[Production Haskell]: https://leanpub.com/production-haskell
|
||
|
||
Let's [look it up on Wikipe][Lambda Cube]—uhh maybe not
|
||
|
||
Wikipedia tells me to pursue other fields of study by its use of absolutely
|
||
baffling jargon and a bunch of judgements. Apparently this article was so
|
||
unusually baffling that someone put a note on it:
|
||
|
||
> This article may need to be rewritten to comply with Wikipedia's quality
|
||
> standards, as article uses pervasively inconsistent, confusing and misleading
|
||
> terminology for basic concepts fundamental to the understanding of the
|
||
> article's subject. You can help. The talk page may contain suggestions.
|
||
|
||
Incidentally, I was recommended [Practical Foundations for Programming
|
||
Languages][pfpl] as a textbook covering the use of judgements/sequents
|
||
in type theory. It's pretty dense and the authors have a bad habit of defining
|
||
symbols in the middle of paragraphs, but it is doable and useful for figuring
|
||
out the notation in papers.
|
||
|
||
[pfpl]: http://www.cs.cmu.edu/~rwh/pfpl/
|
||
|
||
Here's my understanding of the Lambda Cube (my email is on [About] if you have
|
||
any corrections to submit):
|
||
|
||
[About]: /about
|
||
|
||
The [Lambda Cube] is an idea from type theory about the possible combinations
|
||
of ways a language can extend the simply typed lambda calculus which has `value
|
||
-> value` functions.
|
||
|
||
Beginning at the simply typed lambda calculus on one corner of the cube,
|
||
there are three directions it can be extended, leading to eight combinations of
|
||
extensions, one for each point of the cube:
|
||
|
||
* [`value -> type`](#value-type) (values determining types), representing dependent types.
|
||
* [`type -> value`](#type-value) (types determining values), representing type
|
||
classes and generic functions.
|
||
* [`type -> type`](#type-type) (types determining types), representing type
|
||
level functions.
|
||
|
||
[Lambda Cube]: https://en.wikipedia.org/wiki/Lambda_cube
|
||
|
||
Looking at language features through the lens of the cube has made it easier to
|
||
figure out which of the dizzying array of Haskell language features I might
|
||
want to achieve a specific goal ~~in writing incomprehensible code that my
|
||
coworkers will have to suffer through~~.
|
||
|
||
In this post I'll go through somewhat-practical uses of features in the
|
||
vicinity of each of the three axes.
|
||
|
||
# `value -> type` {#value-type}
|
||
|
||
Value to type functions represent dependent types.
|
||
|
||
Haskell does not have dependent types, although Haskell programmers seem
|
||
strangely willing to engage in so-called "hasochism" to pretend it does.
|
||
Dependent types would allow the type of the result to depend on a *value* input
|
||
to the function, constraining the caller to exactly the set of relevant output
|
||
types. Generally, dependent types are the realm of languages that lean more
|
||
toward theorem prover on the spectrum between general purpose languages and
|
||
theorem provers, such as Agda, Idris, Coq, and so on, although the
|
||
configuration language Dhall also has dependent types.
|
||
|
||
That said, there are several features that together can be combined to produce
|
||
things in the general vicinity of `value -> type` to allow constraining the
|
||
result type of functions depending on input values, falling short of dependent
|
||
types.
|
||
|
||
## [GADTs]
|
||
|
||
[Generalised algebraic data types (GADTs)][GADTs] are among the features that
|
||
are useful for problems where you have to make types depend on values. I think
|
||
of GADTs as a bridge allowing proofs made at compile time to be attached to
|
||
values and recalled later.
|
||
|
||
The following is a (very uninspired) example of the use of GADTs on
|
||
[Peano numbers], a representation of the natural numbers as zero or the
|
||
successor of a number. It demonstrates rejecting programs that try to call
|
||
`predecessor` to get the number prior to zero, which does not exist:
|
||
|
||
{{ codefile(path="Peano.hs", colocated=true, code_lang="haskell", hide=false) }}
|
||
|
||
There is a [package on hackage][peano-hackage] for Peano numbers in a GADT, but
|
||
a Peano number GADT is not necessarily as useful as one might initially think.
|
||
One annoying downside of using a GADT is that it makes it harder to write
|
||
functions potentially returning values using any of the constructors, since
|
||
different constructors have different types.
|
||
|
||
Consider the following function:
|
||
|
||
```haskell
|
||
fromInt :: Int -> Peano a
|
||
fromInt 0 = Z
|
||
fromInt n = S $ fromInt (n - 1)
|
||
```
|
||
|
||
The function does not type check, since it has different types in the two
|
||
equations:
|
||
|
||
```
|
||
• Couldn't match type ‘a’ with ‘Zero’
|
||
Expected: Peano a
|
||
Actual: Peano Zero
|
||
‘a’ is a rigid type variable bound by
|
||
the type signature for:
|
||
fromInt :: forall a. Int -> Peano a
|
||
|
|
||
17 | fromInt 0 = Z
|
||
| ^
|
||
```
|
||
|
||
If we wanted this to compile, we would have to hide the
|
||
type variable somehow. Hmmm.
|
||
|
||
The tool for "I want to hide this type variable" is an existential type.
|
||
Existential types are so called because they mean "there exists this type. See?
|
||
Here's a value of that type".
|
||
|
||
Such a type for this would be `data SomePeano = forall a. SomePeano (Peano a)`
|
||
which will hide the `a`, giving both arms the same type. That existential
|
||
can equivalently be written in GADT notation, which may be somewhat easier to
|
||
understand since its constructor's signature looks like a normal function
|
||
signature: `data SomePeano where SomePeano :: forall a. Peano a -> SomePeano`
|
||
|
||
[peano-hackage]: https://hackage.haskell.org/package/PeanoWitnesses
|
||
[Peano numbers]: https://en.wikipedia.org/wiki/Peano_axioms#Addition
|
||
[GADTs]: https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/gadt.html
|
||
|
||
## [DataKinds]
|
||
|
||
Another useful feature in making values kind of into types is [DataKinds],
|
||
although DataKinds does not allow the functionality you might expect on the
|
||
surface.
|
||
|
||
A "kind" can be seen as the "type" of a type, in the same relationship as a
|
||
value has to its type. Most data types you probably write are of kind `Type`
|
||
(equivalently [spelled `*`][star-is-type]), representing a fully applied type such
|
||
as `Maybe Bool`, `()`, `Int`, and so on. Partially applied types have arrows in
|
||
their kind, just like partially applied functions have arrows in their type.
|
||
For instance, `Maybe` has the kind `Type -> Type`.
|
||
|
||
DataKinds turns *types* into kinds, and *data constructors* into types. If you
|
||
have a type `Nat`, which has two constructors `Zero` and `Succ Nat`, then the
|
||
kind of `Nat` is `Type`, the kind of `Zero` is `Nat`, and the kind of `Succ` is
|
||
`Nat -> Nat`.
|
||
|
||
One odd new piece of syntax here is the use of prefix apostrophe symbols. This
|
||
is to denote that you're naming the constructor as a type, rather than the
|
||
datatype. For instance, if you have `data Foo = Foo Int`, `'Foo` would refer to
|
||
the constructor, having a kind `Int -> Foo`.
|
||
|
||
Another neat thing that DataKinds do is turning natural numbers and strings
|
||
into types which can be manipulated with the `GHC.TypeLits` and `GHC.TypeNats`
|
||
libraries.
|
||
|
||
<aside>
|
||
|
||
Some clever folks at work wrote a library for non-empty strings with some
|
||
maximum length, called [string-variants]. Its main data type is `NonEmptyText
|
||
(n :: Natural)`. Combining DataKinds and
|
||
[TypeOperators] allows the definition of a function:
|
||
`widen :: (1 <= n, n <= m) => NonEmptyText n -> NonEmptyText m`
|
||
allowing the conversion of a shorter string type to a
|
||
longer one. They also wrote this function for concatenation with a bounded
|
||
output size:
|
||
`(<>|) :: NonEmptyText n -> NonEmptyText m -> NonEmptyText (n + m)`.
|
||
|
||
Pretty cool.
|
||
|
||
</aside>
|
||
|
||
[string-variants]: https://hackage.haskell.org/package/string-variants
|
||
[TypeOperators]: https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/type_operators.html
|
||
|
||
Let's see an example that someone might use in industry: decoding untyped data
|
||
into the correct variant when the tag is known.
|
||
|
||
I've written approximately this code in order to implement a settings system
|
||
with data stored as `Map Tag Value`, where `Value` is some untyped value that
|
||
can be parsed in different ways depending on the tag. I know what the tag is
|
||
supposed to be at compile time, so I can use that information with a GADT to
|
||
prove that only one variant is possible to be returned from the get-setting
|
||
function. A typeclass is used to write a function turning the type-level tag
|
||
into its respective value-level representation.
|
||
|
||
{{ codefile(path="DataKinds.hs", colocated=true, code_lang="haskell", hide=false) }}
|
||
|
||
I had a misconception while learning Haskell and baffled by all these fancy
|
||
underdocumented features, each having « un soupçon de Type »:
|
||
|
||
DataKinds do not let you pass around sum type variants' bodies. The following
|
||
program does not compile.
|
||
|
||
{{ codefile(path="WhatIThoughtDataKindsDid.hs", colocated=true, code_lang="haskell", hide=false) }}
|
||
|
||
You can *sort of* do this! It's what the first DataKinds example above is
|
||
doing, but it looks a little different than I initially imagined it would work.
|
||
The DataKinds are used as a witness that a GADT is a particular variant, rather
|
||
than being directly returned.
|
||
|
||
[star-is-type]: https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/poly_kinds.html#extension-StarIsType
|
||
|
||
[DataKinds]: https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/data_kinds.html#extension-DataKinds
|
||
|
||
## [Hasochism]
|
||
|
||
Is it really worth it? Just because you can does not mean you should. The
|
||
[singletons][Hasochism] library allows for simulating dependent types with
|
||
very bad ergonomics and many crimes. Thus, the question arises: *should you*?
|
||
|
||
[Hasochism]: https://hackage.haskell.org/package/singletons
|
||
|
||
|
||
# `type -> value` {#type-value}
|
||
|
||
Turning types into values is something that we do every day. It represents the
|
||
usual mechanisms for ad-hoc and parametric polymorphism: type classes and
|
||
generic functions. If there is any conclusion to draw from this post, it's that
|
||
a typeclass is a type to value function.
|
||
|
||
After starting to use this mental model, I know to reach for them immediately
|
||
when looking to dispatch something based on type.
|
||
|
||
Functions take two kinds of arguments: type and value arguments. In the case of
|
||
functions on type classes, the type argument selects *which value* it is by
|
||
selecting the instance.
|
||
|
||
This is most clearly illustrated with the use of [TypeApplications], where you
|
||
can explicitly apply a type to a function (which, it stands to reason, is thus
|
||
a type to value function). Let's use [`Bounded`][Bounded] as an example, with
|
||
the `minBound` function:
|
||
|
||
[Bounded]: https://hackage.haskell.org/package/base-4.17.0.0/docs/GHC-Enum.html#t:Bounded
|
||
|
||
```
|
||
-- minBound is a function from a type that's an instance of Bounded to a value
|
||
-- of that type
|
||
λ> :t minBound
|
||
minBound :: Bounded a => a
|
||
λ> :t minBound @Bool
|
||
minBound @Bool :: Bool
|
||
λ> minBound @Bool
|
||
False
|
||
```
|
||
|
||
[TypeApplications]: https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/type_applications.html
|
||
|
||
Realizing *this* made me write more type classes deliberately as functions from
|
||
type to value.
|
||
|
||
In Haskell and many languages, types are erased at runtime. All the fancy
|
||
things you do with type families and so on go away. This is important to
|
||
remember as it means that Haskell programs typically need programmer specified
|
||
structures to bridge between types and values. What stays around is
|
||
dictionaries, which are similar to vtables in Rust: tables of functions for
|
||
some class. Values are accompanied by references to the relevant dictionaries.
|
||
|
||
# `type -> type` {#type-type}
|
||
|
||
Functions from type to type are the bread and butter of type level programming.
|
||
They allow you to transform types in nearly arbitrary ways, allowing for
|
||
better DSLs and creating very generic functions.
|
||
|
||
The main facilities Haskell has for type to type functions are:
|
||
|
||
## [MultiParamTypeClasses] plus [FunctionalDependencies]
|
||
|
||
This method of writing type level functions is kind of Odd, since one of the
|
||
"parameters" is actually the result type, and you have to put the middle parts
|
||
of the function on the left hand side as constraints.
|
||
|
||
It goes like so:
|
||
|
||
```haskell
|
||
class TypeOr (a :: Bool) (b :: Bool) (result :: Bool) | a b -> result
|
||
|
||
instance TypeOr 'False 'False 'False
|
||
instance TypeOr 'True a 'True
|
||
instance TypeOr a 'True 'True
|
||
-- >>> x = Proxy :: TypeOr 'True 'False r => Proxy r
|
||
-- >>> :t x
|
||
-- x :: Proxy 'True
|
||
```
|
||
|
||
Clearly, I did not suffer enough, so I wrote a simple program to concatenate
|
||
two type level lists, breaking my brain in the process:
|
||
|
||
{{ codefile(path="MPTCFundeps.hs", colocated=true, code_lang="haskell", hide=false) }}
|
||
|
||
There is one benefit to this, which is that the compiler can infer types
|
||
backwards through these. That is not really a good reason to do this to
|
||
yourself though: you can just use [TypeFamilyDependencies] with a closed type
|
||
family instead.
|
||
|
||
[TypeFamilyDependencies]: https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/type_families.html#extension-TypeFamilyDependencies
|
||
|
||
[MultiParamTypeClasses]: https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/multi_param_type_classes.html
|
||
[FunctionalDependencies]: https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/functional_dependencies.html
|
||
|
||
## OK but that's awful: using associated [TypeFamilies] instead
|
||
|
||
That's not a question, but nevertheless you can use [associated type
|
||
families][TypeFamilies] to achieve the same objective and the code makes way
|
||
more sense:
|
||
|
||
[TypeFamilies]: http://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/type_families.html
|
||
|
||
{{ codefile(path="AssociatedTypes.hs", colocated=true, code_lang="haskell", hide=false) }}
|
||
|
||
## Getting rid of the class: closed [TypeFamilies]
|
||
|
||
What if you could not write the class at all? Turns out, the associated type
|
||
was unnecessary (although it is useful to use an associated type if you *also*
|
||
need type class features).
|
||
|
||
Using a closed type family is definitely the nicest option ergonomically: it
|
||
just looks like pattern matches:
|
||
|
||
{{ codefile(path="TypeFamilies.hs", colocated=true, code_lang="haskell", hide=false) }}
|
||
|
||
There's also the ugly duckling, the open type family, but it has the same
|
||
limitations as an associated type family.
|
||
|
||
# Conclusion
|
||
|
||
I hope that this ten thousand foot overview is useful in giving a better mental
|
||
model of which structure to reach for and when.
|
||
|
||
Thanks to Hazel Weakly for reviewing the draft of this article.
|