lambda cube

2022-10-25 21:15:40 -07:00 · 2022-10-25 21:15:40 -07:00 · 014712c6c2
commit 014712c6c2
parent 971a2f151b
9 changed files with 678 additions and 0 deletions
--- a/content/posts/the-lambda-cube/index.md
+++ b/content/posts/the-lambda-cube/index.md
@ -0,0 +1,349 @@
+++
+date = "2022-10-25"
+draft = false
+path = "/blog/the-lambda-cube"
+tags = ["haskell"]
+title = "The Lambda Cube in Haskell: what construct do I need for this polymorphism?"
+++
+
+> *Note*: I [just rewrote][zola-ts] the syntax highlighter for my site's
+> generator, Zola, to use tree-sitter, and there are probably some
+> deficiencies. Let me know if there's anything going terribly wrong.
+
+[zola-ts]: https://github.com/lf-/zola/tree/tree-painter
+
+I was reading a beta copy of [Production Haskell] by Matt Parsons again, and a
+section stood out to me in the type level programming chapter (something I am
+trying to get better at): "Value Associations", discussing the varieties of
+functions in the language. It brought up the idea of the Lambda Cube. What
+the heck is a Lambda Cube and why haven't I heard of one before?
+
+[Production Haskell]: https://leanpub.com/production-haskell
+
+Let's [look it up on Wikipe][Lambda Cube]—uhh maybe not
+
+Wikipedia tells me to pursue other fields of study by its use of absolutely
+baffling jargon and a bunch of judgements. Apparently this article was so
+unusually baffling that someone put a note on it:
+
+> This article may need to be rewritten to comply with Wikipedia's quality
+> standards, as article uses pervasively inconsistent, confusing and misleading
+> terminology for basic concepts fundamental to the understanding of the
+> article's subject. You can help. The talk page may contain suggestions.
+
+Incidentally, I was recommended [Practical Foundations for Programming
+Languages][pfpl] as a textbook covering the use of judgements/sequents
+in type theory. It's pretty dense and the authors have a bad habit of defining
+symbols in the middle of paragraphs, but it is doable and useful for figuring
+out the notation in papers.
+
+[pfpl]: http://www.cs.cmu.edu/~rwh/pfpl/
+
+Here's my understanding of the Lambda Cube (my email is on [About] if you have
+any corrections to submit):
+
+[About]: /about
+
+The [Lambda Cube] is an idea from type theory about the possible combinations
+of ways a language can extend the simply typed lambda calculus which has `value
+-> value` functions.
+
+Beginning at the simply typed lambda calculus on one corner of the cube,
+there are three directions it can be extended, leading to eight combinations of
+extensions, one for each point of the cube:
+
+* [`value -> type`](#value-type) (values determining types), representing dependent types.
+* [`type -> value`](#type-value) (types determining values), representing type
+  classes and generic functions.
+* [`type -> type`](#type-type) (types determining types), representing type
+  level functions.
+
+[Lambda Cube]: https://en.wikipedia.org/wiki/Lambda_cube
+
+Looking at language features through the lens of the cube has made it easier to
+figure out which of the dizzying array of Haskell language features I might
+want to achieve a specific goal ~~in writing incomprehensible code that my
+coworkers will have to suffer through~~.
+
+In this post I'll go through somewhat-practical uses of features in the
+vicinity of each of the three axes.
+
+# `value -> type` {#value-type}
+
+Value to type functions represent dependent types.
+
+Haskell does not have dependent types, although Haskell programmers seem
+strangely willing to engage in so-called "hasochism" to pretend it does.
+Dependent types would allow the type of the result to depend on a *value* input
+to the function, constraining the caller to exactly the set of relevant output
+types. Generally, dependent types are the realm of languages that lean more
+toward theorem prover on the spectrum between general purpose languages and
+theorem provers, such as Agda, Idris, Coq, and so on, although the
+configuration language Dhall also has dependent types.
+
+That said, there are several features that together can be combined to produce
+things in the general vicinity of `value -> type` to allow constraining the
+result type of functions depending on input values, falling short of dependent
+types.
+
+## [GADTs]
+
+[Generalised algebraic data types (GADTs)][GADTs] are among the features that
+are useful for problems where you have to make types depend on values. I think
+of GADTs as a bridge allowing proofs made at compile time to be attached to
+values and recalled later.
+
+The following is a (very uninspired) example of the use of GADTs on
+[Peano numbers], a representation of the natural numbers as zero or the
+successor of a number. It demonstrates rejecting programs that try to call
+`predecessor` to get the number prior to zero, which does not exist:
+
+{{ codefile(path="Peano.hs", colocated=true, code_lang="haskell", hide=false) }}
+
+There is a [package on hackage][peano-hackage] for Peano numbers in a GADT, but
+a Peano number GADT is not necessarily as useful as one might initially think.
+One annoying downside of using a GADT is that it makes it harder to write
+functions potentially returning values using any of the constructors, since
+different constructors have different types.
+
+Consider the following function:
+
+```haskell
+fromInt :: Int -> Peano a
+fromInt 0 = Z
+fromInt n = S $ fromInt (n - 1)
+```
+
+The function does not type check, since it has different types in the two
+equations:
+
+```
+• Couldn't match type ‘a’ with ‘Zero’
+  Expected: Peano a
+    Actual: Peano Zero
+  ‘a’ is a rigid type variable bound by
+    the type signature for:
+      fromInt :: forall a. Int -> Peano a
+   |
+17 | fromInt 0 = Z
+   |             ^
+```
+
+If we wanted this to compile, we would have to hide the
+type variable somehow. Hmmm.
+
+The tool for "I want to hide this type variable" is an existential type.
+Existential types are so called because they mean "there exists this type. See?
+Here's a value of that type".
+
+Such a type for this would be `data SomePeano = forall a. SomePeano (Peano a)`
+which will hide the `a`, giving both arms the same type. That existential
+can equivalently be written in GADT notation, which may be somewhat easier to
+understand since its constructor's signature looks like a normal function
+signature: `data SomePeano where SomePeano :: forall a. Peano a -> SomePeano`
+
+[peano-hackage]: https://hackage.haskell.org/package/PeanoWitnesses
+[Peano numbers]: https://en.wikipedia.org/wiki/Peano_axioms#Addition
+[GADTs]: https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/gadt.html
+
+## [DataKinds]
+
+Another useful feature in making values kind of into types is [DataKinds],
+although DataKinds does not allow the functionality you might expect on the
+surface.
+
+A "kind" can be seen as the "type" of a type, in the same relationship as a
+value has to its type. Most data types you probably write are of kind `Type`
+(equivalently [spelled `*`][star-is-type]), representing a fully applied type such
+as `Maybe Bool`, `()`, `Int`, and so on. Partially applied types have arrows in
+their kind, just like partially applied functions have arrows in their type.
+For instance, `Maybe` has the kind `Type -> Type`.
+
+DataKinds turns *types* into kinds, and *data constructors* into types. If you
+have a type `Nat`, which has two constructors `Zero` and `Succ Nat`, then the
+kind of `Nat` is `Type`, the kind of `Zero` is `Nat`, and the kind of `Succ` is
+`Nat -> Nat`.
+
+One odd new piece of syntax here is the use of prefix apostrophe symbols. This
+is to denote that you're naming the constructor as a type, rather than the
+datatype. For instance, if you have `data Foo = Foo Int`, `'Foo` would refer to
+the constructor, having a kind `Int -> Foo`.
+
+Another neat thing that DataKinds do is turning natural numbers and strings
+into types which can be manipulated with the `GHC.TypeLits` and `GHC.TypeNats`
+libraries.
+
+<aside>
+
+Some clever folks at work wrote a library for non-empty strings with some
+maximum length, called [string-variants]. Its main data type is `NonEmptyText
+(n :: Natural)`. Combining DataKinds and
+[TypeOperators] allows the definition of a function:  
+`widen :: (1 <= n, n <= m) => NonEmptyText n -> NonEmptyText m`  
+allowing the conversion of a shorter string type to a
+longer one. They also wrote this function for concatenation with a bounded
+output size:  
+`(<>|) :: NonEmptyText n -> NonEmptyText m -> NonEmptyText (n + m)`.
+
+Pretty cool.
+
+</aside>
+
+[string-variants]: https://hackage.haskell.org/package/string-variants
+[TypeOperators]: https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/type_operators.html
+
+Let's see an example that someone might use in industry: decoding untyped data
+into the correct variant when the tag is known.
+
+I've written approximately this code in order to implement a settings system
+with data stored as `Map Tag Value`, where `Value` is some untyped value that
+can be parsed in different ways depending on the tag. I know what the tag is
+supposed to be at compile time, so I can use that information with a GADT to
+prove that only one variant is possible to be returned from the get-setting
+function. A typeclass is used to write a function turning the type-level tag
+into its respective value-level representation.
+
+{{ codefile(path="DataKinds.hs", colocated=true, code_lang="haskell", hide=false) }}
+
+I had a misconception while learning Haskell and baffled by all these fancy
+underdocumented features, each having « un soupçon de Type »:
+
+DataKinds do not let you pass around sum type variants' bodies. The following
+program does not compile.
+
+{{ codefile(path="WhatIThoughtDataKindsDid.hs", colocated=true, code_lang="haskell", hide=false) }}
+
+You can *sort of* do this! It's what the first DataKinds example above is
+doing, but it looks a little different than I initially imagined it would work.
+The DataKinds are used as a witness that a GADT is a particular variant, rather
+than being directly returned.
+
+[star-is-type]: https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/poly_kinds.html#extension-StarIsType
+
+[DataKinds]: https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/data_kinds.html#extension-DataKinds
+
+## [Hasochism]
+
+Is it really worth it? Just because you can does not mean you should. The
+[singletons][Hasochism] library allows for simulating dependent types with
+very bad ergonomics and many crimes. Thus, the question arises: *should you*?
+
+[Hasochism]: https://hackage.haskell.org/package/singletons
+
+
+# `type -> value` {#type-value}
+
+Turning types into values is something that we do every day. It represents the
+usual mechanisms for ad-hoc and parametric polymorphism: type classes and
+generic functions. If there is any conclusion to draw from this post, it's that
+a typeclass is a type to value function.
+
+After starting to use this mental model, I know to reach for them immediately
+when looking to dispatch something based on type.
+
+Functions take two kinds of arguments: type and value arguments. In the case of
+functions on type classes, the type argument selects *which value* it is by
+selecting the instance.
+
+This is most clearly illustrated with the use of [TypeApplications], where you
+can explicitly apply a type to a function (which, it stands to reason, is thus
+a type to value function). Let's use [`Bounded`][Bounded] as an example, with
+the `minBound` function:
+
+[Bounded]: https://hackage.haskell.org/package/base-4.17.0.0/docs/GHC-Enum.html#t:Bounded
+
+```
+-- minBound is a function from a type that's an instance of Bounded to a value
+-- of that type
+λ> :t minBound
+minBound :: Bounded a => a
+λ> :t minBound @Bool
+minBound @Bool :: Bool
+λ> minBound @Bool
+False
+```
+
+[TypeApplications]: https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/type_applications.html
+
+Realizing *this* made me write more type classes deliberately as functions from
+type to value.
+
+In Haskell and many languages, types are erased at runtime. All the fancy
+things you do with type families and so on go away. This is important to
+remember as it means that Haskell programs typically need programmer specified
+structures to bridge between types and values. What stays around is
+dictionaries, which are similar to vtables in Rust: tables of functions for
+some class. Values are accompanied by references to the relevant dictionaries.
+
+# `type -> type` {#type-type}
+
+Functions from type to type are the bread and butter of type level programming.
+They allow you to transform types in nearly arbitrary ways, allowing for
+better DSLs and creating very generic functions.
+
+The main facilities Haskell has for type to type functions are:
+
+## [MultiParamTypeClasses] plus [FunctionalDependencies]
+
+This method of writing type level functions is kind of Odd, since one of the
+"parameters" is actually the result type, and you have to put the middle parts
+of the function on the left hand side as constraints.
+
+It goes like so:
+
+```haskell
+class TypeOr (a :: Bool) (b :: Bool) (result :: Bool) | a b -> result
+
+instance TypeOr 'False 'False 'False
+instance TypeOr 'True a 'True
+instance TypeOr a 'True 'True
+-- >>> x = Proxy :: TypeOr 'True 'False r => Proxy r
+-- >>> :t x
+-- x :: Proxy 'True
+```
+
+Clearly, I did not suffer enough, so I wrote a simple program to concatenate
+two type level lists, breaking my brain in the process:
+
+{{ codefile(path="MPTCFundeps.hs", colocated=true, code_lang="haskell", hide=false) }}
+
+There is one benefit to this, which is that the compiler can infer types
+backwards through these. That is not really a good reason to do this to
+yourself though: you can just use [TypeFamilyDependencies] with a closed type
+family instead.
+
+[TypeFamilyDependencies]: https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/type_families.html#extension-TypeFamilyDependencies
+
+[MultiParamTypeClasses]: https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/multi_param_type_classes.html
+[FunctionalDependencies]: https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/functional_dependencies.html
+
+## OK but that's awful: using associated [TypeFamilies] instead
+
+That's not a question, but nevertheless you can use [associated type
+families][TypeFamilies] to achieve the same objective and the code makes way
+more sense:
+
+[TypeFamilies]: http://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/type_families.html
+
+{{ codefile(path="AssociatedTypes.hs", colocated=true, code_lang="haskell", hide=false) }}
+
+## Getting rid of the class: closed [TypeFamilies]
+
+What if you could not write the class at all? Turns out, the associated type
+was unnecessary (although it is useful to use an associated type if you *also*
+need type class features).
+
+Using a closed type family is definitely the nicest option ergonomically: it
+just looks like pattern matches:
+
+{{ codefile(path="TypeFamilies.hs", colocated=true, code_lang="haskell", hide=false) }}
+
+There's also the ugly duckling, the open type family, but it has the same
+limitations as an associated type family.
+
+# Conclusion
+
+I hope that this ten thousand foot overview is useful in giving a better mental
+model of which structure to reach for and when.
+
+Thanks to Hazel Weakly for reviewing the draft of this article.