+++
date = "2023-03-30"
draft = true
path = "/blog/nix-evaluation-blocking"
tags = ["haskell", "nix"]
title = "Stopping evaluation from blocking in Nix"
+++

Nix has a feature called "import from derivation", which is sometimes called
"such a nice foot gun" (grahamc, 2022). I can't argue with its usefulness: it
lets Nix do amazing things that can't be accomplished any other way, and avoid
pointlessly checking generated files into git. However, it has a dirty secret:
it can *atrociously* slow down your builds.

The essence of the feature is that Nix can perform operations such as building
derivations whose results are used in the *evaluation stage*, before further
evaluation takes place.

## Nix build staging

It's a reasonable mental model to consider that Nix, in its current
implementation, can do one of two things at a given time:

* Evaluate: run Nix expressions to create some derivations to build. This
  stage outputs `.drv` files, which can then be realised.
* Build: given some `.drv` files, fetch the results from a binary cache or
  build them from scratch.

Nix evaluation happens in a lazy order, and with neither concurrency nor
parallelism. When Nix is working on evaluating something that is blocked on a
build, it cannot go evaluate something else.

There are efforts [such as tvix][tvix] to rewrite the Nix evaluator so that it
can simultaneously evaluate and build, but they have not shipped yet.

[tvix]: https://code.tvl.fyi/about/tvix

### How does import-from-derivation fit in?

Import-from-derivation (IFD for short) lets you do magical things: since Nix
builds can do arbitrary computation in any language, Nix expressions or other
data can be generated by external programs that need to do pesky things like
parse cursed file formats such as cabal files.

Features like IFD are useful in build systems because they allow build systems
to be *Monadic*, in the words of the [Hadrian paper]: build things to decide
what to build. By comparison, an *Applicative* build system such as Bazel can
only take in things that are known statically; in Bazel it is thus common to
check in generated build instructions. The distinction is illustrated by the
type signatures of the characteristic operations of Monad and Applicative:

* `apply :: Applicative f => f (a -> b) -> f a -> f b`<br/>
  which combines a function inside the `f` box with an argument inside
  another `f` box; no value ever leaves a box, so the shape of the
  computation cannot depend on the values inside.
* `bind :: Monad m => m a -> (a -> m b) -> m b`<br/>
  which means that if you have a value of type `a` in an `m` box, you can
  *get the `a` out of the box* and create a further computation in a box
  depending on its actual value.

Import-from-derivation is `bind` for Nix builds: imagine that we wrote a
derivation that generated the Nix language source for the `glibc` package
definition, then wrote `import (make-glibc-definition someArgs)` in the
dependencies of `hello`. To *evaluate* `hello`, we need to *build*
`glibc-definition` to take its result out of the box and put it into the
`hello` definition, which is exactly the shape of `bind`.

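That hypothetical could be sketched in Nix roughly as follows; note that
`make-glibc-definition`, `someArgs`, and `./hello.nix` are all made-up names
from the example above, not real nixpkgs code:

```nix
# Hypothetical sketch of the example above; none of these names exist in
# nixpkgs.
{ pkgs, make-glibc-definition, someArgs }:
let
  # `import` needs the *contents* of the build output, so evaluation must
  # stop and build the derivation first: this is the `bind`-like step.
  glibcDef = import (make-glibc-definition someArgs);
  glibc = pkgs.callPackage glibcDef { };
in
pkgs.callPackage ./hello.nix { inherit glibc; }
```
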
[Hadrian paper]: https://dl.acm.org/doi/10.1145/3241625.2976011

In order to achieve IFD, the evaluation stage can demand builds be run. This
would, by nature, block evaluating things that directly depend on building
something, but since Nix's evaluator can only work on one thing at a time, it
blocks everything else too.

N.B. I've heard that someone wrote a PureScript compiler targeting the Nix
language, which was then used to [parse a Cabal file to do cabal2nix's
job][evil-cabal2nix] entirely within Nix without IFD. Nothing is sacred.

[evil-cabal2nix]: https://github.com/cdepillabout/cabal2nixWithoutIFD

### What blocks evaluation?

Doing anything with a derivation that depends on the contents of its output
blocks evaluation, for example:

* `import someDrv`
* `builtins.readFile "${someDrv}/blah"`

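For instance, this small self-contained expression (illustrative, not from any
real codebase) forces a build in the middle of evaluation:

```nix
let
  pkgs = import <nixpkgs> { };

  # A derivation whose build output is itself a Nix expression.
  generated = pkgs.runCommand "generated.nix" { } ''
    echo '{ greeting = "hello from a build"; }' > $out
  '';
in
# `import` needs the file's contents, so evaluation stops here until the
# derivation above has been built.
(import generated).greeting
```
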
Any use of the builtin fetchers blocks evaluation, which is why they are
banned in nixpkgs:

* `builtins.fetchGit`
* `builtins.fetchTree`
* `builtins.fetchTarball`
* `builtins.fetchurl`
* etc.

#### Builtin fetchers

Use of the builtin fetchers is a surprisingly common cause of blocked
evaluation. Sometimes it is done by mistake, but other times it is done for
good reason, with unfortunate tradeoffs. I think it's reasonable to use a
builtin fetcher plus IFD to import libraries such as nixpkgs, since
fundamentally the thing needs to be fetched for evaluation to proceed, but
other cases are more questionable.

One reason one might use the builtin fetchers is that there is no way to use
a fixed-output derivation, such as is produced by `nixpkgs.fetchurl`, to
download a URL without knowing its hash ahead of time.

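The contrast looks roughly like this (the URL and hash are placeholders):

```nix
let
  pkgs = import <nixpkgs> { };
  url = "https://example.com/archive.tar.gz"; # placeholder URL
in
{
  # Builtin fetcher: no hash required, but the download runs during
  # evaluation and blocks it.
  impure = builtins.fetchTarball url;

  # Fixed-output derivation: the hash must be known up front, but the fetch
  # is an ordinary build that Nix can schedule, parallelize, and substitute
  # from a binary cache.
  pure = pkgs.fetchurl {
    inherit url;
    hash = "sha256-0000000000000000000000000000000000000000000="; # placeholder
  };
}
```
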
An example I've seen of this being done on purpose is wanting to avoid
requiring contributors to have Nix installed to update the hash of some
tarball, since Nix has its own bespoke algorithm for hashing tarball contents
that nobody has yet implemented outside Nix. So the maintainers used an impure
network fetch (only feasible with a builtin), at the cost of slow Nix build
times. I wound up writing [gridlock] to serve the use case of maintaining a
lockfile of Nix-compatible hashes without needing Nix, allowing the use of
fetchers based on fixed-output derivations.

[gridlock]: https://github.com/lf-/gridlock

The reason that impure fetching needs to be a builtin is that Nix has an
important purity rule for derivations: either the inputs are fixed and network
access is disallowed, or the output is fixed and network access is allowed. In
the Nix model as designed ([content-addressed store] aside), derivations are
identified only by what goes into them, not by their output.

Let's see why that is. Assume that network access were allowed in normal
builders. If this were the case, builds could have arbitrarily different
inputs without changing the store path. Such an impurity would force Nix
either to rebuild the derivation every time, since its actual inputs could
change arbitrarily at any moment ([implemented as impure
derivations!][impure-drvs]), or to follow the popular strategy of assuming
that it didn't change, making a "clean build" necessary to have any confidence
in the build output.

[impure-drvs]: https://nixos.org/manual/nix/stable/release-notes/rl-2.8.html

If one is doing a network fetch without knowing the hash, one needs to first
download the file to know its eventual store path. The fetcher therefore *has*
to block any evaluation depending on that store path while waiting for the
download. Since Nix doesn't know how to go evaluate something else in the
meantime, it blocks all evaluation.

That said, it is, in my opinion, a significant design flaw in the Nix
evaluator that it cannot queue up all the reachable derivations and build them
together, rather than stopping to build each one in order.

[content-addressed store]: https://github.com/NixOS/rfcs/blob/master/rfcs/0062-content-addressed-paths.md

## Story: `some-cabal-hashes`

I worked at a Haskell shop which made extensive use of Nix. One day I pulled
the latest sources and yet again ran into the old bug in their Nix setup where
Nix would go and serially build derivations called
"`all-cabal-hashes-component-*`" for several minutes before doing anything
useful. I decided enough was enough and went bug hunting.

I fixed it in a couple of afternoons by refactoring the use of
import-from-derivation to result in fewer switches between building and
evaluating, which I will expand on in a bit.

The result of this work is [available on GitHub][nix-lib].

[nix-lib]: https://github.com/lf-/nix-lib

### Background on nixpkgs Haskell

The nixpkgs Haskell infrastructure provides a [Stackage]-based package set
built on some Stackage Long-Term Support release, comprising package versions
that are all known to work together. The set is generated by a program called
[`hackage2nix`][hackage2nix], which runs `cabal2nix` against the entirety of
Hackage. What one gets in this package set is a snapshot of Hackage as of when
it was committed by the nixpkgs maintainers, with both the latest version and
the Stackage stable version of each package available (if the two differ).

`cabal2nix` is a program that generates metadata and input hashes, and wires
together a set of Nix derivations for Haskell packages automatically, using
Nix instead of Cabal to orchestrate the builds of packages' dependencies.

Often the package set is for an older compiler, and while there *are* some
per-compiler-version overrides in the nixpkgs Haskell distribution, these may
not switch to the version of a package you need, or the package set may just
plain have too old a version.

In that case, you can use [overlays], which can apply patches, override
sources, introduce new packages, and make basically any other arbitrary
modification to packages.

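As a sketch, such an overlay might look like the following; the package
names, versions, and patch file are all illustrative, not from any real
project:

```nix
# Illustrative nixpkgs overlay; package names and versions are made up.
final: prev: {
  haskellPackages = prev.haskellPackages.override {
    overrides = hself: hsuper: {
      # Pull a newer version from Hackage than the snapshot provides.
      lens = hself.callHackage "lens" "5.2.3" { };
      # Apply a local patch to an existing package.
      aeson = prev.haskell.lib.appendPatches hsuper.aeson [ ./aeson.patch ];
    };
  };
}
```
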
Each package build will be provisioned with a GHC package database containing
everything in its build inputs, using Nix exclusively for dependency version
resolution and avoiding having Cabal do it.

Since Nix orchestrates the build of the package dependency graph, caching of
dependencies for development shells, parallelism across package builds, remote
builds, and other useful properties simply fall out for free.

[Stackage]: https://www.stackage.org/
[hackage2nix]: https://github.com/NixOS/cabal2nix/tree/master/cabal2nix/hackage2nix
[overlays]: https://nixos.org/manual/nixpkgs/stable/#chap-overlays

### Where's the IFD?

nixpkgs Haskell provides a wonderfully useful function called `callCabal2nix`,
which executes `cabal2nix` to generate the Nix expression for some Haskell
source code at Nix evaluation time. Uh oh.

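A typical use, with a made-up package name pointing at the current directory,
looks something like:

```nix
{ pkgs ? import <nixpkgs> { } }:
# cabal2nix runs *at evaluation time* to turn the .cabal file in ./. into a
# Nix expression -- this is the IFD.
pkgs.haskellPackages.callCabal2nix "my-package" ./. { }
```
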
It also provides another wonderfully useful function called `callHackage`.
This is a very sweet function: it takes a package name and a version, and
gives you a Nix derivation for that package which knows how to download it
from Hackage.

Wait, how does that work, since you can't just download stuff for fun without
knowing its hash? Well, there's your problem.

"Figuring out hashes of stuff on Hackage" was solved by someone publishing a
comically large GitHub repo called `all-cabal-hashes` with hashes of all of
the tarballs on Hackage, and CI to keep it up to date. Using this repo, you
only have to deal with keeping one hash up to date: the hash of the version of
`all-cabal-hashes` you're using; the rest are just fetched from there.

#### Oh no

This repository has an obscene number of files in it, such that it takes
dozens of seconds to unpack. So it's simply not unpacked. Fetching a file out
of it involves invoking `tar` to selectively extract the relevant file from
the giant tarball of the repo.

That, in turn, takes around 7 seconds on a very fast computer, for each and
every package, in serial. Also, Nix checks the binary caches for each and
every one, further compounding the oopsie.

In my fixed version, it takes about 7 seconds *total* to gather all the
hashes, and I think this is instructive in how a Nix expression can be
structured to behave much better.

## Making IFD go fast

Nix is great at building big graphs of dependencies in parallel and caching
them. So what if we ask Nix to do exactly that?

What if you only demand one big derivation be built with IFD, then reuse it
across all the usage sites? You then only block evaluation once, and the
build inside can use parallelism appropriately.

### Details of `some-cabal-hashes`

My observation was that hot-cache builds with a bunch of IFD are fine; it's
refilling the cache that's horribly painful, since Nix spends a lot of time
doing pointless things in serial. What if we warmed up the cache by asking Nix
to build all that stuff in one shot? Then the rest of the IFD would hit a hot
cache.

The fix, `some-cabal-hashes` and `some-cabal2nix`, is to have Nix build *one*
derivation with IFD that contains everything that will be needed for further
evaluation; all the following imports then complete immediately.

Specifically, my build dependency graph now looks like:

```
                /-> cabal2nix-pkg1 -\
some-cabal2nix -+-> cabal2nix-pkg2 -+-> some-cabal-hashes -> all-cabal-hashes
                \-> cabal2nix-pkg3 -/
```

There are two notable things about this graph:

1. It is (approximately) the natural dependency graph of the build, assuming
   the Nix evaluator could keep going when it encounters IFD.

2. It allows Nix to parallelize all the `cabal2nix-*` derivations.

Then, all of the usage sites are `import "${some-cabal2nix}/pkg1"` or similar.
In this way, one derivation is built, letting Nix do what it's good at.
Another trick is that I made `some-cabal2nix` have no runtime dependencies by
*copying* all the resulting cabal files into the output directory instead of
symlinking them. Thus, the whole thing can be fetched from a cache server and
not built at all.

Figuring out what will be required is the other piece of the puzzle.
Initially, I wrote a drop-in solution: extract the information from the
overlays that were actually using `callHackage`, without invoking
`callHackage` and causing an IFD! The way I did this was by making a fake
version of the package set with stubs for `callHackage`, `callCabal2nix` and
such, then invoking the overlay with this fake package set as the value of
`super`. Once the overlays were evaluated with these stubs, I extracted the
versions requested by the overlay and used that to create the *real*
`callHackage`.

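The stub trick can be sketched like this; it is a simplification with made-up
package names, and the real code handles `callCabal2nix` and friends too:

```nix
# Simplified sketch of the stub trick; package names are made up.
let
  # Fake package set: the stub records which (name, version) pairs the
  # overlay asks for, instead of building anything.
  stubSuper = {
    callHackage = name: version: _args: { inherit name version; };
  };

  # An example user overlay that would normally trigger IFD.
  overlay = self: super: {
    pkg-a = super.callHackage "pkg-a" "1.2.3" { };
    pkg-b = super.callHackage "pkg-b" "0.1.0" { };
  };

  # Evaluating the overlay against the stubs yields the requested versions
  # without any IFD.
  requested = builtins.attrValues (overlay stubSuper stubSuper);
in
map (p: "${p.name}-${p.version}") requested
```
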
This kind of trick is also used by maintainer scripts to extract metadata from
overlays that nixpkgs Haskell uses for compiler-specific overrides.

However, nixpkgs has something nicer for overriding versions of Hackage
libraries, which the published version of `some-cabal-hashes` imitates
instead: `haskell.lib.packageSourceOverrides`. It constructs an overlay that
overrides package versions from a nice clean input of what's required.

The last and very important optimization I did was to fix the `tar`
invocation, which was quite slow. `tar` files have a linear structure that is
perfect for making `t`ape `ar`chives (hence the name of the tool) which can be
streamed to a tape: one file after the other, without any index. Thus, finding
a file in the tarball takes an amount of time proportional to the number of
files in the archive.

If you call `tar` once per file you need, it does `files-in-archive *
invocations` work. However, if you call `tar` *once* with the set of files
that you want, it can find them all in a single parse through the archive, so
the overall execution time is proportional to the archive size alone. I assume
that is what `tar` actually does, since it completes the entire extraction in
basically the same time with a long file list as with one file.

I also found that with the `--wildcards` option `tar` is extremely slow, and
it seems to get worse with more files to extract, so it's better to use exact
file paths. If you need those for a tarball from GitHub, there's a pesky
top-level directory that everything sits below, which needs to be included in
the file paths; it can be found by running `tar tf` and grabbing the first
line.

At extraction time, the extra directory name can be removed with
`--strip-components=1`, with `--one-top-level` used to place everything below
the desired directory.

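Put together, inside a derivation this might look like the following sketch;
the file paths here are illustrative, not the real ones:

```nix
# Illustrative sketch; the extracted file paths are made up.
{ runCommand, all-cabal-hashes }:
runCommand "some-cabal-hashes" { } ''
  # The GitHub tarball has one top-level directory; grab its name.
  toplevel=$(tar tf ${all-cabal-hashes} | head -n 1 | cut -d/ -f1)

  # One tar invocation with exact paths (no --wildcards): a single pass over
  # the archive, with the top-level directory stripped and the output placed
  # under $out.
  tar xf ${all-cabal-hashes} \
    --strip-components=1 --one-top-level="$out" \
    "$toplevel/pkg-a/1.2.3/pkg-a.json" \
    "$toplevel/pkg-b/0.1.0/pkg-b.json"
''
```
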
The source code for `some-cabal-hashes` is available here:
<https://github.com/lf-/nix-lib/blob/main/lib/some-cabal-hashes.nix>