drafts
This commit is contained in: parent 09cee8447c, commit d06e3ef9d9
3 changed files with 458 additions and 0 deletions
content/posts/extensions-to-the-nix-store.md (new file, 19 lines)
@@ -0,0 +1,19 @@
+++
date = "2022-10-30"
draft = true
path = "/blog/extensions-to-the-nix-store"
tags = []
title = "Extensions to the Nix store"
+++

This post is a summary/index of the proposed new features in the Nix store,
since I keep repeatedly struggling to find the documents for them.

# Content addressability

## "intensional store"

This was introduced in section 6 of [Eelco's PhD thesis][phd-thesis] as the
opposite approach to the "extensional store".

[phd-thesis]: https://edolstra.github.io/pubs/phd-thesis.pdf
content/posts/fixing-up-pdfs.md (new file, 194 lines)
@@ -0,0 +1,194 @@
+++
date = "2022-10-30"
draft = false
path = "/blog/workflow-pdfs"
tags = ["pdf"]
title = "My workflow: Managing and munging PDFs"
+++

Dealing with PDFs is something I do every day as someone working in software,
especially given that I tend toward both research and lower-level work where
papers and datasheets rule.

I think that the humble PDF is one of my favourite file formats besides text:
- You can give someone one and it will work
- Vectors work great in it
- Old files also just work
- Anything on the continuum from "digital-native output" to "a scan" can be
  represented and worked with nicely
- Search is typically pretty great once you have the right document: PDFs tend
  to be *large*, so CTRL-F can go very far

That said, "not being a text file" does sometimes make some tasks difficult,
metadata is often dubious, and I am usually drowning in a mountain of PDFs at
all times.

Most of the stuff described in this post can probably be done with Adobe
Acrobat, but it is not available for my computer. All of the tools described
below are packaged in the AUR or the main Arch repos, and they are not hard to
run on other operating systems.

# Fixing PDFs

There are several tools I regularly use for fixing up PDFs off the internet,
since it's unfortunately common that they come in with bad metadata or in
other problematic forms.

## Page numbering

PDF supports switching page numbering midway through the document, for
instance, if the front-matter is numbered in Roman numerals and the main
content is in Arabic numerals. Too often, large PDFs that run across my desk
don't have this set up properly, so the page numbers are annoyingly offset.

You can fix this with the "page numbering" feature of [jPDF Tweak][jpdf-tweak];
see the [jPDF Tweak manual](https://jpdftweak.sourceforge.net/manual/index.html)
for details.

## Document outline

PDF has a great feature called "document outline" or "bookmarks", which lets
you include the table of contents in searchable form that will show up in the
sidebar of good PDF viewers.

Unfortunately, many PDFs don't have these set up, which makes big documents a
hassle to work with as you have to jump back and forth between the table of
contents page and the rest of the document to find things. Fortunately, these
can be fixed.

There are three main tools that are useful for bookmarks hacking:
- [jPDF Tweak][jpdf-tweak], a multi-tool for doing various metadata hacking.
- [JPdfBookmarks], a powerful bookmarks-specific editor.
- [HandyOutliner], a small tool mostly useful to turn textual
  tables of contents into bookmarks.

[jpdf-tweak]: https://jpdftweak.sourceforge.net/
[HandyOutliner]: https://handyoutlinerfo.sourceforge.net/
[JPdfBookmarks]: https://sourceforge.net/projects/jpdfbookmarks/

### Hyperlinked table of contents

This is the most convenient case: the author put in a hyperlinked table of
contents, but somehow the tooling didn't create a document outline. If this
happens, you can get a perfect outline with almost no work.

Use the "Extract links from current page and add them as bookmarks" button in
[JPdfBookmarks] to deal with this. It will do as it says: just grab all the
hyperlinks and turn them directly into a document outline.

This is great since generally the hyperlinks will have correct page positions,
and so the outline will go to the right spot on the page in addition to going
to the right page.

### Textual table of contents

If you can cleanly get or create a table of contents such as the following:

```text
I. Introduction 1
1. Introduction 3
1.1. Software deployment 3
1.2. The state of the art 6
1.3. Motivation 13
1.4. The Nix deployment system 14
1.5. Contributions 14
1.6. Outline of this thesis 16
1.7. Notational conventions 17
```

Then the best bet is probably to use [HandyOutliner] to ingest that table of
contents as text and create bookmarks.

Often copy support in PDF tables of contents is pretty awful (and I can only
imagine it does horrors to screen readers), so it may need some serious amount
of cleanup in a text editor, as was the case for me while making an outline for
Eelco Dolstra's PhD thesis on Nix.

Another way this can be done is with the "Bookmarks" tab in [jPDF
Tweak][jpdf-tweak], importing a CSV you make.

Such a CSV looks like so:

```
1;O;Acknowledgements;3
1;O;Contents;5
1;O;I. Introduction;9
2;O;1. Introduction;11
3;O;1.1. Software deployment;11
3;O;1.2. The state of the art;14
```

The columns are:

1. Depth
2. Open ("O" if the level in the tree should start opened, else "")
3. The bookmark title
4. Page number. You can also put coordinates at the end if truly motivated.

## Encrypted PDFs

These are annoying. You can strip the encryption with `qpdf`:

```text
qpdf --decrypt input.pdf output.pdf
```

## Pages are in the wrong order/PDFs need merging

Imagine that you have been fighting a scanner to scan some document and the
software for it is bad and doesn't show previews large enough to make out the
page numbers. Exasperated, you just save the PDF knowing the pages are in the
wrong order and spread over multiple files.

For this, use [pdfarranger], which makes it easy to reorder pages as desired.

[pdfarranger]: https://github.com/pdfarranger/pdfarranger

# Having too many PDFs in my life

## Directory full of PDFs to search

Relatable problem! Use [pdfgrep]:

```text
pdfgrep -nri 'somequery' .
```

[pdfgrep]: https://pdfgrep.org/

## Too many bloody PDFs; overflowing disorganized directories

Academics have this problem and equally have solutions: Use [Zotero] or similar
research document management software to categorize and tag documents.

[Zotero]: https://www.zotero.org/

## Getting more of them

As I have student credentials, I can use the University library to get
documents. However, getting authenticated to publisher sites is annoying: I
often don't use the University library's search system since it can have poor
results, but the login pages on individual publisher sites are confusing as
well.

UBC uses OpenAthens for access control on publisher sites. OpenAthens has a
rather nice uniform redirector service that can log in and then redirect back
to the original site:
<https://docs.openathens.net/libraries/redirector-link-generator>

I made a little bookmarklet to authenticate to publisher sites:

```javascript
javascript:void(location.href='https://go.openathens.net/redirector/ubc.ca?url='+location.href)
```

It's also possible to use a well-known Web site to "acquire" papers, which is
often more convenient than the silly barriers that publishers use to extract
profits from keeping publicly-funded knowledge unfree (paper authors are paid
*nil* by journals), even with legitimate access. If one were to use such a
hypothetical Web site, it is easiest to feed it the DOIs of papers.

Also, paper authors probably have copies of their papers, and they are
typically happy to send them to you for free if you email them.
content/posts/speedy-ifd.md (new file, 245 lines)
@@ -0,0 +1,245 @@
+++
date = "2022-10-18"
draft = true
path = "/blog/speedy-ifd"
tags = ["haskell", "nix"]
title = "Speedy import-from-derivation in Nix?"
+++

Nix has a feature called "import from derivation", which is sometimes called
"such a nice foot gun" (grahamc, 2022). I can't argue with its
usefulness; it lets Nix do amazing things that can't be accomplished any other
way, and avoid pointlessly checking build products into git. However, it has a
dirty secret: it can *atrociously* slow down your builds.

The essence of this feature is that Nix can perform operations such as building
derivations whose results are used in the *evaluation stage*.
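
For concreteness, here is a minimal sketch of what IFD looks like (the file
contents and attribute names are illustrative, not from this post). The file
read by `import` is a build product, so Nix must build `generated` before
evaluation can continue:

```nix
let
  pkgs = import <nixpkgs> { };

  # A derivation whose output is a Nix expression written out at build time.
  generated = pkgs.runCommand "generated-expr" { } ''
    echo '{ answer = 42; }' > $out
  '';
in
# Importing the build output at evaluation time is import-from-derivation.
(import generated).answer
```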

## Nix build staging?

Nix, in its current implementation (there are efforts [such as tvix][tvix] to
change this), can do one of two things at a given time:

* Evaluate: run Nix expressions to create some derivations to build. This stage
  outputs `.drv` files, which can then be realised (built). Nix evaluation
  happens in serial (single threaded), and in a lazy fashion.
* Build: given some `.drv` files, fetch the result from a binary cache or build
  from scratch.

[tvix]: https://code.tvl.fyi/about/tvix

### How does import-from-derivation fit in?

Import-from-derivation (IFD for short) lets you do magical things: since Nix
derivations can do arbitrary computation in any language, Nix expressions or
other data can be generated by external programs that need to do pesky things
such as parsing cursed file formats like cabal files.

N.B. I've heard that someone wrote a PureScript compiler targeting the Nix
language, which was then used for [parsing a Cabal file to do cabal2nix's job][evil-cabal2nix]
entirely within Nix. Nothing is sacred.

[evil-cabal2nix]: https://github.com/cdepillabout/cabal2nixWithoutIFD

In order to achieve this, however, the evaluation stage can demand that builds
be run. In fact, such builds need to be run before evaluation can proceed! So
IFD serializes builds.

### What constitutes IFD?

The following is a nonexhaustive list of things constituting IFD:
* `builtins.readFile someDerivation`
* `import someDerivation`
* *Any use* of builtin fetchers:
  * `builtins.fetchGit`
  * `builtins.fetchTree`
  * `builtins.fetchTarball`
  * `builtins.fetchurl`
  * etc.

#### Builtin fetchers

Use of builtin fetchers is a surprisingly common IFD problem. Sometimes it is
done by mistake, but other times it is done for good reason, with unfortunate
tradeoffs. I think it's reasonable to use IFD to import libraries such as
nixpkgs, since fundamentally the thing needs to be fetched for evaluation to
proceed, but other cases are more dubious.

One reason one might use the builtin fetchers is that there is no way
(excepting calculated use of impure builders) to use a derivation to download a
URL without knowing the hash ahead of time.
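
To make the contrast concrete, here is a small sketch (the URL and hash are
placeholders, not from this post): a fixed-output fetch declares its hash up
front and is an ordinary derivation, while a hashless builtin fetch is impure
and has to happen during evaluation.

```nix
let
  pkgs = import <nixpkgs> { };
in
{
  # Fixed-output derivation: hash declared ahead of time, built or substituted
  # like any other derivation.
  pinned = pkgs.fetchurl {
    url = "https://example.org/release.tar.gz";
    hash = "sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=";
  };

  # Builtin fetch with no hash: impure, performed by the evaluator itself,
  # blocking evaluation while the download happens.
  unpinned = builtins.fetchTarball "https://example.org/release.tar.gz";
}
```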

An example I've seen of this being done on purpose is wanting to avoid
requiring contributors to have Nix installed to update the hash of some
tarball, since Nix has its own bespoke algorithm for hashing tarball contents
that nobody has yet implemented outside Nix. So the maintainers used an impure
network fetch (only feasible with a builtin) and instituted a curse on the
build times of Nix users.

The reason that impure fetching needs to be a builtin is that Nix has an
important purity rule for derivations: either the inputs are fixed and network
access is disallowed, or the output is fixed and network access is allowed. In
the Nix model as designed ([content-addressed store] aside), derivations are
identified only by what goes into them, not by their output.

Let's see why that is. Assume that network access is allowed in normal
builders. If the URL but no hash goes in *and* network access is available,
anything can come out without changing the store path. Such an impurity would
completely break the fantastic property that Nix has no such thing as a "clean
build", since builds never get dirtied to begin with.

Thus, if one is doing an impure network fetch, the resulting store path has to
depend on the content without knowing the hash ahead of time. Therefore the
fetch *has* to serialize all evaluation after it, since it affects the store
paths of everything downstream of it during evaluation.

That said, it is, in my opinion, a significant design flaw in the Nix evaluator
that it cannot queue up all the reachable derivations, rather than stopping and
building each one in order.

[content-addressed store]: https://github.com/NixOS/rfcs/blob/master/rfcs/0062-content-addressed-paths.md

## Stories

I work at a Haskell shop which makes extensive use of Nix. We had a bug where
Nix would go and serially build "`all-cabal-hashes-component-*`". For several
minutes.

This was what I would call a "very frustrating and expensive developer UX bug".
I fixed it in a couple of afternoons by refactoring the use of
import-from-derivation to result in fewer switches between building and
evaluating, which I will expand on in a bit.

## Background on nixpkgs Haskell

The way that the nixpkgs Haskell infrastructure works is that it has a
[Stackage]-based package set built from some Stackage Long-Term Support
release, comprising package versions that are all known to work together. The
set is generated via a program called [`hackage2nix`][hackage2nix], which runs
`cabal2nix` against the entirety of Hackage.

`cabal2nix` is a program that generates metadata and input hashes, and hooks up
dependencies declared in the `.cabal` files to Nix build inputs.

This set can then be overridden by [overlays] which can apply patches, override
sources, introduce new packages, and do basically any other arbitrary
modification.
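
As a rough illustration (the package names and the patch path here are made up,
not taken from this post), overriding the Haskell set can look like:

```nix
let
  pkgs = import <nixpkgs> { };

  myHaskellPackages = pkgs.haskellPackages.override {
    overrides = self: super: {
      # Apply a local patch to a package from the set.
      some-dependency =
        pkgs.haskell.lib.appendPatches super.some-dependency [ ./fix.patch ];

      # Turn off the test suite of a flaky package.
      other-dependency = pkgs.haskell.lib.dontCheck super.other-dependency;
    };
  };
in
myHaskellPackages
```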

At build time, the builder will be provisioned with a GHC package database
containing everything in the build inputs of the package, and it will build and
test the package.

In this way, each dependency is turned into a Nix derivation, so caching of
dependencies for development shells, parallelism across package builds, and
other useful properties simply fall out for free.

[Stackage]: https://www.stackage.org/
[hackage2nix]: https://github.com/NixOS/cabal2nix/tree/master/cabal2nix/hackage2nix
[overlays]: https://nixos.org/manual/nixpkgs/stable/#chap-overlays

## Where's the IFD?

nixpkgs Haskell provides a wonderfully useful function called `callCabal2nix`,
which executes `cabal2nix` to generate the Nix expression for some Haskell
source code at Nix evaluation time. Uh oh.

It also provides another wonderfully useful function called `callHackage`. This
is a very sweet function: it will grab a package of the specified version off
of Hackage, and call `cabal2nix` on it.
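
Typical usage looks roughly like this (the package name, version, and path are
illustrative, not from this post):

```nix
let
  pkgs = import <nixpkgs> { };
  hp = pkgs.haskellPackages;
in
{
  # Runs cabal2nix over local source at evaluation time: IFD.
  my-app = hp.callCabal2nix "my-app" ./my-app { };

  # Fetches the given version from Hackage and runs cabal2nix on it: also IFD.
  lens = hp.callHackage "lens" "5.2" { };
}
```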

Wait, how does that work, since you can't just download stuff for fun without
knowing its hash? Well, there's your problem.

"Figuring out hashes of stuff on Hackage" was solved by someone publishing a
comically large GitHub repo called `all-cabal-hashes` with hashes of all of the
tarballs on Hackage, plus CI to keep it up to date. Using this repo, you only
have to deal with keeping one hash up to date: the hash of the version of
`all-cabal-hashes` you're using; the rest are just fetched from there.

### Oh no

This repository has an obscene number of files in it, such that it takes dozens
of seconds to unpack. So it's simply not unpacked. Fetching a file out of it
involves invoking tar to selectively extract the relevant file from the giant
tarball of this repo.

That, in turn, takes around 7 seconds on the fastest MacBook available, for
each and every package, in serial. Also, Nix checks the binary caches for each
and every one, further compounding the fail.

I optimized it to take about 7 seconds, *total*. Although I *am* a witch, I
think that there is some generally applicable intuition to be derived from this
that can be used to make IFD go fast.

# Making IFD go fast

Nix is great at building big graphs of dependencies in parallel and caching
them. So what if we ask Nix to do that?

How can this be achieved?

What if you only demand one big derivation be built with IFD, then reuse it
across all the usage sites?

## Details of `some-cabal2nix`

My observation was that hot-cache builds with a bunch of IFD are fine; it's
refilling the cache that's horribly painful, since Nix spends a lot of time
doing pointless things in serial. What if we warmed up the cache by asking Nix
to build all that stuff in one shot? Then, the rest of the IFD would hit a hot
cache.

*The* major innovation in the fix, which I called `some-cabal-hashes`, is that
it builds *one* derivation with IFD that contains everything that will be
needed for further evaluation, so all the following imports hit that
already-built derivation.

Specifically, my build dependency graph now looks like:

```
                /- cabal2nix-pkg1 -\
some-cabal2nix -+- cabal2nix-pkg2 -+- some-cabal-hashes -> all-cabal-hashes
                \- cabal2nix-pkg3 -/
```

There are two notable things about this graph:

1. It is (approximately) the natural graph of the dependencies of the build,
   assuming that the Nix evaluator could keep going when it encounters IFD.

2. It allows Nix to naturally parallelize all the `cabal2nix-*` derivations.

Then, all of the usage sites are `import "${some-cabal2nix}/pkg1"` or similar.
In this way, one derivation is built, letting Nix do what it's good at. I also
did something clever: I made `some-cabal2nix` have no runtime dependencies by
*copying* all the resulting cabal files into the output directory. Thus, the
whole thing can be fetched from a cache server and not built at all.
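
The real sources aren't included in this draft, but the shape of such a
derivation is roughly the following sketch. `cabal2nixDrvFor` stands in for
whatever produces the per-package `cabal2nix` output, and `wanted` is the list
of `{ name, ... }` packages gathered from the overlays; both names are
hypothetical, not the actual implementation.

```nix
{ pkgs, cabal2nixDrvFor, wanted }:

# One derivation that copies every generated expression into its own output,
# so it has no runtime dependencies and can be substituted straight from a
# binary cache.
pkgs.runCommand "some-cabal2nix" { } ''
  mkdir -p $out
  ${pkgs.lib.concatMapStringsSep "\n"
      (p: "cp -r ${cabal2nixDrvFor p} $out/${p.name}")
      wanted}
''
```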

Acquiring the data to know what will be demanded by any IFD is the other piece
of the puzzle, of course. I extracted the data from the overlays by calling the
overlays with stubs first (to avoid a cyclic dependency), then evaluating for
real with a `callHackage` function that uses the `some-cabal2nix` built from
that information.

The last and very important optimization I did was to fix the `tar` invocation.
`tar` files have a linear structure that is perfect for making `t`ape
`ar`chives (hence the name of the tool) which can be streamed to a tape: one
file after the other, without any index. Thus, finding a file in the tarball
takes `O(n)` time, where `n` is the number of files in the archive.

If you call `tar` once per file for the `m` files you need, then you do
`O(n*m)` work. However, if you call `tar` *once* with the whole set of files
you want, it can do one pass through the archive with an `O(1)` membership
check against that set, so the overall time complexity is `O(n)`. I assume that
is what `tar` actually does, since it finishes the entire extraction in
basically the same time with a long file list as with one file.

Enough making myself sound like I am in an ivory tower with big O notation;
regardless, extracting with a file list yielded a major performance win.

I also found that if you use the `--wildcards` option, `tar` is extremely slow,
and it seems worse with more files to extract. Use exact file paths instead.
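
As a hedged sketch of what that can look like (the tarball argument, member
paths, and package list are illustrative, not the real ones), the extraction
can be a single `runCommand` that hands `tar` an explicit list of exact member
paths:

```nix
{ pkgs, allCabalHashesTarball }:

let
  # Exact member paths inside the tarball; no --wildcards.
  wantedFiles = [
    "all-cabal-hashes/lens/5.2/lens.cabal"
    "all-cabal-hashes/aeson/2.1.0.0/aeson.cabal"
  ];
in
# One tar invocation for all files: one pass over the archive instead of one
# pass per package.
pkgs.runCommand "some-cabal-hashes" { } ''
  mkdir -p $out
  tar -xzf ${allCabalHashesTarball} -C $out ${pkgs.lib.escapeShellArgs wantedFiles}
''
```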

FIXME: get permission to release the sources of that file