Guarantee Consistent Builds and Obsolete overrideScope

The Problem

A lot of effort goes into curated package sets like Stackage, but even so we can compile only ~50% of the packages available from Hackage. It appears to be the nature of the game: when lens 2.x comes out with a fundamentally new API, some packages will adopt the new version and others won’t. A consistent package set must choose which side of the fence it wants to live on. Either the packages that depend on lens 1.x compile or those that depend on 2.x compile, but not both.

Now, Nix is not limited to one particular version of lens — we can have both versions available at the same time. But it’s difficult to take advantage of that feature, because once you start mixing lens 1.x and 2.x in the same package set, you risk inconsistent builds, i.e. builds where some part of the dependency tree refers to lens 1.x and another part refers to lens 2.x. It’s a bad idea to try and link those two trees together into one executable; we are fortunate that Cabal detects this error during the configure phase and aborts the build!

We need a mechanism that can mix multiple package versions within a package set, but that also guarantees consistency for every single build. We hoped overrideScope would be that mechanism, but somehow it hasn’t quite lived up to the promise, mostly because it is hard to understand.

The Situation Today

Consider an executable package foobar that depends on the libraries foo and bar, each of which depends on lens. The corresponding definitions in hackage-packages.nix — stripped down to the relevant bits — look as follows:

"lens"     = ... lens version 1.x ...;
"lens_2_0" = ... lens version 2.x ...;

"foo" = callPackage
    ({ mkDerivation, lens }:
    mkDerivation {
      pname = "foo";
      libraryHaskellDepends = [lens];
    }) {};

"bar" = callPackage
    ({ mkDerivation, lens }:
    mkDerivation {
      pname = "bar";
      libraryHaskellDepends = [lens];
    }) {};

"foobar" = callPackage
    ({ mkDerivation, foo, bar }:
    mkDerivation {
      pname = "foobar"; [...]
      libraryHaskellDepends = [foo bar];
    }) {};

Let’s assume that foo won’t compile in that setup because it requires lens version 2.x. We can remedy that by adding an override to configuration-common.nix that says:

foo = super.foo.override { lens = self.lens_2_0; };

That change fixes the build of foo, but foobar remains broken, because now it pulls in both lens 1.x and 2.x simultaneously through its dependencies. If bar works only with lens 1.x, then there is nothing we can do: the version constraints conflict and we cannot compile foobar. If bar does support lens 2.x, however, then we can just switch it to the newer version with:

bar = super.bar.override { lens = self.lens_2_0; };

Now we can compile foobar! Unfortunately, that change may break other builds. There is a reason why lens 1.x is our default choice. If any other package depends on bar as well as lens 1.x (directly or indirectly), then it will no longer compile after that change.

We can avoid that side-effect by localizing the override to foobar:

foobar = super.foobar.override {
  bar = self.bar.override { lens = self.lens_2_0; };
};

That approach allows us to compile foobar, while still leaving the default version of bar at lens 1.x, like most of our packages require. Overriding build inputs this way works fine, and we have used this technique for many years to fix builds that require non-default versions to compile. The downside of these nested overrides is that they tend to become extremely complicated if the package that needs overriding sits deep in the dependency tree. The GHC 7.8.4 package set, for example, needed many such overrides because its default version of mtl was stuck at version 2.1.x while large parts of Hackage had moved on to mtl 2.2.x. Since mtl is a rather fundamental package, we had nested overrides 3-4 levels deep that were highly repetitious, too. It was a mess.

Haskell NG improved on that situation by adding overrideScope. That function changes the package set (“scope”) in which Nix evaluates a build expression. The override

foobar = super.foobar.overrideScope (self: super: { lens = self.lens_2_0; });

creates a new temporary package set, replaces lens with lens_2_0 in it, and then evaluates foobar. The callPackage function picks up the re-written lens attribute, which means that there’s no need to override that choice explicitly in all dependencies of foobar. One could say that overrideScope implements “deep overriding”, i.e. it applies an override to the given derivation as well as all sub-derivations that it refers to.
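The effect of re-evaluating a package set with one attribute replaced can be sketched in plain Nix. This is a toy model, not the nixpkgs implementation: packages are bare attribute sets, and the names fix, base, withScope, and lensOf are made up for illustration.

```nix
# Toy model of "deep overriding". Assumption: a package is just an attribute
# set with a name and a list of deps; no derivations are built.
let
  # Fixed point of a function, so that a package set can refer to itself.
  fix = f: let self = f self; in self;

  # Hypothetical package set: every package looks up its inputs in `self`.
  base = self: {
    lens     = { name = "lens-1.x"; };
    lens_2_0 = { name = "lens-2.x"; };
    foo      = { name = "foo";    deps = [ self.lens ]; };
    bar      = { name = "bar";    deps = [ self.lens ]; };
    foobar   = { name = "foobar"; deps = [ self.foo self.bar ]; };
  };

  # Build a new temporary package set with the overlay applied on top of base.
  withScope = overlay:
    fix (self: let super = base self; in super // overlay self super);

  scoped = withScope (self: super: { lens = self.lens_2_0; });

  # The lens version that each direct dependency of pkg was evaluated with.
  lensOf = pkg: map (d: (builtins.head d.deps).name) pkg.deps;
in {
  default = lensOf (fix base).foobar;
  scoped  = lensOf scoped.foobar;
}
```

Evaluating that file with nix-instantiate --eval --strict should report lens-1.x for both dependencies under default and lens-2.x for both under scoped: foo and bar were never overridden individually, they simply found a different lens in the scope they were evaluated in.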

Unfortunately, we lack a proper understanding of how expensive that technique is in terms of memory and performance. In the past, we’ve occasionally crashed Nix with this kind of stuff (keep in mind that the interpreter creates a whole new package set for every build that uses this mechanism), but when used sparingly, overrideScope seems to work okay.

In some cases, overrideScope won’t work at all, namely when confronted with builds that have explicitly passed arguments. For example, let’s say that lens 3.x comes out and we try to compile foobar like this:

foobar = super.foobar.overrideScope (self: super: { lens = self.lens_3_0; });

That build will fail, because we added an explicit override for foo earlier that committed the build to lens_2_0, and overrideScope will not affect that choice since that build input is not picked up with callPackage. So foobar will pull in both lens 2.x and 3.x despite the use of overrideScope.
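The same failure can be reproduced in a small, self-contained model. The sketch below assumes a simplified stand-in for callPackage (callPackageWith here is a hypothetical helper, not the nixpkgs function) that fills function arguments from the scope unless they are passed explicitly.

```nix
# Toy model: an explicitly passed argument escapes a scope override.
let
  fix = f: let self = f self; in self;

  # Simplified callPackage: take the arguments fn asks for from the scope,
  # then let explicitly passed args win over them.
  callPackageWith = scope: fn: args:
    fn (builtins.intersectAttrs (builtins.functionArgs fn) scope // args);

  base = self:
    let callPackage = callPackageWith self; in {
      lens     = { name = "lens-1.x"; };
      lens_2_0 = { name = "lens-2.x"; };
      lens_3_0 = { name = "lens-3.x"; };
      # foo is pinned explicitly, like the override in configuration-common.nix.
      foo = callPackage ({ lens }: { name = "foo"; deps = [ lens ]; })
              { lens = self.lens_2_0; };
      bar = callPackage ({ lens }: { name = "bar"; deps = [ lens ]; }) { };
      foobar = callPackage
        ({ foo, bar }: { name = "foobar"; deps = [ foo bar ]; }) { };
    };

  # Re-evaluate the set with lens replaced by lens_3_0.
  scoped = fix (self: let super = base self; in super // { lens = self.lens_3_0; });
in
  # The lens version seen by foo and by bar, in that order.
  map (d: (builtins.head d.deps).name) scoped.foobar.deps
```

Under these assumptions the result should be the list lens-2.x, lens-3.x: bar follows the scope, while foo keeps the version it was handed explicitly, which is exactly the inconsistency described above.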

Possible Improvements

We generate Haskell build expressions automatically with cabal2nix, and that tool knows the complete dependency tree for every package. So it would be possible to generate builds that expect as function arguments not just their immediate dependencies but the transitive closure of all dependencies. Build expressions would then call their direct dependencies, passing in appropriate versions of their respective dependencies, etc. For example:

"foobar" = callPackage
    ({ mkDerivation, foo, bar, many, other, inputs, of, lens }:
    let lens' = lens.override { inherit many other inputs of lens; };
        foo'  = foo.override { lens = lens'; };
        bar'  = bar.override { lens = lens'; };
    in
    mkDerivation {
      pname = "foobar"; [...]
      libraryHaskellDepends = [foo' bar'];
    }) {};

Now, foobar expects every single package that occurs anywhere inside of its dependency tree as an argument, and it constructs the dependency tree using those arguments. So the build must be consistent. It’s impossible for foobar to refer to two incompatible versions of lens, because its inputs always use the same version.

Consequently,

foobar.override { mtl = self.mtl_2_4_0; }

gives us a version of foobar that has its entire dependency tree built with mtl 2.4.x. We could even get rid of the override altogether if we adopt the suggestions from “Use Function Application To Escape Override Hell” and remove callPackage from hackage-packages.nix. We’d define all builds as straight functions

"foobar" =
    { mkDerivation, foo, bar, many, other, inputs, of, lens }:
    let lens' = lens { inherit many other inputs of lens; };
        foo'  = foo { lens = lens'; };
        bar'  = bar { lens = lens'; };
    in
    mkDerivation {
      pname = "foobar"; [...]
      libraryHaskellDepends = [foo' bar'];
    };

and invoke them from inside of a package set with:

callPackage foobar { mtl = self.mtl_2_4_0; }

This would give us a guarantee for consistent builds without any overrides.
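A minimal sketch of why the function-based layout cannot produce a mixed tree, again with packages reduced to plain attribute sets. The names lens1, lens2, and lensOf are hypothetical, and the long argument lists from the example above are trimmed to the single lens input.

```nix
# Toy model: builds are plain functions that receive their whole closure.
let
  lens1 = { name = "lens-1.x"; };
  lens2 = { name = "lens-2.x"; };

  foo = { lens }: { name = "foo"; deps = [ lens ]; };
  bar = { lens }: { name = "bar"; deps = [ lens ]; };

  # foobar receives lens itself and hands the same value to foo and bar.
  foobar = { foo, bar, lens }:
    let foo' = foo { inherit lens; };
        bar' = bar { inherit lens; };
    in { name = "foobar"; deps = [ foo' bar' ]; };

  # The lens version that each direct dependency of pkg was built with.
  lensOf = pkg: map (d: (builtins.head d.deps).name) pkg.deps;
in {
  withLens1 = lensOf (foobar { inherit foo bar; lens = lens1; });
  withLens2 = lensOf (foobar { inherit foo bar; lens = lens2; });
}
```

Whichever lens is passed in, both entries of the resulting list name the same version, because foobar has only one lens value to distribute. The price is that every build expression has to list its full transitive closure as arguments, which is feasible here only because cabal2nix generates those expressions.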