A faster dockerTools.buildImage prototype

dockerTools.buildImage is the nixpkgs function to create OCI images. From a Nix expression, it creates an OCI image archive (which is basically a tar of layers, where each layer is a tar’ed file tree). Once this OCI image archive has been written to the Nix store, it can be loaded into the Docker daemon or pushed to a Docker registry. Writing container images with the dockerTools.buildImage function is pretty convenient, but it still has several performance issues:

  1. to use a container image built by Nix, we first need to write the whole image (with all its layers) in the Nix store;

  2. layers in the OCI archive image contain data already present in the Nix store (this consumes Nix store disk space);

  3. the dockerTools.buildImage build result is a tar containing all layers: even if the image is composed of several layers, a change in one layer leads to a full new OCI archive in the Nix store. Writing a new OCI archive takes time and consumes Nix store disk space.

Another dockerTools.buildImage drawback is its implementation: more than 500 lines of unmaintainable Bash magic :/

In this blog post, we discuss (through a prototype) how all of these points could be improved.

Stop creating layer tarballs

Instead of building layers as tarballs, the idea is to build an artifact which describes a container layer with a list of Nix store paths. So, Nix would build a JSON file referencing store paths (with some metadata). At runtime, instead of pushing prebuilt layer tarballs to a container registry, we would create a tar stream from this JSON file and push that stream to the registry.

So, let’s see what this container image JSON file could look like. Suppose we want to build an image containing an application printing a message:

let
  application = pkgs.writeScript "conversation" ''
    ${pkgs.hello}/bin/hello 
  '';
in  
  containerTools.buildImage {
    config = {
      entrypoint = ["${pkgs.bash}/bin/bash" application];
    };
  };

The function containerTools.buildImage is quite similar to the dockerTools.buildImage function, except that it produces a different output. Building the above Nix expression results in the following file:

{
	"config": {
		"config": {
			"Entrypoint": [
				"/nix/store/a54wrar1jym1d8yvlijq0l2gghmy8szz-bash-5.1-p12/bin/bash",
				"/nix/store/2m0b10fkgscp4q4i7w1kl1f5pnc25xlk-conversation"
			]
		}
	},
	"layers": [
		{
			"digest": "sha256:1797c2e4cf41923f5741ee0c74575720d3232ae5f443ee6646a460e48d5703ad",
			"paths": [
				"/nix/store/2m0b10fkgscp4q4i7w1kl1f5pnc25xlk-conversation",
				"/nix/store/563528481rvhc5kxwipjmg6rqrl95mdx-glibc-2.33-56",
				"/nix/store/a54wrar1jym1d8yvlijq0l2gghmy8szz-bash-5.1-p12",
				"/nix/store/xcp9cav49dmsjbwdjlmkjxj10gkpx553-hello-2.10"
			]
		}
	]
}

Now, we need to load or push a container from this file.

Skopeo is the Swiss Army knife of container image manipulation. It can copy an image from one format to another. For instance, Skopeo can copy an OCI tarball image to a container registry.

We could teach Skopeo to manipulate an image described by our JSON file. Skopeo would then be able to create layers from Nix store paths and send them on the fly to its supported destinations (such as a container registry). Fortunately, Skopeo can be easily extended and a prototype can be implemented in about 250 lines of Go code.

From the JSON file, Skopeo could extract the image configuration and build the layer sha256:1797c... by “just” tarring these 4 store paths.

Note that the JSON file contains the layer digest. At build time, containerTools.buildImage computes the digest of the layer by hashing the tar of its store paths (it doesn’t store the tar, only its digest). This allows Skopeo to copy only the layers that are missing at the destination, without having to compute the layer digests at runtime.
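To make the “digest without the tar” idea concrete, here is a minimal Python sketch. It is not the prototype’s Go code: the names HashingSink and layer_digest are hypothetical, and resetting timestamps and ownership is our assumption about how to make the tar deterministic. The tar stream is fed straight into a SHA-256 hash and never written to disk.

```python
import hashlib
import tarfile


class HashingSink:
    """Write-only file-like object: hashes the bytes it receives and drops them."""

    def __init__(self):
        self._hash = hashlib.sha256()

    def write(self, data):
        self._hash.update(data)
        return len(data)

    def digest(self):
        return "sha256:" + self._hash.hexdigest()


def _normalize(info):
    # Assumption: fixed metadata, so identical file trees give identical tars.
    info.mtime = 0
    info.uid = info.gid = 0
    info.uname = info.gname = ""
    return info


def layer_digest(paths):
    """Return the digest of the tar of `paths` without storing the tar."""
    sink = HashingSink()
    # "w|" writes an uncompressed tar as a forward-only stream.
    with tarfile.open(fileobj=sink, mode="w|", format=tarfile.GNU_FORMAT) as tar:
        for path in sorted(paths):
            # Directories are added recursively, in sorted order.
            tar.add(path, filter=_normalize)
    return sink.digest()
```

Calling layer_digest on the four store paths of the example would give a string of the same shape as the digest field above, although not necessarily the same value, since the exact tar layout used by the prototype is not specified here.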

The container image described by the JSON file can then be pushed to a registry by our patched Skopeo:

skopeo copy nix:./result  docker://localhost:5000/conversation:latest
Getting image source signatures
Copying blob 1797c2e4cf41 done
Copying config 4e873f8b32 done
Writing manifest to image destination
Storing signatures

We pushed an image to a registry without having to create tarballs in the Nix store. This improves container build speed because we only write a small JSON file to the Nix store. This also reduces the Nix store disk space usage. This addresses points 1 and 2 of our drawback list.

Only rebuild new layers

Actually, the containerTools.buildImage function builds two artifacts: the image JSON file and a layer JSON file. The layer JSON file is built by the containerTools.buildLayer function. This function takes a list of store paths, generates the digest of the tar of these store paths, and builds a layer JSON file that looks like:

[
 {
  "digest": "sha256:1797c2e4cf41923f5741ee0c74575720d3232ae5f443ee6646a460e48d5703ad",
  "paths": [
   "/nix/store/2m0b10fkgscp4q4i7w1kl1f5pnc25xlk-conversation",
   "/nix/store/563528481rvhc5kxwipjmg6rqrl95mdx-glibc-2.33-56",
   "/nix/store/a54wrar1jym1d8yvlijq0l2gghmy8szz-bash-5.1-p12",
   "/nix/store/xcp9cav49dmsjbwdjlmkjxj10gkpx553-hello-2.10"
  ]
 }
]

Since the layer is built by its own derivation, we can update the image configuration without having to rebuild this layer. This is not possible with dockerTools.buildImage, which rebuilds a new OCI image archive whenever the configuration is updated.
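The separation between layer and configuration can be illustrated with a small Python sketch. The function build_image is a hypothetical stand-in for the image assembly step (it is not the prototype’s code), the field names mirror the JSON shown in this post, and the Env value is made up for the example. The point is that the layer description is an input: changing the configuration yields a new image file while the layer list stays untouched.

```python
import json


def build_image(config, layers):
    """Combine a runtime config with already-built layer descriptions."""
    return {
        "config": {
            "config": config,
            # One diff_id per layer, taken from the layer's recorded digest.
            "rootfs": {"diff_ids": [layer["digest"] for layer in layers]},
        },
        "layers": layers,
    }


# Layer description, built once by its own (separate) derivation.
layers = [
    {
        "digest": "sha256:1797c2e4cf41923f5741ee0c74575720d3232ae5f443ee6646a460e48d5703ad",
        "paths": [
            "/nix/store/2m0b10fkgscp4q4i7w1kl1f5pnc25xlk-conversation",
            "/nix/store/563528481rvhc5kxwipjmg6rqrl95mdx-glibc-2.33-56",
            "/nix/store/a54wrar1jym1d8yvlijq0l2gghmy8szz-bash-5.1-p12",
            "/nix/store/xcp9cav49dmsjbwdjlmkjxj10gkpx553-hello-2.10",
        ],
    }
]

entrypoint = [
    "/nix/store/a54wrar1jym1d8yvlijq0l2gghmy8szz-bash-5.1-p12/bin/bash",
    "/nix/store/2m0b10fkgscp4q4i7w1kl1f5pnc25xlk-conversation",
]

first = build_image({"Entrypoint": entrypoint}, layers)
# Only the configuration changes; the layer list is reused as-is.
second = build_image({"Entrypoint": entrypoint, "Env": ["GREETING=hi"]}, layers)

print(json.dumps(second, indent=1))
```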

Isolate application dependencies in dedicated layers

Suppose we are working on our application codebase. We generally update the code of our application but more rarely update its dependencies. So, it would be convenient to be able to isolate application dependencies in their own layer, and the application codebase in another layer. This would allow us to only rebuild and push the application codebase layer, instead of all layers.

So, we explicitly specify the dependencies (bash and hello) of our application in the dependencyLayers parameter of the containerTools.buildImage function.

containerTools.buildImage {
  config = {
    entrypoint = ["${pkgs.bash}/bin/bash" application];
  };
  dependencyLayers = [
    (containerTools.buildLayer {
      contents = [pkgs.bash pkgs.hello]; 
     })
  ];
};

The JSON file now looks like:

{
 "config": {
  "config": {
  	"Entrypoint": [
  		"/nix/store/a54wrar1jym1d8yvlijq0l2gghmy8szz-bash-5.1-p12/bin/bash",
  		"/nix/store/2m0b10fkgscp4q4i7w1kl1f5pnc25xlk-conversation"
  	]
  },
  "rootfs": {
   "diff_ids": [
     "sha256:590866221f2617ddba00afb85908f4a5e6b822123e6b990a8abb68848ef1e8c7",
     "sha256:3391954db2f3c384d4cae1b1da9fe7e488b07bdb1a536187e955e2005b8d5c5c"
   ]
  }
 },
 "layers": [
  {
   "digest": "sha256:590866221f2617ddba00afb85908f4a5e6b822123e6b990a8abb68848ef1e8c7",
   "paths": [
    "/nix/store/2m0b10fkgscp4q4i7w1kl1f5pnc25xlk-conversation"
   ]
  },
  {
   "digest": "sha256:3391954db2f3c384d4cae1b1da9fe7e488b07bdb1a536187e955e2005b8d5c5c",
   "paths": [
    "/nix/store/563528481rvhc5kxwipjmg6rqrl95mdx-glibc-2.33-56",
    "/nix/store/a54wrar1jym1d8yvlijq0l2gghmy8szz-bash-5.1-p12",
    "/nix/store/xcp9cav49dmsjbwdjlmkjxj10gkpx553-hello-2.10"
   ]
  }
 ]
}

The resulting image contains two layers: one with the closure of bash and hello and the other one with our application script. When we build the application layer, we only add store paths that are not already present in a dependency layer, so when we update our application codebase, we only need to rebuild the application layer derivation.
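The path selection for the application layer boils down to a set difference. Here is a minimal Python sketch of that step, using the store paths from the example above. The function name application_layer_paths is hypothetical and this is only an illustration of the idea, not the prototype’s implementation.

```python
def application_layer_paths(closure, dependency_layers):
    """Store paths of `closure` that no dependency layer already provides."""
    provided = set()
    for layer in dependency_layers:
        provided.update(layer["paths"])
    return sorted(path for path in set(closure) if path not in provided)


# Full closure of the application (same paths as the single-layer image).
closure = [
    "/nix/store/2m0b10fkgscp4q4i7w1kl1f5pnc25xlk-conversation",
    "/nix/store/563528481rvhc5kxwipjmg6rqrl95mdx-glibc-2.33-56",
    "/nix/store/a54wrar1jym1d8yvlijq0l2gghmy8szz-bash-5.1-p12",
    "/nix/store/xcp9cav49dmsjbwdjlmkjxj10gkpx553-hello-2.10",
]

# The dependency layer holding the closure of bash and hello.
dependency_layer = {
    "digest": "sha256:3391954db2f3c384d4cae1b1da9fe7e488b07bdb1a536187e955e2005b8d5c5c",
    "paths": [
        "/nix/store/563528481rvhc5kxwipjmg6rqrl95mdx-glibc-2.33-56",
        "/nix/store/a54wrar1jym1d8yvlijq0l2gghmy8szz-bash-5.1-p12",
        "/nix/store/xcp9cav49dmsjbwdjlmkjxj10gkpx553-hello-2.10",
    ],
}

app_paths = application_layer_paths(closure, [dependency_layer])
print(app_paths)
```

Only the conversation script remains, which matches the first layer of the JSON file above.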

We can now work on our application codebase without having to rebuild the whole image. Nix only rebuilds the derivation of updated layers and Skopeo only pushes new layers.

The dependencyLayers parameter addresses point 3.

Some notes about reproducibility

This method only works if the store paths used in layers are bit-for-bit reproducible. The image JSON file contains the digest of the tar of the store paths used by each layer. It is possible to build some store paths locally while getting the image JSON file from a Nix binary cache. In this situation, the digests could differ. When Skopeo pushes the image, it will fail because the digest it computed (from the local store paths) doesn’t correspond to the digest specified in the image JSON file (computed on another machine).
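The failure mode can be sketched in a few lines of Python. Everything here is hypothetical: verify_layer and DigestMismatch are illustrative names, and fake_digest is a stand-in that hashes only the path names, where a real check would hash the tar of the store path contents.

```python
import hashlib


class DigestMismatch(Exception):
    """The locally computed layer digest differs from the recorded one."""


def verify_layer(layer, compute_digest):
    """Recompute the layer digest locally and compare it with the recorded one."""
    actual = compute_digest(layer["paths"])
    if actual != layer["digest"]:
        raise DigestMismatch(
            f"recorded {layer['digest']} but computed {actual} from local store paths"
        )
    return actual


def fake_digest(paths):
    # Stand-in for hashing the tar of the store paths: hashes the path names only.
    joined = "\n".join(sorted(paths)).encode()
    return "sha256:" + hashlib.sha256(joined).hexdigest()


paths = ["/nix/store/xcp9cav49dmsjbwdjlmkjxj10gkpx553-hello-2.10"]
good_layer = {"digest": fake_digest(paths), "paths": paths}
# Simulates a JSON file whose digest was computed from different contents.
stale_layer = {"digest": "sha256:" + "0" * 64, "paths": paths}

verify_layer(good_layer, fake_digest)
try:
    verify_layer(stale_layer, fake_digest)
except DigestMismatch as error:
    print("push would fail:", error)
```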

To address this issue, we could add a nonReproducible option to the containerTools.buildLayer function. Instead of only storing the digest, we would also store the tar. Note that in practice, a large part of nixpkgs is bit-for-bit reproducible, so this would rarely be needed.

Current implementation

containerTools.buildImage and containerTools.buildLayer use a Go binary to create the JSON files. A patch to Skopeo is required to add support for the Nix container image JSON file.

The material used in this post can be found in the following branches:

To build an image with the containerTools.buildImage function, check out the nixpkgs container-tools branch:

nix-build -A containerTools.example.image -o image.json
$(nix-build -A skopeo)/bin/skopeo copy nix:./image.json docker-daemon:conversation:latest

Conclusion

We have seen how containerTools.buildImage could make container image builds much more efficient. Instead of having to rebuild a full image on each change, we just need to build a layer, which only contains our application source code.