Container Image Internals, Part 1: docker pull | Dan Lorenc's Blog

The rise in popularity of Docker has led to a proliferation of container image usage in the cloud and devops space. The Docker toolchain makes working with container images easy, allowing users to build, distribute and run these images in just a handful of user-friendly commands.

This series attempts to shed some light on the image format used by Docker (and the new Open Container Initiative!) images. I will explain how images are constructed, and I’ll work through some basic examples of building images from scratch without using the Docker toolchain.

Prereqs

I tested the following commands on macOS Sierra. If you plan on following along, you’ll need:

  • A text editor you’re comfortable with.
  • The amazing ‘jq’ tool for formatting and manipulating JSON data, which can be installed with Homebrew.

I used the Google Container Registry for this example, so if you want to push or work with private images in later parts of this series you’ll also need to install the Google Cloud SDK to help with authentication.

Part 1: Manifests and Blobs

Have you ever pulled a Docker container using the ‘docker pull’ command? After reading this post, you should understand exactly how this command works. In the first part of this series, we’re going to use just bash and some standard command line tooling to pull an image from a container registry to our laptop, where we can unpack it and inspect the contents.

What’s in a Container Image?

A container image combines two main concepts - a root filesystem and configuration.

A root filesystem is a description of exactly what files should be present inside a running container, and in what places. This is the part of the image that describes what you can see after running a command like: docker run -it $container bash and poking around with cd and ls.

Docker and the new Open Container Initiative provide image specifications that use a concept of filesystem layers to build and store the root filesystem. Layers primarily serve to cut up a single large root filesystem into a set of smaller chunks that can be shared across images. If you’re working with a lot of images, it’s important to make sure you share as many layers as possible so pulls and pushes are fast, and so your images take up less disk space. We’ll talk more about these layers later.

The configuration section of an image is everything else needed to run a container image. This includes things like environment variables, information about the target architecture, and metadata about how the image was built for viewing later.

The Manifest

Docker and the OCI specification use a container manifest to describe the root filesystem and configuration of an image. This manifest is the canonical definition of our image: uploading a manifest to the registry creates an image, and deleting the manifest deletes the image.

If the word manifest sounds complicated, don’t worry! Here it’s just a fancier word for a JSON file. Here’s what a simple one for an image with one layer might look like:

{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
  "config": {
    "mediaType": "application/vnd.docker.container.image.v1+json",
    "size": 190,
    "digest": "sha256:efe184abb97e76d7d900b2e97171cc20830b6b1b0e0fe504a4ee7097a6b5c91b"
  },
  "layers": [
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 170,
      "digest": "sha256:9964c16915b8956cb01eb77028b1fd1976287b5ec87cc1663844a0bd32933a47"
    }
  ]
}

The schemaVersion and mediaType fields are just boilerplate explaining that this JSON file happens to represent a Docker image manifest. The config field points to a Docker runtime configuration file that contains instructions for how to run the image. We’ll explain this field in later parts of this series.

The layers field contains a list of the layers used to build our root filesystem. This is the field we’re going to spend most of our time with in the rest of this article.
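As a small sketch of working with these fields, the two jq filters below list the layer digests and add up the layer sizes of a manifest read from standard input. The function names layer_digests and total_layer_bytes are made up for this illustration, and sample_manifest is the one-layer example from above trimmed to the layers field.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (the names are illustrative, not part of any tool).
# Both read a schema version 2 manifest on standard input.

# Print one layer digest per line, in manifest order (base layer first).
layer_digests() {
  jq -r '.layers[].digest'
}

# Print the combined size in bytes of all layer blobs as stored in the registry.
total_layer_bytes() {
  jq '[.layers[].size] | add'
}

# The example manifest from above, reduced to the part these helpers use.
sample_manifest='{
  "schemaVersion": 2,
  "layers": [
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 170,
      "digest": "sha256:9964c16915b8956cb01eb77028b1fd1976287b5ec87cc1663844a0bd32933a47"
    }
  ]
}'
```

Piping sample_manifest into total_layer_bytes prints 170, the size of its single layer. For a real image the same filter gives a rough idea of how much a full pull has to download.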

Want to see the manifest for an image you’ve built and pushed to a registry? You can use curl and the registry API to download and view this manifest. Here’s how to get the manifest for the official public Debian 8 image hosted on the Google Container Registry at l.gcr.io/google/debian8:latest:

curl -L \
  -H 'Accept: application/vnd.docker.distribution.manifest.v2+json' \
  l.gcr.io/v2/google/debian8/manifests/latest | jq .

The -L flag tells curl to follow HTTP redirects. Many Docker registries use redirects to Content Delivery Networks or storage systems like Amazon S3 or Google Cloud Storage for increased performance and reliability, so we’ll need it on most of our curl commands.

It’s also important not to forget the -H flag. This tells curl to pass a header to the registry indicating which schema version we would like to receive the manifest in. For this series we’ll be using the schema version 2, which is much simpler than schema version 1.

The l.gcr.io portion of our image name becomes the hostname of our request. The rest of the path describes the name of the image (google/debian8). Then we’re describing what we’re looking for (manifests), and which one to get (latest).
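The mapping from image name to request URL can be sketched as a small shell function. The name manifest_url is hypothetical, and the sketch makes a few assumptions: the reference always starts with a registry hostname, a missing tag means latest, digest references (name@sha256:...) are not handled, and the URL is given an explicit https:// prefix, which the curl commands in this post leave off.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn "host/repository:tag" into a manifest URL.
manifest_url() {
  local ref=$1
  local host=${ref%%/*}    # text before the first slash, e.g. l.gcr.io
  local rest=${ref#*/}     # everything after it, e.g. google/debian8:latest
  local repo=${rest%%:*}   # image name, e.g. google/debian8
  local tag=latest         # assumed default when no tag is given
  case $rest in
    *:*) tag=${rest##*:} ;;
  esac
  printf 'https://%s/v2/%s/manifests/%s\n' "$host" "$repo" "$tag"
}

manifest_url l.gcr.io/google/debian8:latest
# prints https://l.gcr.io/v2/google/debian8/manifests/latest
```

Splitting off the hostname first keeps a registry port such as localhost:5000 from being mistaken for a tag.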

This is a public image, so you don’t need to pass any authentication information. If you want to try this on one of your own images, you’ll need to do a little bit more work, which we’ll explain once we get to building and pushing images later.

Blobs and Layers

The next important concept of the Registry API is called blobs. As you can see above, the manifest does not contain the actual layer objects, it contains references to them:

"layers": [
  {
    "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
    "size": 170,
    "digest": "sha256:9964c16915b8956cb01eb77028b1fd1976287b5ec87cc1663844a0bd32933a47"
  }
]

These are stored as ‘blobs’ by the registry, and can also be fetched with curl.

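Like ‘manifest’, ‘blob’ sounds more mysterious than it is: a blob is just an opaque chunk of bytes that the registry stores and serves by its digest.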

The mediaType field in the layers list of the manifest gives us a hint about what these blobs are. In this case, the layers are just gzipped tarballs:

"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip"

To pull an image and assemble it on our local machine, we first have to get all these blobs. To make finding, caching and storing blobs easier, Docker uses a concept called ‘content addressability’: blobs are named according to their content, in this case by a sha256 hash of it. Two identical blobs therefore always have the same name, so both you and the registry can cache them easily, and anyone who downloads a blob can check it against its name.
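Because the name is a hash of the content, a downloaded blob can be checked against the digest it was requested by. The sketch below shows one way to do that. The helper names blob_digest and verify_blob are hypothetical, and the fallback from sha256sum to shasum is an assumption about which tool is installed (GNU coreutils on most Linux systems, shasum on macOS).

```shell
#!/usr/bin/env bash
# Hypothetical helpers for checking a blob against its digest.

# Print the digest of a file in the registry's "sha256:<hex>" form.
blob_digest() {
  local hex
  if command -v sha256sum >/dev/null 2>&1; then
    hex=$(sha256sum < "$1")
  else
    hex=$(shasum -a 256 < "$1")
  fi
  # Both tools print "<hex>  -" for standard input; keep only the hash.
  printf 'sha256:%s\n' "${hex%% *}"
}

# Succeed only if the file's content hashes to the expected digest.
verify_blob() {
  [ "$(blob_digest "$1")" = "$2" ]
}
```

With a layer saved as $blob.tar.gz, as in the commands that follow, the check would be verify_blob "$blob.tar.gz" "$blob". The hash is taken over the compressed file exactly as the registry served it, not over the unpacked contents.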

Layers are ordered with the base layer at the beginning of the list in the manifest. To pull that layer, you can use this curl command to download the blob, and use content addressability to store it in a file named according to its digest.

# First let's make a directory to store our image in.
cd $HOME
mkdir google-debian8
cd google-debian8

# Layers are stored from the bottom up. Now download the first layer:
blob=sha256:4e790dc65cf5e5fbc09c7c1ac674c93fc0c381569f4e7ed05bd33841bf2b072d
curl -L "l.gcr.io/v2/google/debian8/blobs/$blob" > "$blob.tar.gz"

Let’s see what’s inside this layer:

tar -tf $blob.tar.gz | less

This outputs a list of all the filenames in our base layer. As you can see, there are quite a few files in here since this layer represents a base Debian distribution. To build a base layer like this yourself you can use Debian’s debootstrap.

To grab all the layers for an entire image we just need to put that last command into a for loop, iterating over the layers field of our manifest:

# Switch these to pull a different image or tag
image=google/debian8
tag=latest

# Download the manifest again, and store it in a variable
manifest=$(curl -L \
           -H 'Accept: application/vnd.docker.distribution.manifest.v2+json' \
           l.gcr.io/v2/$image/manifests/$tag)

# Use jq to get a list of the layers we can iterate over
layers=$(echo "$manifest" | jq -r '.layers[].digest')
for layer in $layers; do
  echo "Pulling layer: $layer"
  # Use content addressability to avoid pulling layers we already have!
  if [[ -e "${layer}.tar.gz" ]]; then
    echo "We already have layer $layer. Skipping download."
  else
    echo "Downloading $layer"
    curl -L "l.gcr.io/v2/$image/blobs/$layer" > "$layer.tar.gz"
  fi
done

This script also adds a check to avoid re-downloading layers that we already have. Content-addressability! To test this out, run the same script again and see that we don’t download anything.

If you’re not on a Linux machine you won’t be able to actually run any of the binaries in this image directly, but you can still unpack it and look around. To assemble an image from layers, it’s just a matter of unpacking the gzipped tarballs in the correct order:

# Start by making a directory to unpack into
mkdir unpacked

# Unpack in a loop
for layer in $layers; do
  tar -zxf "${layer}.tar.gz" -C unpacked/
done

Now you can see your image contents:

$ ls unpacked
bin     boot    dev     etc     home    lib     lib64   media   mnt     opt     proc    root    run     sbin    srv     sys     tmp     usr     var
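One caveat with the unpack loop above: the archive names contain a colon (sha256:...), and GNU tar, the default on most Linux systems, treats a name of the form host:path as an archive on a remote machine unless it is told otherwise. The sketch below sidesteps this by feeding each archive to tar on standard input, which works with both GNU tar and the tar shipped with macOS. The function name unpack_layers is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: unpack layer archives into a directory in the order
# given (base layer first), so files from later layers replace files from
# earlier ones.
# Usage: unpack_layers <destination> <digest>...
unpack_layers() {
  local dest=$1 layer
  shift
  mkdir -p "$dest"
  for layer in "$@"; do
    # Reading from standard input keeps tar from interpreting the colon in
    # "sha256:<hex>.tar.gz" as a remote host name.
    tar -xzf - -C "$dest" < "$layer.tar.gz" || return 1
  done
}
```

With the variables from the download script this would be called as unpack_layers unpacked $layers, leaving $layers unquoted so that the shell splits it into one argument per digest.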

Summary

Congratulations, you’ve built a content-addressable store and used it to pull and unpack a Docker image from a registry in just a few lines of bash! Hopefully the image format is a lot less mysterious now.

It’s important to note that we did leave out a few important image features, mainly ‘whiteout files’ and fetching the configuration file. A production ‘docker pull’ implementation should also handle downloading these blobs in parallel, resumable downloads and authentication for private registries.
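To give a flavour of what whiteout handling involves: a layer records the deletion of a path from a lower layer with an empty marker file named .wh.<name> in the same directory. The function below is a deliberately simplified sketch, not how Docker implements it. The name apply_whiteouts is hypothetical, it is meant to be run on the unpack directory after each layer is extracted, it ignores opaque-directory markers (.wh..wh..opq), and it does not handle a layer that deletes a path and recreates it in the same step.

```shell
#!/usr/bin/env bash
# Hypothetical helper, simplified: after a layer has been unpacked into the
# directory $1, delete every path hidden by a ".wh.<name>" marker, then the
# marker itself. Opaque-directory markers are skipped, not implemented.
apply_whiteouts() {
  local rootfs=$1 markers marker dir name
  # Collect all markers first so nothing is deleted while find is walking.
  markers=$(find "$rootfs" -name '.wh.*' ! -name '.wh..wh..opq')
  printf '%s\n' "$markers" | while IFS= read -r marker; do
    [ -n "$marker" ] || continue
    dir=${marker%/*}                 # directory containing the marker
    name=${marker##*/}               # marker file name, e.g. .wh.old.conf
    rm -rf "${dir:?}/${name#.wh.}"   # the file or directory being hidden
    rm -f "$marker"                  # markers are not part of the final image
  done
}
```

In the unpack loop from earlier, this would be called as apply_whiteouts unpacked/ right after each tar command.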

Later in this series we’ll build and push an image from scratch, again with just bash and jq. We’ll also learn some useful tricks for working with images inside a container registry. Finally, we’ll learn how Docker uses some clever filesystem features to avoid needing to store a full filesystem for every running container.

If you’d like to learn more about these and the registry API in general, see the official specification here: https://docs.docker.com/registry/spec/api/

2017 | About