SOCI, buildkit and containerd

This article is part of the Building streamable containers in EKS series.

Background

Managing container images and CI/CD pipelines is par for the course when you’re working with Kubernetes infrastructure. But when machine learning enters the mix, you have to deal not only with long build times for massive container images, but also with long pull times when scaling out.
In this post, I’ll share some strategies for tackling both build and pull times using Docker Build and Seekable OCI.

Here’s a quick overview of the key technologies we’ll use:

Docker build

Docker build uses a client-server architecture: Buildx, a Docker CLI plugin, is the client and user interface, and BuildKit is the server (or builder) that actually executes the build.

Both Buildx and BuildKit are installed with Docker Desktop and Docker Engine out-of-the-box.

BuildKit workers

BuildKit daemons (workers) come in two flavors: containerd and oci.

  • containerd workers rely on the containerd runtime to manage containers and images. containerd needs to be up and running on the host.
  • oci workers manage containers and images themselves, so containerd is not needed.
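
To make the choice between the two worker types concrete, here is a minimal sketch that assembles a `buildkitd` command line enabling exactly one of them. The helper name `buildkitd_args` and the default socket path are my own assumptions for illustration; only the `--oci-worker`, `--containerd-worker` and `--containerd-worker-addr` flags come from BuildKit itself.

```python
def buildkitd_args(worker: str,
                   containerd_addr: str = "/run/containerd/containerd.sock") -> list[str]:
    """Return a buildkitd argv that enables exactly one worker type.

    `worker` is "containerd" or "oci". The socket path is an assumed
    default; adjust it to wherever containerd listens on your host.
    """
    if worker not in ("containerd", "oci"):
        raise ValueError(f"unknown worker type: {worker}")
    use_containerd = worker == "containerd"
    args = [
        "buildkitd",
        # Disable the worker we are not using so only one is registered.
        f"--oci-worker={'false' if use_containerd else 'true'}",
        f"--containerd-worker={'true' if use_containerd else 'false'}",
    ]
    if use_containerd:
        # The containerd worker talks to an already-running containerd daemon.
        args.append(f"--containerd-worker-addr={containerd_addr}")
    return args


if __name__ == "__main__":
    print(" ".join(buildkitd_args("containerd")))
```

Returning a list rather than a string keeps the sketch easy to pass to `subprocess.run` without shell quoting.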

Buildx drivers

Buildx additionally implements several “build drivers” - configurations for how and where the BuildKit backend runs. We will be using the remote build driver, which allows Buildx to connect to a manually managed BuildKit daemon.
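
As a rough illustration of the remote driver, the sketch below builds the two Buildx commands that register a manually managed BuildKit daemon and check the connection. The builder name and the `tcp://` endpoint are hypothetical placeholders, and `remote_builder_commands` is a helper made up for this example.

```python
def remote_builder_commands(name: str, endpoint: str) -> list[list[str]]:
    """Return the Buildx commands that attach to an existing BuildKit daemon.

    `name` and `endpoint` are placeholders chosen by the caller; nothing
    here starts or manages the daemon itself.
    """
    return [
        # Register a builder that uses the remote driver and points at the daemon.
        ["docker", "buildx", "create", "--name", name, "--driver", "remote", endpoint],
        # Connect to the builder so connection errors show up before the first build.
        ["docker", "buildx", "inspect", "--bootstrap", name],
    ]


if __name__ == "__main__":
    # Hypothetical endpoint, for illustration only.
    for argv in remote_builder_commands("remote-builder", "tcp://buildkitd.example.internal:1234"):
        print(" ".join(argv))
```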

Seekable OCI (SOCI)

Seekable OCI is an AWS-developed technology for lazy-loading container images. Instead of pulling the entire image before launching a container, Seekable OCI lets you pull only the parts of the image needed to launch as quickly as possible, and prefetch data in the background.

SOCI avoids having to modify existing images by building a separate index artifact (the “SOCI index”), which lives in the remote registry, right next to the image itself.

The concrete implementation of this is the containerd SOCI snapshotter plugin.
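
To give a feel for why a separate index enables lazy loading, here is a toy model, not the real SOCI format. It packs a few files into one unmodified blob, keeps a side index of byte ranges, and then reads a single file by seeking instead of scanning the whole blob. The real index is more involved; the names `pack` and `read_lazily` and the sample file names are invented for this illustration.

```python
import io


def pack(files: dict[str, bytes]) -> tuple[bytes, dict[str, tuple[int, int]]]:
    """Concatenate files into one blob and return it with a side index.

    The index maps each file name to (offset, length) inside the blob.
    The blob itself carries no index data, mirroring the idea of an
    index that lives next to an unmodified image.
    """
    blob = io.BytesIO()
    index: dict[str, tuple[int, int]] = {}
    for name, data in files.items():
        index[name] = (blob.tell(), len(data))
        blob.write(data)
    return blob.getvalue(), index


def read_lazily(blob: bytes, index: dict[str, tuple[int, int]], name: str) -> bytes:
    """Fetch one file by seeking to its byte range instead of reading everything."""
    offset, length = index[name]
    stream = io.BytesIO(blob)
    stream.seek(offset)
    return stream.read(length)


if __name__ == "__main__":
    blob, index = pack({"bin/app": b"APP", "models/weights": b"W" * 1000})
    # Only the 3 bytes of the entrypoint are read; the large file is skipped.
    print(read_lazily(blob, index, "bin/app"))
```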

Goals

Below is a very high-level architecture diagram of what our stack will look like:

[Architecture diagram: a Kubernetes cluster with three cluster nodes, each running containerd with soci-snapshotter. The cluster hosts a builder StatefulSet alongside other workloads. Each builder StatefulSet pod runs its own containerd, soci-snapshotter and buildkit; other workload pods run next to them.]

BuildKit worker StatefulSet

This article series will show you how to create a Kubernetes StatefulSet for BuildKit workers. Each StatefulSet pod will run a BuildKit daemon with a containerd worker configuration, plus a containerd daemon. Each pod will also attach a persistent volume for the BuildKit cache and the containerd image store.

BuildKit worker config

Using the BuildKit containerd worker (rather than oci) allows for:

  1. Lazy-pulling base images with SOCI during builds (if the base images have SOCI indexes available)
  2. Avoiding a re-pull of build results when generating SOCI indexes. Instead, index generation is set up to run on the same pod that built the image, so it uses cached results from the containerd image store.

[Sequence diagram: build and SOCI index flow in the builder pod. Participants: user, cli, containerized buildkit daemon, containerized containerd, nerdctl, container registry. Messages, in order: Build target; Send build context to buildkit; loop (build image layers): Build layer, Lazy-pull base, Result; push image; built image tag/failure; build SOCI index for tag; load container image from local store; build SOCI index; push SOCI index; success/failure; Success/failure; loop (cache garbage collection): Garbage collect cache, delete GC-ed images.]
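
The build-then-index flow from the list above can be sketched as an ordered set of steps that stops at the first failure. This is only a sketch under assumptions: the diagram for this series names nerdctl, but the exact commands are not shown here, so I use `docker buildx build --push` plus the `soci create` and `soci push` commands from the SOCI CLI as stand-ins. The `run` callback and the `build_and_index` name are hypothetical.

```python
from typing import Callable, Sequence

# A runner executes one command and returns its exit code.
Runner = Callable[[Sequence[str]], int]


def build_and_index(tag: str, context: str, builder: str, run: Runner) -> str:
    """Build and push an image, then create and push its SOCI index.

    Returns "success", or "failure: <step>" for the first step that fails.
    Index creation is meant to run where the image was just built, so it
    can read the image from the local containerd store instead of
    pulling it again.
    """
    steps = [
        ("build and push image",
         ["docker", "buildx", "build", "--builder", builder, "--tag", tag, "--push", context]),
        ("build SOCI index", ["soci", "create", tag]),
        ("push SOCI index", ["soci", "push", tag]),
    ]
    for name, argv in steps:
        if run(argv) != 0:
            # Later steps depend on earlier ones, so stop immediately.
            return f"failure: {name}"
    return "success"


if __name__ == "__main__":
    # Dry run: print each command instead of executing it.
    def dry_run(argv: Sequence[str]) -> int:
        print(" ".join(argv))
        return 0

    print(build_and_index("registry.example.com/team/app:1.0", ".", "remote-builder", dry_run))
```

For a real run, a runner such as `lambda argv: subprocess.run(argv).returncode` could be passed in; keeping the runner pluggable makes the flow easy to dry-run first.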

EKS node configuration

EKS nodes will need to be configured to take advantage of images that have a SOCI index available. This article series will show you how to do so with EC2 user data, with examples for EC2NodeClass (if you use Karpenter for scheduling) and the Terraform aws_launch_template resource.
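
As a preview of what that node configuration involves, the sketch below renders a containerd config fragment that registers the SOCI snapshotter as a proxy plugin and selects it for the CRI plugin. Treat it as an assumption-laden sketch: the socket path and the containerd 1.x style plugin key follow my understanding of the SOCI snapshotter documentation, the function name is invented, and the fragment is not the full user-data the series covers.

```python
# Assumed default socket of the SOCI snapshotter gRPC service.
SOCI_SOCKET = "/run/soci-snapshotter-grpc/soci-snapshotter-grpc.sock"


def containerd_soci_config(socket: str = SOCI_SOCKET) -> str:
    """Render a containerd config.toml fragment that enables the SOCI snapshotter.

    Assumes the containerd 1.x CRI plugin key; newer containerd versions
    may use different plugin names.
    """
    lines = [
        # Register the snapshotter as an out-of-process (proxy) plugin.
        "[proxy_plugins.soci]",
        '  type = "snapshot"',
        f'  address = "{socket}"',
        "",
        # Tell the CRI plugin to use it for pods scheduled on this node.
        '[plugins."io.containerd.grpc.v1.cri".containerd]',
        '  snapshotter = "soci"',
        "  disable_snapshot_annotations = false",
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(containerd_soci_config())
```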
