SOCI, buildkit and containerd
This article is part of the Building streamable containers in EKS series.
Background
Managing container images and CI/CD pipelines is par for the course when you’re working with Kubernetes infrastructure. But when machine learning enters the mix, you have to deal not only with long build times for massive container images, but also with long pull times when scaling out.
In this post, I’ll share some strategies for tackling both build and pull times using Docker Build and Seekable OCI.
Here’s a quick overview of the key technologies we’ll use:
Docker build
Docker Build uses a client-server architecture: the Buildx Docker CLI plugin is the client and user interface, and BuildKit is the server (or builder) that actually executes the build.
Both Buildx and BuildKit are installed with Docker Desktop and Docker Engine out-of-the-box.
BuildKit workers
BuildKit daemons (workers) come in two flavors: containerd and oci.
- containerd workers rely on the containerd runtime to manage containers and images. containerd needs to be up and running on the host.
- oci workers manage containers and images themselves, so containerd is not needed.
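As a rough illustration of how the worker flavor is chosen, the sketch below renders a `buildkitd.toml` fragment that disables the oci worker and enables the containerd worker. The `[worker.oci]` and `[worker.containerd]` sections with their `enabled`, `address` and `namespace` keys come from BuildKit's configuration format; the function name and the example socket path and namespace are assumptions made for this sketch, not values taken from this series.

```python
def render_buildkitd_toml(containerd_socket: str, namespace: str) -> str:
    """Render a buildkitd.toml fragment that enables only the containerd worker.

    The socket path and namespace are supplied by the caller; the values
    used in the example below are assumptions, not settings from the series.
    """
    lines = [
        "[worker.oci]",
        "  enabled = false",
        "",
        "[worker.containerd]",
        "  enabled = true",
        f'  address = "{containerd_socket}"',
        f'  namespace = "{namespace}"',
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example values only: adjust to wherever containerd listens in your pod.
    print(render_buildkitd_toml("/run/containerd/containerd.sock", "buildkit"))
```

Keeping the oci worker explicitly disabled makes sure every build goes through containerd, which is what the rest of the setup relies on.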
Buildx drivers
Buildx additionally implements several “build drivers” - configurations for how and where the BuildKit backend runs. We will be using the remote build driver, which allows Buildx to connect to a manually managed BuildKit daemon.
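To make the remote driver a little more concrete, here is a minimal sketch that assembles the `docker buildx create --driver remote` command line for a BuildKit daemon running in a StatefulSet pod. The helper names, the builder name, the headless service name and port 1234 are all assumptions for illustration; the pod DNS pattern (`<pod>.<service>.<namespace>.svc`) is the standard one Kubernetes gives StatefulSet pods behind a headless service.

```python
def remote_endpoint(pod: str, service: str, namespace: str, port: int = 1234) -> str:
    """Build the TCP endpoint of a BuildKit daemon running in a StatefulSet pod.

    Assumes a headless service fronts the StatefulSet and that buildkitd
    listens on `port` (1234 is only a commonly used example value).
    """
    return f"tcp://{pod}.{service}.{namespace}.svc:{port}"


def buildx_create_remote(name: str, endpoint: str) -> list[str]:
    """Return the argv that registers a manually managed BuildKit daemon with Buildx."""
    return ["docker", "buildx", "create", "--name", name, "--driver", "remote", endpoint]


if __name__ == "__main__":
    # Hypothetical names: "buildkit-0" pod, "buildkit" service, "ci" namespace.
    endpoint = remote_endpoint("buildkit-0", "buildkit", "ci")
    print(" ".join(buildx_create_remote("eks-builder", endpoint)))
```

Returning an argv list rather than a shell string avoids quoting issues when the command is later passed to `subprocess.run`.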
Seekable OCI (SOCI)
Seekable OCI is an AWS-developed technology for lazy-loading container images. Instead of pulling the entire image before launching a container, Seekable OCI lets you pull only the data needed to start the container, for the fastest possible launch time, and fetch the rest in the background.
SOCI avoids having to modify existing images by building a separate index artifact (the “SOCI index”), which lives in the remote registry, right next to the image itself.
The concrete implementation of this is the containerd SOCI snapshotter plugin.
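For orientation, the snapshotter is usually registered with containerd as a proxy plugin and then selected as the CRI snapshotter. The sketch below renders such a config fragment. It assumes a containerd 1.x-style `config.toml` (newer containerd versions use different plugin section names) and the socket path that the SOCI snapshotter uses by default; treat it as an illustration, not the exact configuration used later in this series.

```python
def render_containerd_soci_config(socket_path: str) -> str:
    """Render a containerd config.toml fragment that wires in the SOCI snapshotter.

    Assumes containerd 1.x section names; verify them against the
    containerd version you actually run.
    """
    lines = [
        # Register the snapshotter as an out-of-process (proxy) plugin.
        "[proxy_plugins]",
        "  [proxy_plugins.soci]",
        '    type = "snapshot"',
        f'    address = "{socket_path}"',
        "",
        # Tell the CRI plugin to use it and to pass image annotations through.
        '[plugins."io.containerd.grpc.v1.cri".containerd]',
        '  snapshotter = "soci"',
        "  disable_snapshot_annotations = false",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_containerd_soci_config(
        "/run/soci-snapshotter-grpc/soci-snapshotter-grpc.sock"
    ))
```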
Goals
Below is a very high-level architecture diagram of what our stack will look like:
block-beta
columns 3
cluster["kubernetes cluster"]:3
node0["cluster node"]
node1["cluster node"]
node2["cluster node"]
block:cri0
ctrd0["containerd"]
soci0["soci-snapshotter"]
end
block:cri1
ctrd1["containerd"]
soci1["soci-snapshotter"]
end
block:cri2
ctrd2["containerd"]
soci2["soci-snapshotter"]
end
ss("builder stateful set"):2
ow("other workloads")
pod0("builder stateful set pod")
pod1("builder stateful set pod")
pod2("other workload pods")
block:podcontent0
cctrd0["pod containerd"]
csoci0["pod soci-snapshotter"]
buildkit0["buildkit"]
end
block:podcontent1
cctrd1["pod containerd"]
csoci1["pod soci-snapshotter"]
buildkit1["buildkit"]
end
space
classDef Cluster fill:#997,stroke:#333;
classDef Nodes fill:#999,stroke:#333;
classDef Containerd fill:#98B,stroke:#333;
classDef Soci fill:#98D,stroke:#333;
classDef Buildkit fill:#98E,stroke:#333;
classDef Workload fill:#99C,stroke:#333;
classDef WorkloadPods fill:#99D,stroke:#333;
classDef Statefulset fill:#F99,stroke:#333;
classDef StatefulsetPods fill:#D99,stroke:#333;
class cluster,node0,node1,node2 Cluster
class node0,node1,node2 Nodes
class ctrd0,ctrd1,ctrd2,cctrd0,cctrd1 Containerd
class soci0,soci1,soci2,csoci0,csoci1 Soci
class ow Workload
class pod2 WorkloadPods
class ss Statefulset
class pod0,pod1 StatefulsetPods
class buildkit0,buildkit1 Buildkit
BuildKit worker StatefulSet
This article series will show you how to create a Kubernetes StatefulSet for BuildKit workers. Each StatefulSet pod will run a BuildKit daemon with a containerd worker configuration, alongside a containerd daemon. Each pod will also attach a persistent volume for the BuildKit cache and the containerd image store.
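The shape of such a StatefulSet can be sketched as a plain Python dictionary that serializes to a manifest `kubectl` accepts as JSON. Everything specific here is an assumption for illustration: the image name, the single privileged container, the volume name and the storage size. Only the overall structure (one persistent volume per pod, mounted for both the BuildKit cache and the containerd image store) follows the description above.

```python
import json


def builder_statefulset(name: str, image: str, replicas: int, storage: str) -> dict:
    """Return a minimal apps/v1 StatefulSet manifest for BuildKit worker pods."""
    labels = {"app": name}
    container = {
        "name": "builder",
        "image": image,  # hypothetical image bundling buildkitd and containerd
        "securityContext": {"privileged": True},  # assumption, not from the series
        "volumeMounts": [
            # One volume, two sub-paths: BuildKit cache and containerd image store.
            {"name": "state", "mountPath": "/var/lib/buildkit", "subPath": "buildkit"},
            {"name": "state", "mountPath": "/var/lib/containerd", "subPath": "containerd"},
        ],
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": name,  # headless service giving each pod a stable DNS name
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [container]},
            },
            "volumeClaimTemplates": [{
                "metadata": {"name": "state"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": storage}},
                },
            }],
        },
    }


if __name__ == "__main__":
    manifest = builder_statefulset("buildkit", "example.com/buildkit-containerd:latest", 2, "100Gi")
    print(json.dumps(manifest, indent=2))
```

A StatefulSet (rather than a Deployment) fits here because `volumeClaimTemplates` gives every pod its own volume that survives restarts, so the caches stay warm.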
BuildKit worker config
Using the BuildKit containerd worker (rather than oci) allows for:
- Lazy-pulling base images with SOCI during builds (when the base images have SOCI indexes available)
- Avoiding a re-pull of build results when generating SOCI indexes. Instead, index generation is set up to run on the same pod that built the image, so it uses the cached results from the containerd image store.
sequenceDiagram
actor user
participant cli
box builder pod
participant buildkit as containerized buildkit daemon
participant containerd as containerized containerd
participant nerdctl
end
participant registry as container registry
user->>cli: Build target
cli->>buildkit: Send build context to buildkit
loop build image layers
buildkit->>containerd: Build layer
containerd->>containerd: Lazy-pull base
containerd->>buildkit: Result
end
buildkit->>registry: push image
buildkit->>cli: built image tag/failure
cli->>nerdctl: build SOCI index for tag
nerdctl->>containerd: load container image from local store
nerdctl->>nerdctl: build SOCI index
nerdctl->>registry: push SOCI index
nerdctl->>cli: success/failure
cli->>user: Success/failure
loop cache garbage collection
buildkit->>buildkit: Garbage collect cache
buildkit->>containerd: delete GC-ed images
end
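The build-then-index flow in the diagram can be sketched as a two-step plan that stops at the first failure. This is only one possible way to drive it: the use of `kubectl exec` to reach nerdctl in the builder pod, the `buildkit` containerd namespace, and the builder, pod and image names are all assumptions, and the exact nerdctl flags for SOCI should be checked against the nerdctl version in use.

```python
import subprocess
from typing import Callable

Command = list[str]


def plan_build_and_index(builder: str, pod: str, image: str, context: str = ".") -> list[Command]:
    """Return the commands for one build: build and push, then index and push."""
    build = [
        "docker", "buildx", "build",
        "--builder", builder,
        "--tag", image,
        "--push",
        context,
    ]
    # Assumption: nerdctl runs inside the pod that built the image, so the
    # image is already in that pod's containerd store and is not re-pulled.
    index = [
        "kubectl", "exec", pod, "--",
        "nerdctl", "--namespace", "buildkit", "--snapshotter", "soci",
        "push", image,
    ]
    return [build, index]


def run_plan(plan: list[Command], runner: Callable[[Command], int]) -> bool:
    """Run commands in order and report failure as soon as one exits non-zero."""
    for command in plan:
        if runner(command) != 0:
            return False
    return True


if __name__ == "__main__":
    plan = plan_build_and_index("eks-builder", "buildkit-0", "example.com/app:1.0")
    ok = run_plan(plan, lambda cmd: subprocess.run(cmd).returncode)
    print("success" if ok else "failure")
```

Passing the runner in as a function keeps the plan testable without Docker or a cluster: a fake runner can record the commands instead of executing them.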
EKS node configuration
EKS nodes will need to be configured to take advantage of images that have a SOCI index available. This article series will show you how to do so with EC2 user data, with examples for EC2NodeClass (if you use Karpenter for scheduling) and the Terraform aws_launch_template resource.
More
- Next: BuildKit containerd worker image
- Check out the complete Building streamable containers in EKS series