Air Gapped Kubernetes With Talos Linux

Air gapped environments have always been seen as highly secure and difficult to manage. They keep your systems away from accidental or malicious network attacks at the cost of running and maintaining more infrastructure.

Historically, users would need to stand up a container registry with trusted certificates, pull images onto a portable device (e.g. an engineer’s laptop), and push them into the environment where they were going to create a cluster. All of this was needed before they could even test whether Kubernetes was the right solution for them.

Creating this infrastructure was only part of the problem because now someone needed to understand and maintain all of the image caches, registries, certificates, and hosts where they all run.

Talos Linux runs all components in containers. This includes standard Kubernetes components like the kubelet and system services like the Talos installer.

This is great for flexibility and portability, but makes it difficult to pre-package everything you need to create a Kubernetes cluster if you don’t have Internet access or your connection is unreliable or slow. These environments also make it hard to pull application images or large images from remote repositories.

Talos Linux 1.9 introduces a new feature for pre-seeding container images as part of the Talos installation media. The image cache takes a few extra steps to pre-seed with images, but eliminates the requirement to run additional registries or maintain hosts outside of the Kubernetes cluster.

Let’s walk through the steps to install Kubernetes in an air gapped environment. If you would rather watch the process, check out the video below.

Create Installation Media

To get started, you’ll need talosctl and docker installed on your computer, plus a USB drive to perform the installation.

Make sure you have the latest version of talosctl as you’ll need at least version 1.9 to use the image cache.

brew install siderolabs/tap/sidero-tools
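If you want to script the version requirement, a small check like the one below may help. This is only a sketch: version_at_least is a hypothetical helper, not part of talosctl, and it assumes you pass in the client tag (for example v1.9.0) that you read from the output of talosctl version --client.

```shell
# Hypothetical helper: succeeds if a version tag such as "v1.9.0"
# is at least the given major.minor (defaults to 1.9).
version_at_least() {
  tag="${1#v}"            # strip a leading "v"
  want_major="${2:-1}"
  want_minor="${3:-9}"
  major="${tag%%.*}"      # text before the first dot
  rest="${tag#*.}"        # text after the first dot
  minor="${rest%%.*}"     # text before the next dot
  [ "$major" -gt "$want_major" ] ||
    { [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]; }
}
```

For example, version_at_least v1.9.0 succeeds, while version_at_least v1.8.3 fails.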

Get a list of the default images with:

talosctl images default > images.txt

If you want to add any other required images or application images you can append them to this text file. Things like monitoring agents, storage drivers, or base images for your applications will reduce the amount of bandwidth required after Talos is provisioned.
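Appending by hand works fine, but if you rebuild the list often, a helper along these lines can keep it free of duplicates. It is a sketch under a few assumptions: add_images is a hypothetical name, and the list file is expected to end with a newline, as redirected command output normally does.

```shell
# Hypothetical helper: append image references to a list file and
# drop duplicates while keeping the original order.
add_images() {
  list="$1"; shift
  for image in "$@"; do
    printf '%s\n' "$image" >> "$list"
  done
  # awk prints each line only the first time it is seen
  awk '!seen[$0]++' "$list" > "$list.tmp" && mv "$list.tmp" "$list"
}
```

A call might look like add_images images.txt registry.example.internal/monitoring/agent:1.2.3, where the image reference is a made-up placeholder for one of your own.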

Now we’ll download all of the container images into a folder. If any of your images need authentication to the registry, you’ll need to set that up in docker before downloading the cache.

cat images.txt | \
  talosctl images cache-create \
  --image-cache-path ./image-cache.oci \
  --images=-
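Because the cache is later copied onto a volume on each node, it can be worth checking how large it turned out. The helper below is a rough sketch: dir_size_mib is a hypothetical name, and the number reflects disk usage as reported by du, which can differ slightly from the space the cache needs on the node.

```shell
# Hypothetical helper: print the on-disk size of a directory in MiB,
# rounded up to the next whole MiB.
dir_size_mib() {
  kib="$(du -sk "$1" | awk '{print $1}')"   # usage in KiB
  echo $(( (kib + 1023) / 1024 ))
}
```

Running dir_size_mib ./image-cache.oci prints a single integer you can compare against the volume size you plan to use.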

Now we use the image cache directory to create a Talos installation ISO.

mkdir -p _out/

docker run --rm -t \
  -v $PWD/_out:/secureboot:ro \
  -v $PWD/_out:/out \
  -v $PWD/image-cache.oci:/image-cache.oci:ro \
  -v /dev:/dev --privileged \
    ghcr.io/siderolabs/imager:v1.9.0 iso \
      --image-cache /image-cache.oci

This runs the Talos imager container and creates a metal-amd64.iso in the _out/ directory. To customize the installation media, for example with kernel arguments or a different platform, see the --help output of the imager container.

Copy to USB

Although we’re making a CD-ROM image, your systems probably don’t have an actual CD-ROM drive, and you are unlikely to use one to install Talos. This ISO will work with virtual media on various KVM and out-of-band options or VM environments like Proxmox.

To copy the ISO to a USB drive we’ll have to use dd to make sure the filesystem that gets mounted inside Talos is an ISO and not a drive. You’ll also want to make sure the drive is empty and all partitions are wiped before copying the ISO.

sudo dd if=_out/metal-amd64.iso of=/dev/$YOUR_USB_DEVICE
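If you want some reassurance that the copy succeeded, you can compare the start of the device with the ISO byte for byte. The function below is a minimal sketch: verify_copy is a hypothetical helper, and reading a raw device usually requires root, so you would run it from a root shell.

```shell
# Hypothetical helper: succeeds if the first bytes of TARGET are
# identical to the whole of ISO.
verify_copy() {
  iso="$1"; target="$2"
  size="$(wc -c < "$iso" | tr -d ' ')"        # ISO size in bytes
  # Read exactly that many bytes from the target and compare them
  head -c "$size" "$target" | cmp -s - "$iso"
}
```

For example, verify_copy _out/metal-amd64.iso /dev/$YOUR_USB_DEVICE && echo "copy verified". Running sync before unplugging the drive helps make sure all writes have reached it.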

Now if you plug the USB drive into your computer, it should have the label TALOS_V1_9_0. This is important because the installer will look for an image cache on an ISO whose label starts with TALOS_.
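The label check can also be scripted. The sketch below only tests the prefix; is_talos_label is a hypothetical helper, and how you read the label depends on your operating system.

```shell
# Hypothetical helper: succeeds if the given filesystem label
# starts with "TALOS_".
is_talos_label() {
  case "$1" in
    TALOS_*) return 0 ;;
    *)       return 1 ;;
  esac
}
```

On Linux, assuming lsblk is available, something like is_talos_label "$(lsblk -no LABEL /dev/$YOUR_USB_DEVICE | head -n 1)" would feed it the label of the drive.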

Install Talos

Now that all of your images are on the USB device, you can boot your system from the drive. Talos should boot as normal and go into maintenance mode.

When you send the machine config to the machine you’ll need to tell Talos to use an image cache and you’ll need to set up an IMAGECACHE volume so it knows where to copy images to on the local drive.

machine:
  features:
    imageCache:
      localEnabled: true
---
apiVersion: v1alpha1
kind: VolumeConfig
name: IMAGECACHE
provisioning:
  diskSelector:
    match: 'system_disk'

The default volume size is 1 GiB so if you included additional images you may need to increase the volume size. See the disk management documentation.
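As a rough illustration, the snippet below writes a VolumeConfig document with an explicit size, intended to replace the IMAGECACHE document shown above. It assumes the minSize and maxSize fields of the volume provisioning configuration; the file name and the 4GiB figure are arbitrary examples, so check the disk management documentation for your Talos version before relying on it.

```shell
# Write a patch document that sizes the IMAGECACHE volume explicitly.
# The file name and the 4GiB value are placeholders, not recommendations.
cat > imagecache-volume.yaml <<'EOF'
apiVersion: v1alpha1
kind: VolumeConfig
name: IMAGECACHE
provisioning:
  diskSelector:
    match: 'system_disk'
  minSize: 4GiB   # assumed field: lower bound for the volume
  maxSize: 4GiB   # assumed field: upper bound for the volume
EOF
```

Pick a size comfortably larger than the image cache directory you created earlier.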

If the device is air gapped you will also want to disable NTP sync or configure it to an internal time server. You should also disable the discovery service or run a Talos discovery service internally.

Note: If you run Omni internally it comes with an embedded discovery service.

machine:
    time:
        disabled: true
cluster:
    discovery:
        enabled: false

With these patches you’re ready to apply the configuration to your Talos system and bootstrap a cluster.

talosctl apply -f controlplane.yaml -p '@patch.yaml' -n $NODE -i

Verify Image Cache

When your machine gets provisioned, you’ll see the logs show information about “Waiting for image cache.” This step will format the volume and copy the image cache files from the ISO to the new partition. After the copy is done, Talos runs a local, pull-through image cache service called registryd.

Once Talos is installed, containerd will be automatically configured for the registryd endpoint and the cached images will pull through that registry.

You can validate the image cache config via the Talos API:

talosctl get imagecacheconfig -o yaml
node: 192.168.4.26
metadata:
    namespace: cri
    type: ImageCacheConfigs.cri.talos.dev
    id: image-cache
    version: 4
    owner: cri.ImageCacheConfigController
    phase: running
    created: 2024-12-18T00:54:36Z
    updated: 2024-12-18T18:32:07Z
spec:
    status: ready
    copyStatus: ready   # Shows files were copied from the ISO
    roots:              # Priority list for where cache is pulled
        - /system/imagecache/disk
        - /system/imagecache/iso/imagecache
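To check readiness from a script instead of reading the YAML by eye, you could pipe that output through a small filter. This is a sketch that assumes the output keeps the shape shown above; image_cache_ready is a hypothetical helper, not a talosctl feature.

```shell
# Hypothetical helper: reads `talosctl get imagecacheconfig -o yaml`
# output on stdin and succeeds only if both status fields are "ready".
image_cache_ready() {
  out="$(cat)"
  printf '%s\n' "$out" |
    grep -Eq '^[[:space:]]*status:[[:space:]]*ready([[:space:]]|$)' &&
  printf '%s\n' "$out" |
    grep -Eq '^[[:space:]]*copyStatus:[[:space:]]*ready([[:space:]]|$)'
}
```

For example, talosctl get imagecacheconfig -o yaml -n $NODE | image_cache_ready && echo "image cache ready".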

Conclusion

That’s all you need to do to install Kubernetes in an air gapped environment. No more bridge registries, cert management, or manual container image syncing. This feature allows you to update images on the USB drive and automatically sync those into the Talos image cache if needed.

This also helps with low-bandwidth or unstable connections. Reduce your dependency on a remote registry and internet connection by baking in the images you need.

Give it a try and let us know if you have feedback.
