I know it’s a clickbait title, and you clicked it anyway. So let’s just pretend you want to get an interactive terminal on a Talos Linux node. I’m not sure why you’d want to do that, since the API is more powerful.
But Talos doesn’t have SSH. Not only is the service not running, the `sshd` binary isn’t even on the file system. Whatever the reason, let’s pretend you need a shell “on” the host.
Maybe you have too much muscle memory for `netstat` or maybe you just hate declarative APIs. Perhaps you want to impress your co-workers with your 1337 hacker skillz™.
Whatever the case may be, Talos Linux doesn’t require that knowledge. But if you absolutely must have it, this command will get you a “secure shell” on a Talos node.
kubectl debug -n kube-system -it --image alpine node/$NODE
This uses `kubectl debug` to launch a temporary pod with the alpine container. It will run the container with host level namespaces for processes and network. The root file system is mounted at `/host` inside the container. You can explore the file system (there’s not much there) and maybe use some of your familiar tools to debug something.
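One thing to watch: node debug pods tend to stick around after you exit the shell. Here is a minimal cleanup sketch, assuming the pods keep the default `node-debugger-` name prefix that `kubectl debug` gives them. The function name `cleanup_debug_pods` is made up for illustration.

```shell
# Hypothetical helper: delete leftover node debug pods in a namespace.
# Assumes the default "node-debugger-" pod name prefix.
# Usage: cleanup_debug_pods [NAMESPACE]
cleanup_debug_pods() {
  local ns="${1:-kube-system}"
  local pod
  # List pods as "pod/<name>", keep only debug pods, delete them one by one.
  kubectl get pods -n "$ns" -o name \
    | grep '^pod/node-debugger-' \
    | while read -r pod; do
        kubectl delete -n "$ns" "$pod"
      done
}
```

Run it as `cleanup_debug_pods kube-system` once you are done poking around.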
But what can you do from there?
One legitimate use case: you lost the talosconfig for your host, but you still have an admin-level kubeconfig. In that case you can read the MachineConfig from a control plane node in your cluster with:
kubectl debug -n kube-system -it --image alpine node/$NODE -- \
cat /host/system/state/config.yaml > machineconfig.yaml
With the `machineconfig.yaml` file you can recreate a secrets.yaml file with:
talosctl gen secrets --from-controlplane-config ./machineconfig.yaml
From here you can artisanally hand craft a talosconfig file. This obviously isn’t as good as having proper backups for important files, but it might save you in a pinch.
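If hand crafting is not your thing, `talosctl gen config` may be able to produce the talosconfig from the recovered secrets. This is a sketch, assuming your Talos version supports the `--with-secrets`, `--output-types`, and `--output` flags. The function name, cluster name, and endpoint are placeholders, not values from a real cluster.

```shell
# Hypothetical helper: rebuild a talosconfig from a recovered secrets.yaml.
# Usage: regen_talosconfig CLUSTER_NAME K8S_ENDPOINT [SECRETS_FILE]
regen_talosconfig() {
  local cluster_name="$1"
  local endpoint="$2"
  local secrets_file="${3:-secrets.yaml}"

  # Only emit the talosconfig; skip controlplane and worker machine configs.
  talosctl gen config "$cluster_name" "$endpoint" \
    --with-secrets "$secrets_file" \
    --output-types talosconfig \
    --output ./talosconfig
}
```

For example, `regen_talosconfig my-cluster https://10.0.0.1:6443 ./secrets.yaml`. You will probably still need to point the resulting file at your nodes, for instance with `talosctl --talosconfig ./talosconfig config endpoint <ip>`.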
What’s a better way?
Of course, at Sidero Labs, we built Talos Linux to remove the need for humans to manage Linux. There is no shell installed on Talos Linux and the common troubleshooting tasks can be done via the API.
Let’s look at how some common debugging questions can be answered from the API and `talosctl` instead of SSH.
systemctl status
Instead of using systemd to see what services are running you can use:
talosctl services -n $NODE
This outputs the services on a node and their health. There are so few services running on Talos Linux that they fit on a single page.
NODE           SERVICE      STATE     HEALTH   LAST CHANGE    LAST EVENT
192.168.6.62   apid         Running   OK       4h10m29s ago   Health check successful
192.168.6.62   containerd   Running   OK       4h11m19s ago   Health check successful
192.168.6.62   cri          Running   OK       4h10m30s ago   Health check successful
192.168.6.62   dashboard    Running   ?        4h10m36s ago   Process Process(["/sbin/dashboard"]) started with PID 2255
192.168.6.62   etcd         Running   OK       4h10m24s ago   Health check successful
192.168.6.62   kubelet      Running   OK       4h10m27s ago   Health check successful
192.168.6.62   machined     Running   OK       4h11m24s ago   Health check successful
192.168.6.62   trustd       Running   OK       4h10m29s ago   Health check successful
192.168.6.62   udevd        Running   OK       4h11m20s ago   Health check successful
From here you can tail the logs of any service with:
talosctl logs $SERVICE -n $NODE
If you need kernel-level logs you can use the dashboard (which has other information) or `dmesg`:
talosctl dmesg -n $NODE
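If you want to hand logs to someone else, you can wrap the two commands above in a small loop. This is a sketch, assuming `talosctl logs` accepts a `--tail` flag in your version. The function name and the 200-line limit are arbitrary choices for illustration.

```shell
# Hypothetical helper: save recent service logs and the kernel log to files.
# Usage: collect_logs NODE OUTDIR SERVICE...
collect_logs() {
  local node="$1" outdir="$2"
  local svc
  shift 2
  mkdir -p "$outdir"
  # One file per service, limited to the most recent lines.
  for svc in "$@"; do
    talosctl logs "$svc" -n "$node" --tail 200 > "$outdir/$svc.log"
  done
  # Kernel ring buffer goes into its own file.
  talosctl dmesg -n "$node" > "$outdir/dmesg.log"
}
```

Something like `collect_logs "$NODE" ./node-logs apid etcd kubelet` would leave you with `apid.log`, `etcd.log`, `kubelet.log`, and `dmesg.log` in `./node-logs`.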
docker ps
Sometimes you just want to check what containers are running. The command below shows the system-level containers. If you want to see Kubernetes containers, add `-k` to look inside the `k8s.io` namespace.
talosctl containers -n $NODE
Output will look something like this:
NODE           NAMESPACE   ID       IMAGE   PID    STATUS
192.168.6.62   system      apid             2306   RUNNING
192.168.6.62   system      trustd           2383   RUNNING
You can also run something similar to `docker top` with:
talosctl stats -n $NODE
ps
Even in Talos Linux, not everything runs in a container. Maybe you want to see all the processes on a node. This command lists both Kubernetes and non-Kubernetes processes:
talosctl processes -n $NODE
netstat
Everyone loves to hate netstat. Even though it was deprecated a long time ago it still lives on. You can get familiar netstat output via the API with:
talosctl netstat -n $NODE
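The classic question is "what is listening on this port?". Here is a sketch of that, assuming `talosctl netstat` supports the usual `-t`, `-l`, and `-p` flags and prints local addresses ending in `:PORT` followed by whitespace. The function name is made up.

```shell
# Hypothetical helper: show which process listens on a given TCP port.
# Usage: who_listens_on NODE PORT
who_listens_on() {
  local node="$1" port="$2"
  # -t TCP only, -l listening sockets, -p owning process (assumed flags).
  # The grep keeps lines whose address column ends in ":PORT".
  talosctl netstat -n "$node" -t -l -p | grep -E ":${port}[[:space:]]"
}
```

For example, `who_listens_on "$NODE" 50000` should narrow the output down to whatever owns port 50000 on that node.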
tcpdump
At some point you just need to look at the network traffic. Of course you can do that from the API. This one is also aliased to `tcpdump`.
talosctl pcap -n $NODE
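If you would rather open the capture in Wireshark later, you can write it to a file. This is a sketch, assuming the `-i`, `--duration`, `-o`, and `--bpf-filter` flags of `talosctl pcap` are available in your version. The function name and its argument order are my own invention.

```shell
# Hypothetical helper: capture traffic for a fixed time and save it locally.
# Usage: capture_traffic NODE INTERFACE DURATION OUTFILE [BPF_FILTER]
capture_traffic() {
  local node="$1" iface="$2" duration="$3" outfile="$4" filter="${5:-}"
  # Rebuild the argument list so the optional filter is only added when given.
  set -- pcap -n "$node" -i "$iface" --duration "$duration" -o "$outfile"
  if [ -n "$filter" ]; then
    set -- "$@" --bpf-filter "$filter"
  fi
  talosctl "$@"
}
```

A call like `capture_traffic "$NODE" eth0 30s capture.pcap "tcp and port 6443"` would record 30 seconds of API server traffic into `capture.pcap`.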
Next time you think you need SSH access to a server you should ask yourself why. Why are you required to interact with a terminal to run imperative commands? Why can’t Linux work like other systems with declarative APIs? Maybe it’s time to look for something better.