TIL: Run a script on every k8s Node using a DaemonSet

I’ve known that DaemonSets run containers on all (or a subset of) the Nodes of a Kubernetes cluster, but I had never thought of using them to run a (shell) script on each node, which is a not-so-uncommon task when maintaining clusters!

What we need

We need two resources: A ConfigMap and a DaemonSet.

ConfigMap

The config map holds the script, which will be mounted into the container started by the DaemonSet.

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-script
  namespace: kube-system
data:
  my-script.sh: |
    #!/bin/sh
    # The busybox image ships sh (ash) but no bash, so use /bin/sh here.
    while true; do
      echo "hello world!"
      sleep 60
    done

The data is just the shell script we want to execute.

DaemonSet

The DaemonSet runs a busybox container on each node, which executes the (shell) script from the ConfigMap. In the last lines, the ConfigMap is mounted as the file my-script.sh and marked readable and executable (0555, or r-xr-xr-x). That file is then used as the command of the busybox container. hostPID: true lets the container see the processes of the node itself, and privileged: true lets the script act on them.

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: my-script
  namespace: kube-system
  labels:
    k8s-app: my-script
spec:
  selector:
    matchLabels:
      name: my-script
  template:
    metadata:
      labels:
        name: my-script
    spec:
      hostPID: true
      containers:
        - name: my-script
          securityContext:
            privileged: true
          image: busybox:1.36.0
          command: ["/my-script.sh"]
          resources:
            requests:
              cpu: 10m
              memory: 50Mi
          volumeMounts:
            - name: my-script-script
              mountPath: /my-script.sh
              subPath: my-script.sh
      volumes:
        - name: my-script-script
          configMap:
            name: my-script  # must match metadata.name of the ConfigMap
            defaultMode: 0555
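
To try this out, the two manifests can be applied and the output checked with kubectl. This is only a sketch: the file names configmap.yaml and daemonset.yaml are assumptions for wherever you saved the manifests above.

```shell
# Assumed file names; adjust to wherever the two manifests were saved.
kubectl apply -f configmap.yaml -f daemonset.yaml

# Wait until a pod is running on every scheduled node.
kubectl -n kube-system rollout status daemonset/my-script

# Show the latest output of the script from each pod, prefixed with the pod name.
kubectl -n kube-system logs -l name=my-script --tail=5 --prefix
```

Each pod should print "hello world!" once a minute.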

Use case

At work I encountered an issue with the AWS EFS CSI driver, a component which mounts and unmounts EFS volumes in a Kubernetes cluster. Occasionally the EFS mount would become unresponsive and be in a “Zombie-like” state, where the EFS driver still believed the volume was mounted and healthy when it actually wasn’t.

Turns out, the issue was with stunnel, an SSL tunnel used by EFS and the EFS driver. The fix was to kill the stunnel processes whenever the mount could no longer be accessed, which makes the EFS process re-create the tunnel and the mount. Fun times!
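
The script used at work is not shown here, but the idea can be sketched roughly as follows. This is a minimal sketch under assumptions, not the actual script: the function names, the default timeout and interval, and matching processes by the name "stunnel" are all made up for illustration. It also assumes the mount path is visible inside the container (for example through a hostPath volume). Note that a hung mount can block the probe itself, so the timeout is only a best-effort guard.

```shell
#!/bin/sh
# Sketch of a mount watchdog. The default timeout/interval and the process
# name "stunnel" are assumptions, not taken from the original script.

CHECK_TIMEOUT="${CHECK_TIMEOUT:-10}"   # seconds to wait for the probe
INTERVAL="${INTERVAL:-60}"             # seconds between checks

# Succeeds if the path can be stat'ed before the timeout expires.
# A path that does not exist is treated as unresponsive as well.
mount_is_responsive() {
  timeout "$CHECK_TIMEOUT" stat "$1" >/dev/null 2>&1
}

# Probe one path; kill the stunnel processes if the probe fails.
check_once() {
  if mount_is_responsive "$1"; then
    echo "ok: $1"
  else
    echo "unresponsive: $1, killing stunnel"
    pkill stunnel || true   # no matching process is not an error
  fi
}

# Loop only when a mount path is passed as the first argument.
if [ "$#" -gt 0 ]; then
  while true; do
    check_once "$1"
    sleep "$INTERVAL"
  done
fi
```

Dropped into the ConfigMap in place of the hello-world loop, it would be started with the mount path as an argument, e.g. command: ["/my-script.sh", "/path/to/mount"], where the path is a placeholder.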

Links