What?

One of the first questions I hear when people introduce Kubernetes is “but why?”. So let’s look at the actual why: the reasons someone would create this monstrosity, er, interesting piece of software.

Why would we create Kubernetes?

Hi, I’m a systems administrator at Google. I want to:

  • Manage services running on computers
  • Manage shared resources between those services
  • Manage the networking/firewall rules of those services
  • Containerise those services so they are reproducible across machines
  • Create role-based access controls over all of these actions
  • Define resource limits for services
  • Dynamically scale services according to resource usage
  • Log all services and make those logs accessible
  • Define custom configurations for the scheduling of these services
  • Pass secrets into machines
  • OH GOD PLEASE MAKE IT STOP

Foundational Concepts

Cluster

A Kubernetes cluster is a set of nodes, managed by the various Kubernetes control-plane components, that run containerized applications.

Pods (read ‘Containers’, but like, grouped)

Pods are the most common resource you will see inside a cluster, since they encapsulate one (or more) containers used to run a single workload. In the real world™ pods will often be created by other resources (such as DaemonSets, ReplicaSets and Deployments). This will cause you to see pod names like podname-is897 or podname-dy364.
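
As a minimal sketch (the hello-pod name and nginx image are placeholders, not anything from a real cluster), a single-container pod can be created straight from a shell that has kubectl configured:

kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: hello-pod          # hypothetical pod name
spec:
  containers:
    - name: hello
      image: nginx:1.25    # any container image will do
EOF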

Nodes

These are the machines (physical or virtual) that run the pods for the cluster. Each machine runs an agent called kubelet, which registers the node with the cluster and makes sure the pods scheduled onto it are actually running and configured as intended.
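
If you want to see what the kubelets are reporting, listing the nodes is a good sanity check:

kubectl get nodes -o wide    # shows each node, its status, IPs and kubelet version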

Services

Services are a nice way of naming and exposing an application running on a set of pods inside the cluster. Since pods are non-permanent resources, referring to a pod by IP address would cause issues. A Service defines a set of pods (via a label selector) and a policy by which to access them, and each Service is given a unique DNS name.
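
As a sketch of how the selector works (the app: hello label and both port numbers are assumptions for illustration), a Service that fronts every pod labelled app: hello looks something like:

kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: hello-svc          # resolvable by this name from pods in the same namespace
spec:
  selector:
    app: hello             # any pod carrying this label is included
  ports:
    - port: 80             # port the Service exposes
      targetPort: 8080     # port the pods actually listen on
EOF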

Namespaces

Namespaces are used to logically segment the cluster into different applications or workflows. For example:

  • myapp is a namespace for the application MyApp
  • myapp-ci is a namespace for our internal CI/CD build workflow for MyApp
  • test is a namespace used for testing configurations
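
Creating and working inside a namespace is a one-liner each (reusing the myapp example from above):

kubectl create namespace myapp
kubectl get pods --namespace myapp    # most kubectl commands accept -n/--namespace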

By default, Kubernetes will always have these namespaces:

  • default - Allows you to get started quickly after deploying a cluster.
  • kube-node-lease - Holds a Lease object for each node, which the node’s kubelet updates as a heartbeat so the control plane can detect node failures.
  • kube-public - This namespace is readable by all clients. Usually reserved for resources that need to be visible and publicly readable through the whole cluster.
  • kube-system - Namespace for objects used to operate the cluster that are created by Kubernetes.

Secrets

Kubernetes has an in-built mechanism for providing API keys and credentials to applications: Secrets. They usually contain goodies for attackers, and the default implementation lacks things you would expect, like encryption at rest (by default, Secrets are only base64-encoded, not encrypted).
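
To see how thin that protection is, here’s a sketch (the mysecret name and hunter2 value are placeholders): create a secret, then read it straight back out with nothing more than base64.

kubectl create secret generic mysecret --from-literal=password=hunter2
kubectl get secret mysecret -o jsonpath='{.data.password}' | base64 -d    # prints hunter2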

Random Kubernetes Security Information

Namespace network controls

Namespaces are a great way of segmenting out your pods and other resources, but they don’t really do anything on the networking side by themselves. In fact, to quote the CISA Kubernetes Hardening Guide:

By default, no network policies are applied to Pods or namespaces, resulting in unrestricted ingress and egress traffic within the Pod network

Basically, all your pods can talk to each other by default. Dev? Prod? All of it can just talk 😄.
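
If your cluster runs a network plugin that actually enforces NetworkPolicy (an assumption; without one, these objects are silently ignored), a default-deny policy per namespace is the usual fix. A minimal sketch:

kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: myapp         # apply one per namespace you want locked down
spec:
  podSelector: {}          # empty selector = every pod in the namespace
  policyTypes:
    - Ingress
    - Egress               # deny all traffic in both directions by default
EOF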

Service information in /proc/self/environ

You know that cheeky file that stores environment variables? Well Kubernetes puts some cool stuff in it :).

Funnily enough Microsoft has good docs on this and Kubernetes doesn’t 🤷‍♂️.

Basically, Kubernetes automatically creates environment variables that give the internal IP addresses and ports of the services inside the cluster.
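
From inside any pod you can dump them yourself; for a Service named foo, you’ll find variables like FOO_SERVICE_HOST and FOO_SERVICE_PORT:

# environ is NUL-separated, so swap the separators before grepping
tr '\0' '\n' < /proc/self/environ | grep _SERVICE_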

Kubernetes Goat

One of my favourite community projects is Kubernetes Goat, which is kind of like the Damn Vulnerable Web App or Juice Shop of Kubernetes.

The only problem is that to deploy Kubernetes Goat we either need to use a cloud service (this can be very expensive, but I’ve heard good things about this guide for finding cheaper providers), or we can self-host, which can be hard.

To help you get up and running ASAP, I’ve been working on using automation tools, specifically Ansible and Vagrant, to create a locally-hosted cluster.

The only problem here is that Kubernetes and Arch have breaking changes fairly often and I have to keep updating the scripts 😢.

Getting a test cluster up and running

First we need to grab a copy of the repository:

git clone https://gitlab.com/f3rn0s/kubernetes-cluster && cd "kubernetes-cluster"

Then we need to run Vagrant, which usually takes a little while; it will create the virtual machines and install the various tools needed:

vagrant up

Finally we need to set up networking and join the nodes to the cluster, which is done via a shell script located on the kubemaster machine:

vagrant ssh kubemaster
./deploy.sh

And that’s it, we have a fully™ working cluster.
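
Assuming the deploy script leaves kubectl configured for the vagrant user (a guess based on how these setups usually work), a quick sanity check from kubemaster:

kubectl get nodes     # every node should report Ready
kubectl get pods -A   # the kube-system pods should all be Running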

Deploying Kubernetes Goat onto the test cluster

vagrant ssh kubemaster
cd /vagrant/custom-manifests/kubernetes-goat
bash setup-kubernetes-goat.sh

Then we need to wait a little while for all the different pods to be downloaded, scheduled and run. Once that’s done:

bash access-kubernetes-goat.sh
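
If you’d rather watch the pods come up than guess (and catch anything stuck in ImagePullBackOff), this works from kubemaster:

kubectl get pods -A --watch    # Ctrl-C once everything is Running

The access script’s job, as the name suggests, is to expose the Goat scenarios (via kubectl port-forwards); check its output for the exact URLs to open.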