Kubernetes by Types

It’s relatively easy to find articles online about the basics of Kubernetes that talk about how Kubernetes looks on your servers: that a Kubernetes cluster consists of master nodes (where Kubernetes book-keeping takes place) and worker nodes (where your applications and some system applications run), that to run more stuff you provision more workers, that each pod looks like its own machine, and so on.

But I found a disconnect between that mental image of relatively clean-looking things running on servers and the reams and reams of YAML one must seemingly write to do anything with Kubernetes. Recently, I found the Kubernetes API overview pages. Somehow I’d not really internalised before that the reams of YAML are just compositions of types, like programming in any class-based language.

But they are, because in the end all the YAML you pass into kubectl is just getting kubectl to work with a data model inside the Kubernetes master node somewhere. The types described in the Kubernetes API documentation are the building blocks of that data model, and learning them unlocked a new level of understanding Kubernetes for me.

The data model is built using object composition, and I found a nice way to discover it was to start from a single container object and build out to a running deployment, using the API documentation as much as I could but returning to the prose documentation for examples when I got stuck or, as we’ll see with ConfigMaps, when the API documentation just can’t describe everything you need to know.

Containers

This is our starting point. While the smallest thing that Kubernetes will schedule on a worker is a Pod, the basic entity is the Container, which encapsulates (usually) a single process running on a machine. Looking at the API definition, we can easily see what the allowed values are – for me this was the point where what had previously been seemingly arbitrary YAML fields started to slot together into a type system! Just like other API documentation, suddenly there’s a place where I can see what goes in the YAML rather than copy-pasting things from the Kubernetes prose documentation, tweaking it and then just having to 🤞.

Let’s take a quick look at some fields:

The most important thing for a Container is, of course, the image that it will run. From the Container API documentation, we can look through the table of fields within the Container and see that a string is required for this field.
The documentation says that a name is also required.
Another field that crops up a lot in my copy-pasted YAML is imagePullPolicy. If we look at imagePullPolicy, we can see that it’s also a string, but the documentation also states what the acceptable values are: Always, Never and IfNotPresent. If YAML allowed enums, I’m sure this would be an enum. Anyway, we can immediately see what the allowed values are – this is much easier than trying to find this within the prose documentation!
Finally, let’s take a look at volumeMounts, which is a little more complicated: it’s a field of a new type rather than a primitive value. The new type is VolumeMount and the documentation tells us that this is an array of VolumeMount objects and links us to the appropriate API docs for VolumeMount objects. This was the real moment when I stopped having to use copy-paste and instead was really able to start constructing my YAML – 💪!

The documentation is also super-helpful in telling us where we can put things. Right at the top of the Container API spec, it tells us:

Containers are only ever created within the context of a Pod. This is usually done using a Controller. See Controllers: Deployment, Job, or StatefulSet.

Totally awesome, we now know that we need to put the Container within something else for it to be useful!

So let’s make ourselves a minimal container:

name: haproxy
image: haproxy:2.1.0
imagePullPolicy: IfNotPresent
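volumeMounts:  # An array of VolumeMount objects, so each entry starts with "-"
- name: haproxy-config-volume  # Lowercase DNS label; must match a volume in the containing PodSpec
  mountPath: /usr/local/etc/haproxy/
  readOnly: true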

We can build all this from the API documentation – and it’s easy to avoid the unneeded settings that often come along with copy-pasted examples from random websites on the internet. By reading the documentation for each field, we can also get a much better feel for how this container will behave, making it easier to debug problems later.

Pods

Now that we have our Container, we need to make a Pod so that Kubernetes can schedule HAProxy onto our nodes. From the Container docs, we have a link direct to the PodSpec documentation. Awesome, we can follow that up to our next building block.

A PodSpec has way more fields than a Container! But we can see that the first one we need to look at is containers which we’re told is an array of Container objects. And hey we have a Container object already, so let’s start our PodSpec with that:

containers:
- name: haproxy
  image: haproxy:2.1.0
  imagePullPolicy: IfNotPresent
  volumeMounts:
  - name: haproxy-config-volume  # Must match a volume name in the containing PodSpec
    mountPath: /usr/local/etc/haproxy/
    readOnly: true

Now, we also have that VolumeMount object in our HAProxy container that’s expecting a Volume from the PodSpec. So let’s add that. The Volume API spec should help and from the PodSpec docs we can see that a PodSpec has a volumes field which should have an array of Volume objects.

Looking at the Volume spec, we can see that it’s mostly a huge list of the different types of volumes that we can use, each of which links off to yet another type which describes that particular volume. One thing to note is that the name of the Volume object we create needs to match the name of the VolumeMount in the Container object. Kubernetes has a lot of implied coupling like that; it’s just something to get used to.
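
To make that shape a little more concrete, here is a minimal sketch of a volumes array holding two Volume objects of different types. The names scratch-space, example-credentials and example-secret are made up for illustration. Each entry pairs a name with exactly one volume source field, and that field has a type of its own.

```yaml
volumes:
# An emptyDir volume: scratch space that lasts as long as the Pod does
- name: scratch-space            # hypothetical name
  emptyDir: {}
# A secret volume: files projected from a Secret in the cluster
- name: example-credentials      # hypothetical name
  secret:
    secretName: example-secret   # hypothetical Secret
```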

We’ll use a configMap volume (ConfigMapVolumeSource docs) to mount a HAProxy config. We assume that the ConfigMap contains whatever files that HAProxy needs. Here’s the PodSpec with the volumes field:

containers:
- name: haproxy
  image: haproxy:2.1.0
  imagePullPolicy: IfNotPresent
  volumeMounts:
  - name: haproxy-config-volume  # Must match the volume name below
    mountPath: /usr/local/etc/haproxy/
    readOnly: true
volumes:
- name: haproxy-config-volume
  configMap:
    name: haproxy-config-map  # References a ConfigMap in the cluster

So now what we have is a PodSpec object which is composed from an array of Container objects and an array of Volume objects. To Kubernetes, our PodSpec object is a “template” for making Pods out of: we further need to embed this object inside another object which describes how we want to use this template to deploy one or more Pods to our Kubernetes cluster.

Deployments

There are several ways to get our PodSpec template actually made into a running process on the Kubernetes cluster. The ones mentioned all the way back in the Container docs are the most common:

Deployment: run a given number of Pod resources, with upgrade semantics and other useful things.
Job and CronJob: run a one-time or periodic job that uses the Pod as its executable task.
StatefulSet: a special-case thing where Pods get stable identities.

Deployment resources are most common, so we’ll build one of those. As always, we’ll look to the Deployment API spec to help. An interesting thing to note about Deployment resources is that the docs have a new set of options in the sidebar underneath the Deployment heading – links to the API calls in the Kubernetes API that we can use to manage our Deployment objects. Suddenly we’ve found that Kubernetes has an HTTP API we can use rather than kubectl if we want. Time for our 🤖 overlords to take over!

Anyway, for now let’s keep looking at the API spec for what our Deployments need to look like, whether we choose to pass them to kubectl or to these new shiny API endpoints we just found out about.

Deployment resources are top-level things, meaning that we can create, delete and modify them using the Kubernetes API. Up until now we’ve been working with definitions that need to be composed into higher-level types to be useful. Top-level types all have some standard fields:

apiVersion: this allows us to tell Kubernetes what version of the API we are using to manage this Deployment resource; as in any API, different API versions have different fields and behaviours.
kind: this specifies the kind of the resource, in this case Deployment.
metadata: this field contains lots of standard Kubernetes metadata, and it has a type of its own, ObjectMeta. The key thing we need here is the name field, which is a string.

Specific to a deployment we have just one field to look at:

spec: this describes how the Deployment will operate (e.g., how upgrades will be handled) and the Pod objects it will manage.

If we click kubectl example in the API spec, the API docs show a basic Deployment. From this, we can see the values we need to use for apiVersion, kind and metadata to get us started. A first version of our Deployment looks like this:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: haproxy-load-balancer
spec:
  # TODO

Next we’ll need to look at the DeploymentSpec API docs to see what we need to put into there. From experience, the most common fields here are:

template: a PodTemplateSpec which contains a standard metadata field containing ObjectMeta (the same type as at the top-level of the Deployment!) and a spec field where we finally find a place to put the PodSpec we made earlier. This field is vital, as without it the Deployment has nothing to run!
selector: this field works with the metadata in the template field to tell the Deployment’s controller (the code within Kubernetes that manages Deployment resources) which Pods are related to this Deployment. Typically it references labels within the PodTemplateSpec’s metadata field. The selector documentation talks more about how selectors work; they are used widely within Kubernetes.
replicas: optional, but almost all Deployments have this field; how many Pods should exist that match the selector at all times. 3 is a common value as it works well for rolling reboots during upgrades.

We can add a basic DeploymentSpec with three replicas that uses the app label to tell the Deployment what Pods it is managing:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: haproxy-load-balancer
spec:
  replicas: 3
  selector:
    matchLabels:
      app: haproxy
  template:
    metadata:
      labels:
        app: haproxy
    spec:
      # PodSpec goes here

Finally, here is the complete Deployment built from scratch using the API documentation. While I think it would be pretty impossible to get here from the API documentation alone, once one has a basic grasp of concepts like “I need a Deployment to get some Pods running”, reading the API docs alongside copy-pasting YAML into kubectl is most likely a really fast way of getting up to speed; I certainly wish I’d dived into the API docs a few months before I did!

apiVersion: apps/v1
kind: Deployment
metadata:
  name: haproxy-load-balancer
spec:
  replicas: 3
  selector:
    matchLabels:
      app: haproxy
  template:
    metadata:
      labels:
        app: haproxy
    spec:
      containers:
      - name: haproxy
        image: haproxy:2.1.0
        imagePullPolicy: IfNotPresent
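        volumeMounts:
        - name: haproxy-config-volume  # Must match the volume name below
          mountPath: /usr/local/etc/haproxy/
          readOnly: true
      volumes:  # A sibling of containers within the PodSpec
      - name: haproxy-config-volume
        configMap:
          name: haproxy-config-map  # References a ConfigMap in the cluster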
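
The other controllers mentioned earlier embed a PodSpec in the same way. As a rough sketch of that, and not part of the HAProxy example, here is a minimal Job with its PodSpec under spec.template.spec. The name say-hello and the busybox image are placeholders chosen for illustration.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: say-hello                # hypothetical name
spec:
  template:                      # a PodTemplateSpec, as in the Deployment
    spec:                        # the PodSpec
      restartPolicy: Never       # Jobs only accept Never or OnFailure here
      containers:
      - name: say-hello
        image: busybox:1.36      # placeholder image
        command: ["echo", "hello"]
```

The outer fields differ from a Deployment, but the containers array inside is the same Container type we started with.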

ConfigMaps

For completeness, let’s get a trivial HAProxy configuration and put it inside a ConfigMap resource so this demonstration is runnable. The API documentation for ConfigMap is less helpful than we’ve seen so far, frankly.

We can see ConfigMap objects can be worked with directly via the API, as they have the standard apiVersion, kind and metadata fields we saw on Deployment objects.

HAProxy configuration is a text file, so we can see that it probably goes in the data field rather than the binaryData field, as data can hold any UTF-8 sequence. We can see that data is an object, but further than that there isn’t detail about what should be in that object.

In the end, we need to go and check out the prose documentation on how to use a ConfigMap to understand what to do. Essentially what we find is that the keys used in the data object are used in different ways based on how we are using the ConfigMap. If we choose to mount the ConfigMap into a container — as we do in the PodSpec above — then the keys of the data object become filenames within the mounted filesystem. If, instead, we set up the ConfigMap to be used via environment variables, the keys would become the variable names. So we need to know this extra information before we can figure what to put in that data field.
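
To illustrate the environment-variable case, here is a hedged sketch of a ConfigMap and a Container that pulls in every key as a variable via envFrom. The names app-settings, app and example/app:1.0 are hypothetical. With this usage each key in data becomes a variable name, so the keys should look like environment variable names rather than filenames.

```yaml
# A ConfigMap whose keys are intended to become environment variables
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-settings             # hypothetical name
data:
  LOG_LEVEL: debug
  LISTEN_PORT: "8081"            # values are strings, so numbers are quoted
---
# Fragment of a PodSpec that consumes it (not a standalone resource)
containers:
- name: app                      # hypothetical container
  image: example/app:1.0         # placeholder image
  envFrom:
  - configMapRef:
      name: app-settings
```

The HAProxy example in this article uses the volume form instead, so its keys are filenames.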

The API documentation often requires reading alongside the prose documentation in this manner as many Kubernetes primitives have this use-dependent aspect to them.

So in this case, we add a haproxy.cfg key to the data object, as the HAProxy image we are using by default will look to /usr/local/etc/haproxy/haproxy.cfg for its configuration.

apiVersion: v1
kind: ConfigMap
metadata:
    name: haproxy-config-map  # Lowercase DNS name; must match configMap.name in the PodSpec volume
data:
    haproxy.cfg: |
        defaults
            mode http

        frontend normal
            bind *:80
            default_backend normal

        backend normal
            server app webapp:8081  # Assumes webapp Service

Recall from Just enough YAML that starting an object value with a | character makes all indented text that comes below into a single string, so this ConfigMap ends up with a file containing the HAProxy configuration correctly.
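
As a small illustration of that behaviour, the two keys in this sketch hold the same string. The key names block and quoted are made up. The literal block keeps each newline as written, including a single trailing one, and strips the common leading indentation.

```yaml
# Literal block scalar: newlines are kept as written
block: |
  defaults
      mode http
# The same string written with explicit escapes
quoted: "defaults\n    mode http\n"
```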

Summary

So we now have a simple HAProxy deployment in Kubernetes which we’ve mostly been able to build from reading the API documentation rather than blindly copy-pasting YAML from the internet. We — at least I — better understand what’s going on with all the bits of YAML and it’s starting to feel much less arbitrary. I feel now like I might actually stand a chance of writing some code that calls the Kubernetes API rather than relying on YAML and kubectl. And what’s that code called? An operator! I’d heard the name bandied about a lot, but had presumed some black magic was involved — but nope, it’s just about calls that manipulate objects within the Kubernetes API using the types we’ve talked about above, along with about a zillion other ones, including ones you make up yourself! Obviously you need to figure out how best to manage the objects, but when all is said and done that’s what you are doing.

Anyway, hopefully this has de-mystified some more of Kubernetes for you, dear reader; as I mentioned, understanding these pieces helped me go from a copy-paste-hope workflow towards a much less frustrating experience building up my Kubernetes resources.