Loading Kubernetes Types Into Go Objects

At Cloudant, we use GitOps to manage our Kubernetes workloads. One of the advantages of this approach is that we store fully-rendered Kubernetes manifests within GitHub for deployment to our Kubernetes clusters.

One thing that I often find myself doing is writing small one-off tools to answer questions about those manifests. For example, “by deployment, what is the CPU and memory resource allocation, and how much does that cost in terms of worker machine price?”. As a first approximation, this can be discovered by loading up all Deployment manifests from our GitOps repository, then processing their content to discover container resource requests and the number of replicas specified by each Deployment.

I write these ad hoc tools in Go. While I could create the appropriate struct definitions within each program for a YAML deserializer to work with, it is time-consuming to do the object mapping work in every application. I wanted to be able to use pre-created object mappings and load them up inside my applications.

For this, I looked to the Kubernetes Go client. While this client contains object mappings for the various standard Kubernetes resource types, it is designed for use when querying the Kubernetes API server rather than loading YAML files from disk. With a little digging into the client’s guts, you can make this work, however, and save yourself a bunch of time. In addition, any code you do write can be easily converted to using the output of the Kubernetes API server later because it is working with the same types.

Go packages for Kubernetes API types

k8s.io/api is the root package for the standard type objects, such as Deployment. One thing that tripped me up originally is that the types are not defined within that package itself, but instead in packages underneath that namespace. For example, Deployment is in the k8s.io/api/apps/v1 package. The sub-packages follow a pattern of k8s.io/api/GROUP/VERSION, where GROUP and VERSION come from the apiVersion of the resource (apps/v1 for a Deployment). Resources in the core group, whose apiVersion is just v1, live in k8s.io/api/core/v1.
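To make the GROUP/VERSION pattern concrete, here is a small standard-library sketch that guesses the Go package for a given apiVersion. The function name apiPackageFor is hypothetical, and the rule of keeping only the first DNS label of the group (so networking.k8s.io becomes networking) is a heuristic based on how the k8s.io/api directories appear to be laid out, not a documented guarantee. It says nothing useful about CRD groups, which have no package under k8s.io/api.

```go
package main

import (
	"fmt"
	"strings"
)

// apiPackageFor guesses the k8s.io/api sub-package that holds the Go
// types for a manifest's apiVersion. This is a heuristic, not an
// official mapping.
func apiPackageFor(apiVersion string) string {
	// The core group has no group prefix: its apiVersion is just "v1".
	group, version := "core", apiVersion
	if i := strings.Index(apiVersion, "/"); i >= 0 {
		group, version = apiVersion[:i], apiVersion[i+1:]
	}
	// Assumption: only the first DNS label of the group is used as the
	// directory name, e.g. "rbac.authorization.k8s.io" -> "rbac".
	if i := strings.Index(group, "."); i >= 0 {
		group = group[:i]
	}
	return "k8s.io/api/" + group + "/" + version
}

func main() {
	for _, v := range []string{"apps/v1", "v1", "networking.k8s.io/v1"} {
		fmt.Printf("%s -> %s\n", v, apiPackageFor(v))
	}
}
```

The result is only a starting point for finding the right import; it is worth checking that the package really exists at the pinned version.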

Each version of Kubernetes has a tag within the source repository for the Go k8s.io/api package. This is a useful way to refer to the API version you require within your go.mod file, rather than the package version.

To use these object-mappings within your application, add a require to go.mod using the tag to pin the version:

require (
	k8s.io/api v0.19.0
)

In this require line, v0.19.0 refers to Kubernetes API level 1.19. I’m guessing that they chose not to use v1.19.0 because the Go module guidelines expect API stability within a major version. In my experience, this internal API changes quite often with version bumps, and so it makes sense to use the v0.x versioning to capture this expectation.

Loading YAML into Go objects

Now that we know the package for the types, we need to work out how to load YAML into these types. I found that the definitions within k8s.io/api didn’t play well with the usual YAML library that I use, gopkg.in/yaml.v2.

It turns out that the k8s.io/apimachinery package contains stuff to help with loading the YAML, and we can combine that with utilities in k8s.io/client-go to decode YAML read from disk into rich Go types.

Again, we use the git tags for given Kubernetes versions when writing requirements for these two packages into our go.mod file. Because we’re using semi-private implementation packages for Kubernetes, if the versions of each package don’t “match” with each other, my experience is that loading YAML will fail with various odd error messages.

To use these packages, we modify the go.mod for the application to further include references to k8s.io/apimachinery and k8s.io/client-go at the same API level as k8s.io/api:

require (
	k8s.io/api v0.19.0
	k8s.io/apimachinery v0.19.0
	k8s.io/client-go v0.19.0
)

The process of mapping YAML to objects

From k8s.io/apimachinery, we use a UniversalDeserializer to create a runtime.Decoder object which is able to take YAML and deserialize it into the objects that k8s.io/api provides. The runtime.Decoder is generated using a Scheme object, which needs to contain definitions for all the resource types (schemas, perhaps?) that need to be deserialized. By default, a Scheme object contains no definitions.

This is where k8s.io/client-go comes in. To avoid needing to load all the types ourselves, we use k8s.io/client-go/kubernetes/scheme#Codecs. The CodecFactory assigned to this package variable has all the standard Kubernetes types preloaded into its Scheme, and provides a way to create a runtime.Decoder using that Scheme.

Once we have a runtime.Decoder, we use its Decode method to decode YAML buffers into Go objects. Decode returns a triple:

(Object, *schema.GroupVersionKind, error)

The Object is the decoded object, as a k8s.io/apimachinery runtime.Object. This is an interface that needs to be cast (in Go terms, type-asserted) to the appropriate type from k8s.io/api in order to access the resource’s fields. The GroupVersionKind structure helps us to do that, as it fully describes the Kubernetes resource type:

if groupVersionKind.Group == "apps" &&
	groupVersionKind.Version == "v1" &&
	groupVersionKind.Kind == "Deployment" {

	// Cast appropriately
	deployment := obj.(*appsv1.Deployment)

	// And do something with it
	log.Print(deployment.ObjectMeta.Name)
}

Putting it all together

Once we have these packages ready for use, reading the YAML into objects is relatively simple.

Load the YAML file into a buffer.
Split the buffer into documents, because most YAML manifests are in fact many YAML documents in a single file, separated by ---. A naive but effective way to do this is using strings.Split().
Pass each document to k8s.io/apimachinery’s runtime.Decoder, which decodes it into a runtime.Object, the interface implemented by all API types.
Once loaded, work out from the GroupVersionKind returned by Decode which Kubernetes resource kind we have, and cast appropriately.
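The splitting step is the easiest one to get subtly wrong, because strings.Split() on --- also splits inside any value that happens to contain that sequence. A stricter splitter can be sketched using only the standard library. The name splitYAMLDocuments is hypothetical, and as a simplification it only recognises separator lines that consist of exactly --- (ignoring trailing whitespace), so forms such as "--- # comment" are not treated as separators.

```go
package main

import (
	"fmt"
	"strings"
)

// splitYAMLDocuments splits a multi-document YAML stream on lines that
// contain only "---" and drops documents that are empty or whitespace.
func splitYAMLDocuments(data string) []string {
	var docs []string
	var current strings.Builder

	// flush stores the document collected so far, if it has content.
	flush := func() {
		if doc := strings.TrimSpace(current.String()); doc != "" {
			docs = append(docs, doc)
		}
		current.Reset()
	}

	for _, line := range strings.Split(data, "\n") {
		if strings.TrimRight(line, " \t\r") == "---" {
			flush()
			continue
		}
		current.WriteString(line)
		current.WriteString("\n")
	}
	flush()
	return docs
}

func main() {
	manifest := "a: 1\n---\nb: \"x---y\"\n---\n\n---\nc: 3\n"
	for i, doc := range splitYAMLDocuments(manifest) {
		fmt.Printf("document %d: %s\n", i, doc)
	}
}
```

Each returned string can be converted to []byte and handed to Decode in the same way as the output of strings.Split().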

This code brings this together:

package main

import (
	"io/ioutil"
	"log"
	"strings"

	appsv1 "k8s.io/api/apps/v1"
	"k8s.io/client-go/kubernetes/scheme"
)

func main() {

	// Load the file into a buffer
	fname := "/path/to/my/manifest.yaml"
	data, err := ioutil.ReadFile(fname)
	if err != nil {
		log.Fatal(err)
	}

	// Create a runtime.Decoder from the Codecs field within
	// k8s.io/client-go that's pre-loaded with the schemas for all
	// the standard Kubernetes resource types.
	decoder := scheme.Codecs.UniversalDeserializer()

	for _, resourceYAML := range strings.Split(string(data), "---") {

		// skip empty or whitespace-only documents, `Decode` will fail on them
		if len(strings.TrimSpace(resourceYAML)) == 0 {
			continue
		}

		// - obj is the API object (e.g., Deployment)
		// - groupVersionKind is a generic object that allows
		//   detecting the API type we are dealing with, for
		//   accurate type casting later.
		obj, groupVersionKind, err := decoder.Decode(
			[]byte(resourceYAML),
			nil,
			nil)
		if err != nil {
			log.Print(err)
			continue
		}

		// Figure out from `Kind` the resource type, and attempt
		// to cast appropriately.
		if groupVersionKind.Group == "apps" &&
			groupVersionKind.Version == "v1" &&
			groupVersionKind.Kind == "Deployment" {
			deployment := obj.(*appsv1.Deployment)
			log.Print(deployment.ObjectMeta.Name)
		}
	}
}

Remaining work: loading extension types (CRDs)

I haven’t worked out how to add CRD types to this process yet, which is a gap because we use CRDs with our own operators to deploy more complicated parts of our stack. For now, therefore, they are missing from the analysis my ad hoc tools produce.

In the end, you have to call AddKnownTypes on a given Scheme to add types, such as in the code for Sealed Secrets. This means you either need to reference or copy in the type definitions for your types from appropriate packages – much like client-go does when registering the types from k8s.io/api.

But, as yet, I’ve not completed this final part of the story.

Summary

This pattern is particularly useful when you are deploying using a GitOps pattern and thus have full manifests stored outside of Kubernetes which you wish to analyse. I have used it in several places; indeed the reason I’m writing it up here is to capture the instructions as much for myself as for other people, and also to force myself to dig into the code further rather than just copy-paste code from other places.

Most of the discoveries for this post come from these client-go GitHub issues:

Support for parsing K8s yaml spec into client-go data structures contains a lot of useful source code snippets for this task.
Building with go modules suddenly returns ‘cannot find module providing package’ contains details of how to safely specify the require in go.mod.