Some parts of a Kubernetes CI with the Cue language

tl;dr Some notes on an experiment: implementing a webhook web service for a Kubernetes-based CI with a few lines of Cue.

A lot of workflows are moving to Kubernetes, and this includes CI systems. In recent months, several projects have been created to use Kubernetes as a CI engine. Kubernetes is used to run CI jobs, but it is also used to implement the CI itself: CI pipelines are implemented with Custom Resources (CRs) and controllers. CRs allow the user to manipulate CI resources through the Kubernetes API. Controllers are in charge of translating these resources into concrete actions, such as running jobs in the correct order.

From the user's perspective, creating CI pipelines consists of creating Kubernetes resources, which are generally YAML (or JSON) files. If the user wants to reuse existing jobs or compose them, a templating or configuration language is required. Helm, Nix, Jsonnet, Dhall… could be used, as well as Cue. But Cue has some specific features that I will explore in the following.

An important piece of a CI is the web service in charge of managing webhooks. For instance, when a user creates a pull request, GitHub sends an HTTP request through a preconfigured webhook. A webhook is handled by a webserver which creates CI resources based on the request payload (JSON data).

So, the webhook web service has to:

  1. Define Kubernetes resources (the pipeline) from a GitHub event payload. This can be viewed as a pure function from JSON to JSON.

  2. Deploy these resources in the cluster. This is impure since the Kubernetes environment is modified. Note this impure part is trivial since the Kubernetes API is declarative: all the runtime complexity is managed by Kubernetes.

Cue is really interesting in this context because its evaluation is pure (there is no builtin to read environment variables, for instance), while a separate scripting layer allows impure operations.

Let’s see how this webhook server could be implemented with Cue.

// This is the content of the file ci.cue

// The payload sent by GitHub contains an eventType attribute which is
// either "pull-request-create" or "pull-request-close".
payload: {
  eventType: "pull-request-create" | "pull-request-close"
  eventID: int
}

// The skeleton of the Kubernetes resource we want to create when an event is received
taskrunSkel: {
  kind: "TaskRun"
  apiVersion: "tekton.dev/v1alpha1"
  // Here should come the task specification (jobs, steps,...)
}

// The description of a pull-request-create event with its associated resources.
// The resource is just a Tekton TaskRun containing the ID of the event in its name.
eventCreate: {
  eventType: "pull-request-create"
  taskrun: taskrunSkel & {
    metadata: name: "taskrun-pull-request-create-\(payload.eventID)"
  }
}

// The description of a pull-request-close event with its associated resources
eventClose: {
  eventType: "pull-request-close"
  taskrun: taskrunSkel & {
    metadata: name: "taskrun-pull-request-close-\(payload.eventID)"
  }
}

// | is the disjunction operator. A | B creates a value which is either A or B.
// & is the unification operator. A & B creates a value satisfying the constraints of both A and B.
// For better definitions: https://github.com/cuelang/cue/blob/master/doc/ref/spec.md#unification
// The & operator is used here as a kind of selector.
kubernetes: payload & (eventCreate | eventClose)
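
The selector idiom can be isolated in a smaller example. This is only a sketch: the file name and the fields input, caseA, caseB and selected are made up for illustration and are not part of ci.cue.

```cue
// This is the content of a hypothetical file selector.cue

// The value we want to dispatch on.
input: {
  kind: "a"
}

// Two candidate structs, distinguished by their kind field.
caseA: {
  kind:   "a"
  result: "handled by A"
}
caseB: {
  kind:   "b"
  result: "handled by B"
}

// Unifying input with caseB fails ("a" conflicts with "b"), so that branch
// of the disjunction is eliminated and only input & caseA remains.
selected: input & (caseA | caseB)
```

Evaluating this file with cue eval should resolve selected to the struct carrying result: "handled by A", since it is the only branch compatible with input.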

Now let’s imagine a CI event coming from GitHub:

// This is the content of the file payload.cue
// This file is created by the webhook webserver

payload: {
  eventType: "pull-request-create"
  eventID: 42
}

You can now evaluate these two files:

$ cue eval ci.cue payload.cue
kubernetes: {
    eventType: "pull-request-create"
    taskrun: {
        kind:       "TaskRun"
        apiVersion: "tekton.dev/v1alpha1"
        metadata: {
            name: "taskrun-pull-request-create-42"
        }
    }
}

Ok, cool. We have a Kubernetes resource (kubernetes.taskrun) to run a task for this specific GitHub event. This is the pure part of the webhook. Note also that we use the Cue validation mechanism to “pattern match” structures, in order to generate a resource adapted to the source event.
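
The same validation mechanism also rejects events the CI does not know about. As a sketch, assume the webserver wrote the following hypothetical payload file (the file name and the "push" event type are invented for this example):

```cue
// This is the content of a hypothetical file bad_payload.cue

payload: {
  // "push" is neither "pull-request-create" nor "pull-request-close",
  // so it conflicts with the constraint on payload.eventType in ci.cue.
  eventType: "push"
  eventID:   43
}
```

Running cue eval ci.cue bad_payload.cue is expected to fail with a conflict on eventType instead of producing a resource, because no branch of the disjunction unifies with this payload.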

The next step (impure) consists of creating the resource in a cluster. This can be implemented with the scripting layer of Cue. In the following, we define a custom Cue command, apply, which runs kubectl apply on the resource previously created.

// This is the content of the file apply_tool.cue
import (
  "encoding/yaml"
  "tool/exec"
)

command: apply: {
  task: run: exec.Run & {
    cmd:   "kubectl apply --dry-run -f -"
    stdin: yaml.MarshalStream([kubernetes.taskrun])
  }
}

This command is then exposed by the Cue CLI and we can run it:

$ cue cmd apply
taskrun.tekton.dev/taskrun-pull-request-create-42 created (dry run)
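
To inspect what would be sent to kubectl without touching a cluster at all, a second command can print the generated YAML. This is a sketch under assumptions: the file name print_tool.cue and the command name show are hypothetical, and it relies on the cli.Print task from the tool/cli package.

```cue
// This is the content of a hypothetical file print_tool.cue
import (
  "encoding/yaml"
  "tool/cli"
)

// show only prints the resource computed by the pure part (ci.cue + payload.cue).
command: show: {
  task: print: cli.Print & {
    text: yaml.MarshalStream([kubernetes.taskrun])
  }
}
```

It would be invoked with cue cmd show, in the same way as the apply command.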

To summarize, we have defined a Cue command to generate and deploy Kubernetes resources based on a GitHub event. To really implement the webhook web service, we have to imagine a webserver that generates the payload structure (payload.cue above) from the event request payload, and runs a Cue module (defined by the user) to create and deploy resources.

Bonus use case: admission controller

Kubernetes uses RBAC (Role-Based Access Control), but this is sometimes not powerful enough. To implement more complex rules, it is possible to deploy an admission controller: on resource creation, the Kubernetes API server forwards the creation request to an admission controller, which decides whether to allow or reject the request. Basically, an admission controller is just a webhook webserver that validates a submitted JSON document based on rules defined by the user. This is exactly the “pure part” of the previously presented webserver. Note that Gatekeeper is a project doing this with Rego.
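
As an illustration of that pure part, here is a minimal sketch of what such rules could look like in Cue. Everything in it is an assumption made for the example: the file name, the policy, submitted and admitted fields, and the naming rule itself. In a real admission controller, submitted would be filled in by the webserver from the request body.

```cue
// This is the content of a hypothetical file policy.cue

// The rules defined by the user: only TaskRuns following the CI naming
// scheme are accepted.
policy: {
  kind: "TaskRun"
  metadata: name: =~"^taskrun-pull-request-(create|close)-[0-9]+$"
}

// The resource submitted to the Kubernetes API server.
submitted: {
  kind:       "TaskRun"
  apiVersion: "tekton.dev/v1alpha1"
  metadata: name: "taskrun-pull-request-create-42"
}

// Unification succeeds only if submitted satisfies every constraint of policy.
admitted: policy & submitted
```

With this input, cue eval policy.cue should succeed. Changing the name to something outside the naming scheme is expected to make the evaluation fail, which the webserver could translate into a rejected request.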