Introducing cdk8s+: Intent-driven APIs for Kubernetes objects
At AWS, we’ve been exploring new approaches to making it easier to define Kubernetes applications. Last month, we announced the alpha release of cdk8s, an open-source project that enables you to use general-purpose programming languages to synthesize manifests.
Today, I would like to tell you about cdk8s+ (cdk8s-plus), which we believe is the natural next step for this project.
cdk8s+ is a library built on top of cdk8s. It is a rich, intent-based class library for using the core Kubernetes API. It includes handcrafted constructs that map to native Kubernetes objects and expose a richer API with reduced complexity.
To give you an idea of what I mean, here is how you’d define a Deployment and expose it on port 8000 via a Service:
const deployment = new kplus.Deployment(this, 'MyApp', {
  spec: {
    replicas: 3,
    podSpecTemplate: {
      containers: [new kplus.Container({
        image: 'node',
        port: 9000
      })],
    },
  },
});
// this will internally create a service
deployment.expose({port: 8000});
Notice how we didn’t have to configure any selectors, nor did we have to specify the internal port used by the container when exposing our deployment. The snippet above will generate the following YAML manifest:
kind: Deployment
apiVersion: apps/v1
spec:
  replicas: 3
  selector:
    matchLabels:
      cdk8s.deployment: MyAppC6A88652
  template:
    metadata:
      labels:
        cdk8s.deployment: MyAppC6A88652
    spec: <pod-spec-omitted-for-brevity>
---
kind: Service
apiVersion: v1
spec:
  type: ClusterIP
  ports:
    - port: 8000
      targetPort: 9000 # this is the port exposed by container.
  selector:
    cdk8s.deployment: MyAppC6A88652
Later on, we will dive deeper into the API and the considerations we made building it.
cdk8s+ is in alpha
Note: the library is in very early stages of development. As such, it may lack substantial features and may introduce breaking changes between updates. Use it with care and at your own discretion.
All breaking and non-breaking changes will be published in the CHANGELOG.
Getting started
Head over to our GitHub repo and try it out. You’ll find documentation for all the available constructs, as well as a full API spec. We would love to hear what you think is missing, and if you so choose, actively participate in the development.
The library is currently available for TypeScript and Python, with more languages coming soon. Also note that the generated manifests are completely agnostic to the cloud provider you are using. cdk8s+ produces pure Kubernetes files that can be applied to any cluster.
Diving deep
This post will show you how to deploy a real-world Kubernetes application using cdk8s+. To get a complete picture of its benefits, we will develop the application in three different ways: first by directly authoring a YAML manifest, then by using a programming language (TypeScript) and cdk8s to generate a manifest, and finally by using the rich API provided by cdk8s+.
Before we start, let’s introduce a few guiding principles that will help us navigate through the different approaches.
- Desired State: We’d like our application definition to be solely based on a desired state configuration. This is necessary in order to apply infrastructure-as-code best practices, as well as enable GitOps workflows.
- Don’t Repeat Yourself (DRY): Avoid having to repeat any value or definition in multiple locations. This makes our desired state much less sensitive to change.
- Boilerplate: Information that can be inferred, should be inferred. Having to repeatedly apply common configurations makes our application overly complex and more error prone.
- Cognitive Load: Ideally, we should be able to write our application without exactly remembering how to configure each resource. We want the tools to guide us.
- Reusability: Once our application is done, we’d like for it to be easy to share our work with others.
We will go back to these guidelines throughout this post, and see how each approach addresses them.
Okay, we are now ready to get started. First, let’s describe our application:
Construct Catalog Search
Those of you familiar with the constructs ecosystem might have already encountered awscdk.io. It’s a website for discovering constructs and is maintained as an open-source project at https://github.com/construct-catalog/catalog. Today, the catalog simply posts a tweet every time a new CDK construct is published. It then uses Twitter itself as somewhat of a search engine.
If you’re looking for information on how to publish your own construct library, check out Publishing Modules.
We’d like the catalog to provide a “real” search experience with filtering and aggregation capabilities. To do that, every time a new library is published, we are going to index an event to an Elasticsearch cluster, using an Amazon SQS queue in the middle. In addition, we will expose an endpoint that will perform a free text ES query.
So, our application has two components:
- Query Server: An HTTP server accepting requests and performing Elasticsearch queries. (query.js)
- Indexer Worker: A long running poller process that fetches messages from the queue and indexes them to Elasticsearch. (indexer.js)
As inputs, our app will accept a QUEUE_URL and an ELASTICSEARCH_ENDPOINT env variable.
Note that the Elasticsearch cluster is actually created with cdk8s as well using a CRD. The code is available here: elasticsearch.ts.
Assuming the application code has already been written, we now want to deploy it to a Kubernetes cluster.
Construct Catalog Search: Using YAML
As mentioned, we will first write pure k8s YAML.
To get my application code inside a container, I will try to embed it inside a ConfigMap, and later configure my pod to use that ConfigMap:
kind: ConfigMap
metadata:
  name: query-config-map
apiVersion: v1
data:
  query.js: // hmm...
As you can see, I’ve hit my first snag: how do I get my code from query.js to the manifest file?
Kubectl to the rescue
kubectl has native support for creating ConfigMap data from files:
❯ kubectl create configmap query-config-map --from-file=./query.js
configmap/query-config-map created
The next step is to create a Deployment that deploys our query server.
kind: Deployment
metadata:
  name: query-deployment
apiVersion: apps/v1
spec:
  replicas: 3
  template:
    spec:
      volumes:
        - name: query-app-volume
          configMap:
            name: query-configmap
      containers:
        - name: 'query'
          image: 'node:12.18.0-stretch'
          ports:
            - containerPort: 8080
          command: ["node", "query.js"]
          env:
            - name: ELASTICSEARCH_ENDPOINT
              value: https://my.elasticsearch.cluster:9200/
          workingDir: /root
          volumeMounts:
            - mountPath: /root
              name: query-app-volume
So far, I defined a Deployment with a replica count of 3 and specified a pod template. My pods will include a ConfigMap-based Volume that will be mounted to /root.
Let’s apply it to the cluster and see what happens.
❯ kubectl apply -f manifest.yaml
error validating data: ValidationError(Deployment.spec): missing required field "selector" in io.k8s.api.apps.v1.DeploymentSpec;
Whoops, of course, I forgot to apply selectors so that the deployment will be able to find its pods.
This does raise the question: how do I keep the selectors in sync with the labels? Also, since the pod spec is defined in the scope of a deployment, it would make sense for the deployment to simply select all the pods it created.
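One way to guard against this kind of mismatch is a small check that runs before applying the manifest. The sketch below is illustrative only: the `DeploymentLike` type and the `selectorMatchesTemplate` helper are hypothetical names, not part of Kubernetes or cdk8s, and they model only the fields discussed here.

```typescript
// Hypothetical, minimal model of the Deployment fields relevant to selection.
interface DeploymentLike {
  spec: {
    selector?: { matchLabels?: Record<string, string> };
    template: { metadata?: { labels?: Record<string, string> } };
  };
}

// Returns true when every selector label is present, with the same value,
// on the pod template. A missing or empty selector counts as a mismatch.
function selectorMatchesTemplate(deployment: DeploymentLike): boolean {
  const selector = deployment.spec.selector?.matchLabels ?? {};
  const labels = deployment.spec.template.metadata?.labels ?? {};
  const keys = Object.keys(selector);
  return keys.length > 0 && keys.every((key) => labels[key] === selector[key]);
}

// A deployment without a selector, like the one rejected above.
const missingSelector: DeploymentLike = { spec: { template: {} } };

// A deployment whose selector and pod labels agree.
const inSync: DeploymentLike = {
  spec: {
    selector: { matchLabels: { app: 'query' } },
    template: { metadata: { labels: { app: 'query' } } },
  },
};

console.log(selectorMatchesTemplate(missingSelector)); // false
console.log(selectorMatchesTemplate(inSync)); // true
```

A check like this only detects drift after the fact; it does not remove the need to write the labels twice.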
Ok, let’s add selectors and re-apply:
kind: Deployment
metadata:
  name: query-deployment
apiVersion: apps/v1
spec:
  replicas: 3
  selector:
    matchLabels:
      app: query # instruct the deployment to select pods with this label
  template:
    metadata:
      labels:
        app: query # apply a label to the pods
    spec:
      volumes:
        - name: query-app-volume
          configMap:
            name: query-configmap
      containers:
        - name: 'query'
          image: 'node:12.18.0-stretch'
          ports:
            - containerPort: 8080
          command: ["node", "query.js"]
          env:
            - name: ELASTICSEARCH_ENDPOINT
              value: https://my.elasticsearch.cluster:9200/
          workingDir: /root
          volumeMounts:
            - mountPath: /root
              name: query-app-volume
❯ kubectl apply -f manifest.yaml
deployment.apps/query-deployment created
Looks okay, let’s check out our pods:
❯ kubectl get -A pods
NAMESPACE   NAME                                READY   STATUS              RESTARTS   AGE
default     query-deployment-6576f6f795-4kttz   0/1     ContainerCreating   0          2m3s
default     query-deployment-6576f6f795-dqmwd   0/1     ContainerCreating   0          2m3s
default     query-deployment-6576f6f795-f2lqr   0/1     ContainerCreating   0          2m3s
We see 3 pods indeed, but for some reason they have been in ContainerCreating status for a long time. Let’s inspect one of them:
❯ kubectl describe pod query-deployment-6576f6f795-4kttz
....
....
Events:
  Type     Reason       Age                From                         Message
  ----     ------       ----               ----                         -------
  Normal   Scheduled    61s                default-scheduler            Successfully assigned default/query-deployment-6576f6f795-4kttz to kind-control-plane
  Warning  FailedMount  29s (x7 over 61s)  kubelet, kind-control-plane  MountVolume.SetUp failed for volume "query-app-volume" : configmap "query-configmap" not found
Uh oh, did you spot the typo? We used query-configmap instead of query-config-map as the ConfigMap name.
This raises another question: since my ConfigMap is created out of band, how do I keep these values in sync?
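As a rough illustration of what keeping these values in sync by hand involves, the sketch below compares the ConfigMap names referenced by volumes against the names created out of band. The `VolumeLike` type and the `danglingConfigMaps` helper are hypothetical and exist only for this example.

```typescript
// Hypothetical shape: only the volume fields used in this post.
interface VolumeLike {
  name: string;
  configMap?: { name: string };
}

// Lists ConfigMap names that volumes reference but that are not in `known`.
function danglingConfigMaps(volumes: VolumeLike[], known: string[]): string[] {
  const missing: string[] = [];
  for (const volume of volumes) {
    const name = volume.configMap?.name;
    if (name !== undefined && known.indexOf(name) === -1) {
      missing.push(name);
    }
  }
  return missing;
}

// Names created out of band with `kubectl create configmap`.
const created = ['query-config-map'];

// The volume from the manifest above, including the typo.
const volumes: VolumeLike[] = [
  { name: 'query-app-volume', configMap: { name: 'query-configmap' } },
];

console.log(danglingConfigMaps(volumes, created)); // [ 'query-configmap' ]
```

The list of created names still has to be maintained separately, which is exactly the duplication the question is about.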
Okay, let’s fix that and reapply:
kind: Deployment
metadata:
  name: query-deployment
apiVersion: apps/v1
spec:
  replicas: 3
  selector:
    matchLabels:
      app: query
  template:
    metadata:
      labels:
        app: query
    spec:
      volumes:
        - name: query-app-volume
          configMap:
            name: query-config-map # this was our typo
      containers:
        - name: 'query'
          image: 'node:12.18.0-stretch'
          ports:
            - containerPort: 8080
          command: ["node", "query.js"]
          env:
            - name: ELASTICSEARCH_ENDPOINT
              value: https://my.elasticsearch.cluster:9200/
          workingDir: /root
          volumeMounts:
            - mountPath: /root
              name: query-app-volume
❯ kubectl apply -f manifest.yaml && sleep 10 && kubectl get -A pods
deployment.apps/query-deployment configured
NAMESPACE   NAME                                READY   STATUS    RESTARTS   AGE
default     query-deployment-6d95544db6-57rz2   1/1     Running   0          18s
default     query-deployment-6d95544db6-bz4qx   1/1     Running   0          15s
default     query-deployment-6d95544db6-qf6br   1/1     Running   0          16s
Cool, all seems to be in order!
The final step is to expose the pods as a service, so that they can be queried through a single network address. For the sake of simplicity, I’ll use a ClusterIP service, which is also the Kubernetes default.
kind: Service
apiVersion: v1
metadata:
  name: query-service
spec:
  type: ClusterIP
  ports:
    - port: 8000
      targetPort: 8080 # damn, another duplication
  selector:
    app: query # let's not forget the selector this time...
---
kind: Deployment
metadata:
  name: query-deployment
apiVersion: apps/v1
spec:
  replicas: 3
  selector:
    matchLabels:
      app: query
  template:
    metadata:
      labels:
        app: query
    spec:
      volumes:
        - name: query-app-volume
          configMap:
            name: query-config-map
      containers:
        - name: 'query'
          image: 'node:12.18.0-stretch'
          command: ["node", "query.js"]
          env:
            - name: ELASTICSEARCH_ENDPOINT
              value: https://my.elasticsearch.cluster:9200/
          ports:
            - containerPort: 8080
          workingDir: /root
          volumeMounts:
            - mountPath: /root
              name: query-app-volume
One more question: how do I make sure the selector the service uses matches the label of the pods? The same goes for the target port: it has to be the same as the container port.
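To make the two constraints concrete, here is a minimal sketch of a check that a Service selector and target port agree with the pods they are meant to reach. The `PodTemplateLike` and `ServiceLike` types and the `checkService` helper are hypothetical, trimmed-down models invented for this example.

```typescript
// Hypothetical, trimmed-down shapes of the two objects in the manifest above.
interface PodTemplateLike {
  labels: Record<string, string>;
  containerPorts: number[];
}

interface ServiceLike {
  selector: Record<string, string>;
  ports: { port: number; targetPort: number }[];
}

// Collects human-readable problems instead of stopping at the first one.
function checkService(service: ServiceLike, pods: PodTemplateLike): string[] {
  const problems: string[] = [];
  for (const key of Object.keys(service.selector)) {
    const value = service.selector[key];
    if (pods.labels[key] !== value) {
      problems.push(`selector ${key}=${value} does not match the pod labels`);
    }
  }
  for (const { targetPort } of service.ports) {
    if (pods.containerPorts.indexOf(targetPort) === -1) {
      problems.push(`targetPort ${targetPort} is not a container port`);
    }
  }
  return problems;
}

const queryPods: PodTemplateLike = { labels: { app: 'query' }, containerPorts: [8080] };
const queryService: ServiceLike = {
  selector: { app: 'query' },
  ports: [{ port: 8000, targetPort: 8080 }],
};

console.log(checkService(queryService, queryPods)); // []

// Changing the target port without touching the container is reported.
const wrongPort: ServiceLike = { ...queryService, ports: [{ port: 8000, targetPort: 9090 }] };
console.log(checkService(wrongPort, queryPods)); // [ 'targetPort 9090 is not a container port' ]
```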
❯ kubectl apply -f manifest.yaml
service/query-service created
deployment.apps/query-deployment unchanged
If I now port-forward 8000 on my machine, I should get a response from the pods.
Great, the query application is working. All that’s left to do is add the indexer. The indexer specification is basically the same as the query, except it is not exposed by a service. Eventually, we end up with this full manifest file:
kind: Deployment
metadata:
  name: query-deployment
apiVersion: apps/v1
spec:
  replicas: 3
  selector:
    matchLabels:
      app: query
  template:
    metadata:
      labels:
        app: query
    spec:
      volumes:
        - name: query-app-volume
          configMap:
            name: query-config-map
      containers:
        - name: 'query'
          image: 'node:12.18.0-stretch'
          ports:
            - containerPort: 8080
          command: ["node", "query.js"]
          env:
            - name: ELASTICSEARCH_ENDPOINT
              value: https://my.elasticsearch.cluster:9200/
          workingDir: /root
          volumeMounts:
            - mountPath: /root
              name: query-app-volume
---
kind: Deployment
metadata:
  name: indexer-deployment
apiVersion: apps/v1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: indexer
  template:
    metadata:
      labels:
        app: indexer
    spec:
      volumes:
        - name: indexer-app-volume
          configMap:
            name: indexer-config-map
      containers:
        - name: 'indexer'
          image: 'node:12.18.0-stretch'
          command: ["node", "indexer.js"]
          env:
            - name: ELASTICSEARCH_ENDPOINT
              value: https://my.elasticsearch.cluster:9200/
            - name: QUEUE_URL
              value: https://sqs.us-east-1.amazonaws.com/111111111/my-queue
          workingDir: /root
          volumeMounts:
            - mountPath: /root
              name: indexer-app-volume
---
kind: Service
apiVersion: v1
metadata:
  name: query-service
spec:
  type: LoadBalancer
  ports:
    - port: 8000
      targetPort: 8080
  selector:
    app: query
❯ kubectl create configmap query-config-map --from-file=./query.js
❯ kubectl create configmap indexer-config-map --from-file=./indexer.js
Let’s recap, and specifically, focus on our guiding principles:
- ❌ Desired State: Unfortunately, we were unable to define our application solely using a desired state YAML manifest. We had to resort to external imperative kubectl commands.
- ❌ Don’t Repeat Yourself (DRY): We’ve seen several occurrences of having to duplicate and match values across multiple locations in the manifest.
- ❌ Boilerplate: We have to explicitly apply selectors to the pod template and the deployment, even though the template is configured in the scope of the deployment, so the selection labels could be inferred implicitly. The same goes for pod spec volumes.
- ❌ Cognitive Load: Even a simple application such as ours required us to have rather extensive Kubernetes skills. We had to know what selectors are, how to create config maps with kubectl, and how to mount them as volumes onto the container.
- ❌ Reusability: Since deploying our app involves running some custom kubectl commands, sharing it with others becomes tricky. We need to come up with a non-standard packaging and distribution mechanism. Also, our application cannot accept any configuration values, since we don’t have the ability to dynamically generate a manifest.
Construct Catalog Search: Using cdk8s
Next up, we explore the possibility of authoring a manifest file using a general purpose programming language. This is enabled by cdk8s, and truly opens the world of programming languages to infrastructure definitions. We already know exactly what we need to do, so let’s write down the entire application:
import * as k8s from '../../imports/k8s';
import * as cdk8s from 'cdk8s';
import * as fs from 'fs';

const app = new cdk8s.App();
const chart = new cdk8s.Chart(app, 'Search');

// values that appear in more than one place are defined once
const indexerSelectionLabels = { app: 'indexer' };
const querySelectionLabels = { app: 'query' };
const image = 'node:12.18.0-stretch';
const queryVolumeName = 'query-app-volume';
const indexerVolumeName = 'indexer-app-volume';
const queryPort = 8080;

const indexerConfigMap = new k8s.ConfigMap(chart, 'IndexerConfigMap', {
  data: {
    'indexer.js': fs.readFileSync(`${__dirname}/indexer.js`, 'UTF-8'),
  },
});

const queryConfigMap = new k8s.ConfigMap(chart, 'QueryConfigMap', {
  data: {
    'query.js': fs.readFileSync(`${__dirname}/query.js`, 'UTF-8'),
  },
});
new k8s.Deployment(chart, 'IndexerDeployment', {
  spec: {
    selector: {
      matchLabels: indexerSelectionLabels,
    },
    template: {
      metadata: {
        labels: indexerSelectionLabels,
      },
      spec: {
        volumes: [{
          name: indexerVolumeName,
          configMap: {
            name: indexerConfigMap.name,
          },
        }],
        containers: [{
          name: 'indexer',
          image: image,
          command: [ 'node', 'indexer.js' ],
          env: [
            {
              name: 'ELASTICSEARCH_ENDPOINT',
              value: 'https://my.elasticsearch.cluster:9200/',
            },
            {
              name: 'QUEUE_URL',
              value: 'https://sqs.us-east-1.amazonaws.com/111111111/my-queue',
            },
          ],
          workingDir: '/root',
          volumeMounts: [{
            mountPath: '/root',
            name: indexerVolumeName,
          }],
        }],
      },
    },
  },
});
new k8s.Deployment(chart, 'QueryDeployment', {
  spec: {
    selector: {
      matchLabels: querySelectionLabels,
    },
    template: {
      metadata: {
        labels: querySelectionLabels,
      },
      spec: {
        volumes: [{
          name: queryVolumeName,
          configMap: {
            name: queryConfigMap.name,
          },
        }],
        containers: [{
          name: 'query',
          image: image,
          ports: [{
            containerPort: queryPort,
          }],
          command: [ 'node', 'query.js' ],
          env: [
            {
              name: 'ELASTICSEARCH_ENDPOINT',
              value: 'https://my.elasticsearch.cluster:9200/',
            },
          ],
          workingDir: '/root',
          volumeMounts: [{
            mountPath: '/root',
            name: queryVolumeName,
          }],
        }],
      },
    },
  },
});
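new k8s.Service(chart, 'QueryService', {
  spec: {
    selector: querySelectionLabels,
    type: 'LoadBalancer',
    ports: [{
      port: 8000, // same service port as in the YAML manifest
      targetPort: queryPort,
    }],
  },
});

// write the synthesized manifest to disk
app.synth();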
The API itself is basically a mirror of the YAML definition, but since we are now writing code, let’s see where we stand with our guiding principles:
- ✅ Desired State: We no longer need any external kubectl commands. Getting our application code into the manifest is done by simply using fs.readFileSync.
- ✅ Don’t Repeat Yourself (DRY): Any duplicate value is defined once, as a constant, and is reused when needed.
- ❌ Boilerplate: Unfortunately, we still need to remember to apply selectors and configure pod spec volumes, even though this information can be inferred.
- ❌ Cognitive Load: We haven’t solved this problem. We still require the same set of Kubernetes skills to write this application.
- ✅ Reusability: We have two options here. 1) Publish a self-contained YAML manifest generated by running cdk8s synth. 2) Publish an NPM library that may or may not accept configuration values, and delegate the manifest generation to our users. Both ways are standard and simple.
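To illustrate the second reusability option, a library that accepts configuration values can be thought of as a function from properties to manifest objects. The sketch below uses plain TypeScript objects rather than cdk8s constructs so that it stays self-contained; `SearchProps` and `queryDeployment` are hypothetical names, and the manifest is trimmed to the fields that matter here.

```typescript
// Hypothetical configuration surface for the search application.
interface SearchProps {
  elasticsearchEndpoint: string;
  replicas?: number; // defaults to 3, as in the query deployment above
}

// Builds a plain object shaped like the query Deployment. The labels are
// defined once and reused, which is what keeps selector and template in sync.
function queryDeployment(props: SearchProps) {
  const labels = { app: 'query' };
  return {
    apiVersion: 'apps/v1',
    kind: 'Deployment',
    spec: {
      replicas: props.replicas ?? 3,
      selector: { matchLabels: labels },
      template: {
        metadata: { labels },
        spec: {
          containers: [{
            name: 'query',
            image: 'node:12.18.0-stretch',
            command: ['node', 'query.js'],
            env: [{ name: 'ELASTICSEARCH_ENDPOINT', value: props.elasticsearchEndpoint }],
          }],
        },
      },
    },
  };
}

// A consumer supplies only the values that differ between environments.
const manifest = queryDeployment({
  elasticsearchEndpoint: 'https://my.elasticsearch.cluster:9200/',
});
console.log(JSON.stringify(manifest, null, 2));
```

With cdk8s, the same idea would typically be packaged as a chart or construct class that takes these properties in its constructor.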
In our next approach, I’ll show you how to address the two remaining principles through an approach called Intent-driven Design. By focusing on user intent, rather than on system mechanics, we can perform many operations on the user’s behalf, thus greatly reducing cognitive load and boilerplate definitions.
Construct Catalog Search: Using cdk8s+
Just like before, I’ll start by creating a ConfigMap that will contain our source code.
import * as kplus from 'cdk8s-plus';
// create a ConfigMap construct.
const queryConfigMap = new kplus.ConfigMap(this, 'QueryConfigMap');
A quick look at the API that the kplus.ConfigMap construct offers reveals the addFile() method. This conveys our intent of embedding a file in a ConfigMap. It essentially simulates the external kubectl command we used before.
Let’s use it:
queryConfigMap.addFile(`${__dirname}/query.js`);
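To give a feel for what such a method does on our behalf, here is a minimal sketch of the idea. `FileConfigMap` is a hypothetical stand-in written for this example, not the cdk8s+ class, and the real implementation may differ.

```typescript
import * as fs from 'fs';
import * as os from 'os';
import * as path from 'path';

// Hypothetical stand-in for a ConfigMap construct; not the cdk8s+ class.
class FileConfigMap {
  public readonly data: Record<string, string> = {};

  // Stores the file contents under the file's base name, which is also the
  // default key that `kubectl create configmap --from-file` uses.
  public addFile(filePath: string): void {
    this.data[path.basename(filePath)] = fs.readFileSync(filePath, 'utf-8');
  }
}

// Demo: write a throwaway file, then embed it.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'configmap-'));
const file = path.join(dir, 'query.js');
fs.writeFileSync(file, "console.log('query');\n");

const configMap = new FileConfigMap();
configMap.addFile(file);
console.log(Object.keys(configMap.data)); // [ 'query.js' ]
```

Because the file is read when the program runs, the file contents become part of the synthesized manifest and no separate imperative step is needed.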
I now need to create a Volume from that ConfigMap, which is what the Volume.fromConfigMap() function does. All that is left is to create the container and use its mount() method to attach the volume:
const queryContainer = new kplus.Container({
  image: 'node:12.18.0-stretch',
  port: 8080, // the port the query server listens on, as in the YAML version
  command: [ 'node', 'query.js' ],
  env: {
    ELASTICSEARCH_ENDPOINT: kplus.EnvValue.fromValue('https://my.elasticsearch.cluster:9200/'),
  },
});
queryContainer.mount('/root', kplus.Volume.fromConfigMap(queryConfigMap));
Next up, I’ll create a Deployment that will deploy 3 instances of this container:
const queryDeployment = new kplus.Deployment(this, 'QueryDeployment', {
  spec: {
    replicas: 3,
    podSpecTemplate: {
      containers: [queryContainer],
    },
  },
});
But this Deployment is a bit different from its cdk8s counterpart: it is based on user intent. To understand what this means, let’s look at an excerpt from the manifest that cdk8s+ will synthesize for this deployment:
apiVersion: apps/v1
kind: Deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      cdk8s.deployment: QueryDeploymentC6A88652
  template:
    metadata:
      labels:
        cdk8s.deployment: QueryDeploymentC6A88652
You can see that the cdk8s.deployment selection label was automatically added. This is the Deployment construct interpreting our intent, which is that we want this deployment to create and control pods defined by the template property.
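A generated label only works if its value is unique per deployment. The sketch below shows one way such a value could be derived from a construct id and its position in the construct tree. The `selectionLabelValue` helper and the hashing scheme are assumptions made for illustration; cdk8s+ may compute its suffix differently.

```typescript
import * as crypto from 'crypto';

// Hypothetical helper: derives a label value from the construct's id plus a
// short hash of its full path. The hashing scheme is an assumption for this
// example and is not taken from cdk8s+.
function selectionLabelValue(path: string[]): string {
  const id = path[path.length - 1];
  const suffix = crypto
    .createHash('sha256')
    .update(path.join('/'))
    .digest('hex')
    .slice(0, 8)
    .toUpperCase();
  return `${id}${suffix}`;
}

// The same id under different parents yields different values, which makes it
// very unlikely that two deployments select each other's pods by accident.
const a = selectionLabelValue(['Search', 'QueryDeployment']);
const b = selectionLabelValue(['Other', 'QueryDeployment']);
console.log(a, b, a !== b);
```

The value is deterministic for a given path, so repeated synthesis produces the same manifest.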
We now want to expose these 3 pods (i.e. the deployment) through a single network address. The Deployment construct offers an API to do just that:
queryDeployment.expose({port: 8000});
Again, notice what we didn’t have to do:
- We didn’t have to specify any selectors.
- We didn’t have to specify the container port.
Internally, this method will create a Service of type ClusterIP, and apply the correct selectors and ports. This is possible because the deployment already has all this information, and cdk8s+ implicitly uses it on my behalf. If I add the indexer deployment, the full cdk8s+ application definition looks like so:
import * as kplus from 'cdk8s-plus';
import * as cdk8s from 'cdk8s';

const app = new cdk8s.App();
const chart = new cdk8s.Chart(app, 'Search');

const image = 'node:12.18.0-stretch';
const elasticEndpoint = 'https://my.elasticsearch.cluster:9200/';

// constructs are created in the scope of the chart
const queryConfigMap = new kplus.ConfigMap(chart, 'QueryConfigMap');
queryConfigMap.addFile(`${__dirname}/query.js`);
const indexerConfigMap = new kplus.ConfigMap(chart, 'IndexerConfigMap');
indexerConfigMap.addFile(`${__dirname}/indexer.js`);
const queryContainer = new kplus.Container({
  image: image,
  port: 8080, // the port the query server listens on
  command: [ 'node', 'query.js' ],
  env: {
    ELASTICSEARCH_ENDPOINT: kplus.EnvValue.fromValue(elasticEndpoint)
  }
});
const indexerContainer = new kplus.Container({
  image: image,
  command: [ 'node', 'indexer.js' ],
  env: {
    ELASTICSEARCH_ENDPOINT: kplus.EnvValue.fromValue(elasticEndpoint),
    QUEUE_URL: kplus.EnvValue.fromValue('https://sqs.us-east-1.amazonaws.com/111111111/my-queue')
  }
});
queryContainer.mount('/root', kplus.Volume.fromConfigMap(queryConfigMap));
indexerContainer.mount('/root', kplus.Volume.fromConfigMap(indexerConfigMap));
const queryDeployment = new kplus.Deployment(chart, 'QueryDeployment', {
  spec: {
    replicas: 3,
    podSpecTemplate: {
      containers: [queryContainer]
    }
  }
});
queryDeployment.expose({port: 8000, type: kplus.ServiceType.NODE_PORT});
new kplus.Deployment(chart, 'IndexerDeployment', {
  spec: {
    replicas: 3,
    podSpecTemplate: {
      containers: [indexerContainer]
    }
  }
});

// write the synthesized manifest to disk
app.synth();
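The inference behind expose() can be pictured with a small model. The sketch below is a heavily simplified, hypothetical illustration of intent-driven design: `SketchDeployment`, `ContainerLike`, and `ServiceManifest` are invented for this example and are not the cdk8s+ classes.

```typescript
// Hypothetical, heavily simplified model; these are not the cdk8s+ classes.
interface ContainerLike {
  image: string;
  port?: number;
}

interface ServiceManifest {
  kind: 'Service';
  spec: {
    type: string;
    selector: Record<string, string>;
    ports: { port: number; targetPort: number }[];
  };
}

class SketchDeployment {
  // The label is generated once and shared by the selector and any service.
  private readonly labels: Record<string, string>;

  constructor(id: string, private readonly containers: ContainerLike[]) {
    this.labels = { 'cdk8s.deployment': id };
  }

  // The caller states only the intent (the public port). The selector and
  // the target port come from what the deployment already knows.
  public expose(options: { port: number }): ServiceManifest {
    const targetPort = this.containers[0]?.port;
    if (targetPort === undefined) {
      throw new Error('Cannot expose a deployment whose container has no port');
    }
    return {
      kind: 'Service',
      spec: {
        type: 'ClusterIP',
        selector: this.labels,
        ports: [{ port: options.port, targetPort }],
      },
    };
  }
}

const deployment = new SketchDeployment('MyApp', [{ image: 'node', port: 9000 }]);
const service = deployment.expose({ port: 8000 });
console.log(service.spec.ports); // [ { port: 8000, targetPort: 9000 } ]
```

Because the selector and target port are read from the deployment itself, there is no second copy of either value that could drift out of sync.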
Going back to our guiding principles now, let’s see where we’re at:
- ✅ Desired State: Nothing has changed here, we still don’t need any external kubectl commands. In fact, we don’t even need to explicitly use readFileSync, since cdk8s+ will do that for us.
- ✅ Don’t Repeat Yourself (DRY): Still good, our use of a programming language eliminates this issue.
- ✅ Boilerplate: This code embodies the minimal amount of configuration needed to correctly deploy our application. All redundant information, such as selectors and pod spec volumes, is implicitly inferred.
- ✅ Cognitive Load: We managed to greatly reduce the cognitive load since we were guided by intent-based APIs. These APIs reduce the Kubernetes expertise needed to interact with Kubernetes resources.
- ✅ Reusability: Same as before, we either publish an NPM package or a generated YAML manifest (or both).
Summary
We started with a multitude of issues that arise from the limitations of YAML. We saw many of those issues disappear when we used cdk8s to rewrite our YAML definition in a programming language. We also saw that simply using a programming language was not enough, as it still placed a rather high cognitive load on the developer. We then started using the intent-based APIs provided by cdk8s+ and saw much of that load go away. Here is a recap of how well each approach addressed our guiding principles:
- YAML: ❌ Desired State, ❌ DRY, ❌ Boilerplate, ❌ Cognitive Load, ❌ Reusability
- cdk8s: ✅ Desired State, ✅ DRY, ❌ Boilerplate, ❌ Cognitive Load, ✅ Reusability
- cdk8s+: ✅ Desired State, ✅ DRY, ✅ Boilerplate, ✅ Cognitive Load, ✅ Reusability
Head over to our GitHub repo to try cdk8s+. We want to hear about as many use cases as possible and develop the library alongside the community. We also invite you to join the discussion on our Slack channel and on Twitter (#cdk8s, #cdk8s+).
Happy coding!