When you deploy new versions of your applications, you should use a strategy that minimizes the potential impact on your users. One option for this is blue/green deployments. You can implement blue/green deployments in Kubernetes using plain Kubernetes objects like services and deployments, and the necessary steps can be orchestrated with Terraform.
In this blog post, we will learn about implementing blue/green deployments in Kubernetes with Terraform.
Kubernetes is the industry standard for container orchestration. It manages your applications running as containers on a cluster, which consists of the Kubernetes control plane and worker nodes, where your containers are scheduled and placed.
Managing workloads on Kubernetes is declarative in nature. You tell the Kubernetes cluster what you want, and it is up to the cluster to bring your desired state to reality. This is similar to how Terraform works, and the two technologies work well together.
Kubernetes is open-source and supported by the Linux Foundation and the Cloud Native Computing Foundation (CNCF). It was first released in 2014 and was inspired by a similar technology called Borg, which Google had been running in their production environment for some years.
Today, Kubernetes is one of the most popular platforms for running applications at scale, and it is likely to continue to have this role for some time ahead. Look at the cloud-native landscape from the CNCF to get an idea of the vast number of tools available in the Kubernetes ecosystem.
Starting with Kubernetes from scratch is a significant undertaking. Running your own Kubernetes clusters at scale is a challenge. However, there are managed Kubernetes offerings from all major cloud providers. Amazon Web Services has the Elastic Kubernetes Service (EKS), Google Cloud has Google Kubernetes Engine (GKE), and Microsoft Azure has Azure Kubernetes Service (AKS).
Using a managed service simplifies getting started with Kubernetes and allows you to focus more on the applications you run instead of cluster management.
How to manage Kubernetes using Terraform?
Kubernetes exposes an API that allows you to manage all aspects of the Kubernetes environment. The existence of an API usually also means a Terraform provider is targeting this API. In this case, there is a Kubernetes provider for Terraform and a Helm provider.
In this blog post, we will focus on the Kubernetes provider.
As with any other provider, you must specify that you will use the Kubernetes provider in your Terraform configuration. You do this in the required_providers block inside your terraform block:
terraform {
  required_providers {
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "~> 2.37.1"
    }
  }
}
You must also configure the Kubernetes provider with credentials and connection details for your target cluster.
For the example in this blog post, we will use a local Kubernetes cluster created with minikube. We will configure the provider to use the local Kubernetes configuration file for authentication and cluster details:
provider "kubernetes" {
  config_path = "~/.kube/config"
}
The provider offers many additional configuration options. To discover all the possibilities, read the documentation for the Kubernetes provider.
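For example, if you are running against a managed cluster, you could pass the cluster details to the provider directly instead of relying on a kubeconfig file. The following is a minimal sketch, assuming the endpoint, token, and CA certificate are supplied through hypothetical input variables, for instance from the Terraform configuration that creates the cluster:
# Placeholder inputs for the cluster details; in practice, these would come
# from the module or cloud resources that create the cluster.
variable "cluster_endpoint" { type = string }
variable "cluster_token" {
  type      = string
  sensitive = true
}
variable "cluster_ca" { type = string } # base64-encoded CA certificate

# Sketch: configure the provider with explicit connection details instead of
# pointing it at a local kubeconfig file.
provider "kubernetes" {
  host                   = var.cluster_endpoint
  token                  = var.cluster_token
  cluster_ca_certificate = base64decode(var.cluster_ca)
}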
Blue/green deployments use two identical environments: blue (live) and green (staging or new version). Traffic initially routes to blue. After deploying the new version to green and validating it, traffic is switched to green. If issues arise, rollback is fast by redirecting traffic back to blue.
In a blue/green deployment, you will have two environments deployed side by side. Only one environment will be receiving production traffic at any given time. The initial environment is the blue environment that runs your current production version of your application.
A new version of your application is then deployed alongside it, while the current version continues to serve production traffic. This new version is the green environment. At first, no production traffic is sent to the green environment, so you can run tests targeting the new version without any fear of affecting your production users.
Once you are satisfied with the new version of your application, you can switch the production traffic from the blue to the green environment.
After this switch has been performed, you should keep the blue environment available for some time, until you are confident that the new green environment handles the production traffic without issues. If any issues appear in your new green environment, you can switch the traffic back to the blue environment.
Blue/green deployments are attractive because they allow you to test the new version in the production environment without sending any production traffic to it, and to roll back quickly in case of issues.
A similar deployment strategy is called canary deployments.
The idea is similar to blue/green deployments, but in the canary deployment, you would start by allowing a small subset of traffic to reach the new version of your application. Then, you would slowly increase the percentage of traffic sent to the new version until you reach 100%.
Throughout the process, you will monitor the new application version’s behavior and act if any of the metrics indicate an issue.
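With only native Kubernetes objects, a rough canary split can be approximated by giving the pods of both versions a shared label, pointing a single service at that label, and controlling the traffic percentage through the replica counts, because the service balances traffic roughly evenly across all matching pods. The sketch below only illustrates that idea; the names, labels, images, and replica numbers are illustrative and are not part of the blue/green example built later in this post:
# Approximate canary with native objects: one service selects the shared
# "app" label, so traffic splits roughly by pod count (~90/10 here).
resource "kubernetes_service" "canary_demo" {
  metadata {
    name = "canary-demo"
  }

  spec {
    selector = {
      app = "canary-demo" # matches pods from both deployments
    }

    port {
      port        = 80
      target_port = 5000
    }
  }
}

resource "kubernetes_deployment" "canary_demo_stable" {
  metadata {
    name = "canary-demo-stable"
  }

  spec {
    replicas = 9 # ~90% of the pods serve the stable version

    selector {
      match_labels = {
        app     = "canary-demo"
        version = "stable"
      }
    }

    template {
      metadata {
        labels = {
          app     = "canary-demo"
          version = "stable"
        }
      }

      spec {
        container {
          name  = "app"
          image = "example-app:v1" # illustrative image name
        }
      }
    }
  }
}

# A second deployment for the canary version would look the same, with
# version = "canary", image "example-app:v2", and replicas = 1 (~10%).
Increasing the canary deployment's replica count (and decreasing the stable one's) gradually shifts more traffic to the new version.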
Read more: 8 Different Types of Kubernetes Deployment Strategies
In the following example, we will look at implementing blue/green deployments in Kubernetes with Terraform. Note that there are third-party tools you can use to implement and orchestrate blue/green deployments.
However, in the following, we will only use native Kubernetes objects (deployments and services) to achieve this.
Step 1. Prepare a Kubernetes cluster
As mentioned earlier, we will be using a local Kubernetes cluster running on minikube.
If you want to follow along, you can read the Minikube documentation for how to install and get started with this platform.
The following steps will be the same for any Kubernetes cluster type you use. However, details about how to access the applications running on minikube differ from those of other types of Kubernetes clusters.
Step 2. Deploy the initial application version (blue)
We start by preparing the initial application version.
In this demo, we will use a simple Python web application that runs a Flask web server. The application has a single path (the root path “/”) that responds with a static message.
Place the following Python code in a file named app.py:
from flask import Flask

app = Flask(__name__)


@app.route("/", methods=["GET"])
def home():
    return "Spacelift App V1"


if __name__ == "__main__":
    app.run(debug=True, host="0.0.0.0", port=5000)
There are two important details to note about this application:
- A request sent to the root path “/” will respond with a static message of “Spacelift App V1”.
- The application listens on port 5000.
Since we are working with Python, we also create a requirements.txt file that lists the dependencies to install for this application:
Flask
We need to package this application into a Docker image that we can run as a container on our Kubernetes cluster. Create a new file named Dockerfile next to the Python application files and place the following contents in it:
FROM python:3.9-slim-buster
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 5000
CMD ["python", "app.py"]
In your terminal, go to the directory where you have the files you just created and build the Docker image (we are using Docker, but other similar tools would also work, e.g., Podman):
$ docker build -t spacelift-app:v1 .
Here we give the image a tag of v1 to indicate that this is version 1 of our application.
We are now ready to provision our application to Kubernetes using Terraform.
Create a new file named main.tf where all our Terraform configuration will go. Use the terraform and provider configuration blocks shown earlier in this blog post.
We begin by adding a Kubernetes namespace named spacelift where this application and all related resources will go:
resource "kubernetes_namespace" "spacelift" {
  metadata {
    name = "spacelift"
  }
}
In this namespace, we will first create a Kubernetes service which will be the entrypoint for the application:
resource "kubernetes_service" "default" {
  metadata {
    name      = "spacelift-app"
    namespace = kubernetes_namespace.spacelift.metadata[0].name
  }

  spec {
    type = "NodePort"

    selector = {
      app = "spacelift-app-v1"
    }

    port {
      port        = 80
      target_port = 5000
    }
  }
}
Notice how the Kubernetes service uses the selector app = "spacelift-app-v1". It listens on port 80 and forwards traffic to target port 5000, which our application listens on.
Finally, we add the deployment resource for version 1 of the application:
resource "kubernetes_deployment" "v1" {
  metadata {
    name      = "spacelift-app-v1"
    namespace = kubernetes_namespace.spacelift.metadata[0].name
  }

  spec {
    replicas = 3

    selector {
      match_labels = {
        app = "spacelift-app-v1"
      }
    }

    template {
      metadata {
        labels = {
          app = "spacelift-app-v1"
        }
      }

      spec {
        container {
          name              = "app"
          image             = "spacelift-app:v1"
          image_pull_policy = "Never"
        }
      }
    }
  }
}
Note the following details of the deployment:
- The application runs in three instances (replicas = 3).
- It has the required label app = "spacelift-app-v1" that the Kubernetes service is looking for.
- It uses the spacelift-app:v1 image that we built earlier.
We have configured the image pull policy to Never because we want to use the local image we built earlier. You should use a policy that is appropriate in your context.
For minikube to find locally built Docker images, your shell's Docker CLI must be pointed at minikube's Docker daemon before you build the images. You can set the required environment variables using the following command:
$ eval $(minikube docker-env)
We are now ready to provision version 1 of our application.
Run through terraform init, terraform plan, and terraform apply to provision the resources to the Kubernetes cluster. Once the apply is complete, we can verify that the application is up and running:
$ kubectl get pods --namespace spacelift --show-labels
NAME                               READY   STATUS    RESTARTS   AGE   LABELS
spacelift-app-v1-585ffbb5b-9krzw   1/1     Running   0          61s   app=spacelift-app-v1
spacelift-app-v1-585ffbb5b-bwspr   1/1     Running   0          61s   app=spacelift-app-v1
spacelift-app-v1-585ffbb5b-nnzvj   1/1     Running   0          61s   app=spacelift-app-v1
We can also verify that the Kubernetes service has been created:
$ kubectl get services -n spacelift
NAME            TYPE       CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
spacelift-app   NodePort   10.98.41.218   <none>        80:30971/TCP   2m5s
To access the application through the Kubernetes service in a local browser, we can use the following minikube command:
$ minikube service spacelift-app -n spacelift --url
http://127.0.0.1:60671
The command outputs a localhost web address. Open this address in your browser, and you should see the application's “Spacelift App V1” message.
We are happy with what we have achieved so far!
Step 3. Deploy the updated application version (green)
After some time, we realized we could make the message returned from the application look fancier.
We plan to release a new version of the application that includes a spaceship in the message. However, we want to make sure we test the application thoroughly before we send our production traffic to it. This is a great use case for a blue/green deployment strategy.
We begin by updating the application source code in app.py to include the required change:
from flask import Flask

app = Flask(__name__)


@app.route("/", methods=["GET"])
def home():
    return "Spacelift App V2 🚀"


if __name__ == "__main__":
    app.run(debug=True, host="0.0.0.0", port=5000)
With the new Python code in place, we build the new application version as a Docker image using the same Dockerfile as before:
$ docker build -t spacelift-app:v2 .
This time the Docker image is tagged as v2.
In our Terraform code in main.tf we add a new deployment resource for the new application version:
resource "kubernetes_deployment" "v2" {
  metadata {
    name      = "spacelift-app-v2"
    namespace = kubernetes_namespace.spacelift.metadata[0].name
  }

  spec {
    replicas = 3

    selector {
      match_labels = {
        app = "spacelift-app-v2"
      }
    }

    template {
      metadata {
        labels = {
          app = "spacelift-app-v2"
        }
      }

      spec {
        container {
          name              = "app"
          image             = "spacelift-app:v2"
          image_pull_policy = "Never"
        }
      }
    }
  }
}
Note the following details of the new deployment resource:
- This is a new resource; we do not edit the existing deployment resource. Both will coexist to begin with.
- The deployment uses the label app = "spacelift-app-v2".
- It uses the new container image spacelift-app:v2.
- We create the same number of replicas (3) for this deployment as for the one currently serving production traffic. This is an important detail, since we will switch over all traffic at once after we are done testing the new version, and the new version must be able to handle the full load.
If needed, we could add a Kubernetes service resource dedicated to this deployment. An important detail at this point is that we have not yet updated the Kubernetes service resource we previously created to target the new deployment.
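For example, a test-only service that targets just the green (v2) pods could look like the sketch below. The service name spacelift-app-v2-test is an assumption for illustration; it would give you a stable endpoint for testing the new version instead of port-forwarding to individual pods, and you would remove it again once testing is finished:
# Sketch: an optional, test-only service that targets just the green (v2)
# pods. The name "spacelift-app-v2-test" is illustrative.
resource "kubernetes_service" "v2_test" {
  metadata {
    name      = "spacelift-app-v2-test"
    namespace = kubernetes_namespace.spacelift.metadata[0].name
  }

  spec {
    type = "NodePort"

    selector = {
      app = "spacelift-app-v2"
    }

    port {
      port        = 80
      target_port = 5000
    }
  }
}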
After going through another terraform plan and terraform apply, we can verify that the new application version is up and running together with the version currently serving traffic:
$ kubectl get pods --namespace spacelift --show-labels
NAME                                READY   STATUS    RESTARTS   AGE     LABELS
spacelift-app-v1-585ffbb5b-9krzw    1/1     Running   0          7m40s   app=spacelift-app-v1
spacelift-app-v1-585ffbb5b-bwspr    1/1     Running   0          7m40s   app=spacelift-app-v1
spacelift-app-v1-585ffbb5b-nnzvj    1/1     Running   0          7m40s   app=spacelift-app-v1
spacelift-app-v2-65cb9757db-2cnls   1/1     Running   0          88s     app=spacelift-app-v2
spacelift-app-v2-65cb9757db-5srp8   1/1     Running   0          88s     app=spacelift-app-v2
spacelift-app-v2-65cb9757db-x5n9g   1/1     Running   0          88s     app=spacelift-app-v2
This is when we would run all required tests on the new version to determine if it behaves as intended. For instance, to verify that the updated message is returned, we can use the following command to directly access one of the new pods:
$ kubectl port-forward spacelift-app-v2-65cb9757db-x5n9g 5000:5000 \
--namespace spacelift
If we browse to http://127.0.0.1:5000, we see the new version of the message.
Step 4. Switch the production traffic to the new (green) version
We are done testing the new application version and are ready to switch the production traffic to this version instead of the current (blue) version.
Update the kubernetes_service resource in main.tf to target the label for the new application version:
resource "kubernetes_service" "default" {
  metadata {
    name      = "spacelift-app"
    namespace = kubernetes_namespace.spacelift.metadata[0].name
  }

  spec {
    type = "NodePort"

    selector = {
      app = "spacelift-app-v2" # update this
    }

    port {
      port        = 80
      target_port = 5000
    }
  }
}
Run through terraform plan and terraform apply to update the service resource.
If you still have the service exposed through minikube, you can monitor the switch by refreshing the page until you see the new application version.
Remember not to delete the initial (blue) deployment yet. You want to be able to switch back to the old version in case you discover an issue with your new application version once production traffic is directed to it.
Step 5. Decommission the old application version
The new application version looks great, and we are ready to decommission the old application version.
Delete the initial deployment resource from main.tf:
# resource "kubernetes_deployment" "v1" {
# metadata {
# name = "spacelift-app-v1"
# namespace = kubernetes_namespace.spacelift.metadata[0].name
# }
# … details omitted …
# }
Run through one final terraform plan and terraform apply to remove the deployment from your Kubernetes cluster. Once completed, verify that you only have the new version running:
$ kubectl get pods --namespace spacelift --show-labels
NAME                                READY   STATUS    RESTARTS   AGE   LABELS
spacelift-app-v2-65cb9757db-2cnls   1/1     Running   0          15m   app=spacelift-app-v2
spacelift-app-v2-65cb9757db-5srp8   1/1     Running   0          15m   app=spacelift-app-v2
spacelift-app-v2-65cb9757db-x5n9g   1/1     Running   0          15m   app=spacelift-app-v2
That was it! We have successfully implemented a blue/green deployment in Kubernetes with Terraform.
Best practices include:
Automate the blue/green deployment process
In this blog post, we manually stepped through the blue/green deployment to understand how it works.
In a production scenario, you should automate the process. All the steps that we did manually can be automated. The simplest approach is to use a single script that performs the necessary steps, but you can also use your normal workflows through a CI/CD system or IaC platform (e.g., Spacelift).
Automation increases the likelihood of successful deployments or automated rollbacks if required.
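One simple way to make the switch easy to automate (and to roll back) is to drive the service selector from a single Terraform variable, so a pipeline only has to change one value and run terraform apply. This is a minimal sketch of that idea; the variable active_version is a hypothetical addition and not part of the configuration shown earlier:
# Sketch: a single variable controls which deployment receives traffic.
variable "active_version" {
  type    = string
  default = "v1" # set to "v2" to switch traffic to the green deployment
}

resource "kubernetes_service" "default" {
  metadata {
    name      = "spacelift-app"
    namespace = kubernetes_namespace.spacelift.metadata[0].name
  }

  spec {
    type = "NodePort"

    selector = {
      app = "spacelift-app-${var.active_version}"
    }

    port {
      port        = 80
      target_port = 5000
    }
  }
}
Rolling back then only requires applying again with the previous value, for example with terraform apply -var="active_version=v1".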
Implement continuous testing and improvements
You should continuously improve your deployment process and implement thorough testing to increase confidence in your blue/green deployments with Terraform.
If you encounter an issue during one of your deployments, you should analyze what went wrong and what you can do to prevent the same issue from happening again. This will quickly improve the deployment process and your deployment success rate.
The type of testing you implement for your blue/green deployments will vary based on context and the type of application you are deploying. At the very least, you should have a number of end-to-end tests that run through the typical user interactions that your application supports.
Prepare a rollback plan
Even if you run extensive testing on your new application version, chances are new surprises await when the full production traffic reaches it. To prepare for this, you should always have an automated rollback plan ready if something indicates an issue with the new application version once it has gone live.
How this rollback plan looks will depend on your context. In essence, you should have the reverse change for the Kubernetes service ready (i.e., point the service back at the initial application version) and be prepared to run terraform plan and terraform apply to revert the change if needed.
Define and measure a successful deployment
To know if a rollback should take place or not, you must understand what success looks like.
You should measure your application’s key performance indicators (KPIs) before and after the deployment. If any of these KPIs move in a negative direction, it would be a good idea to roll back to the blue environment.
Typical KPIs include response codes, response times, CPU and memory utilization, number of user clicks on a website, and more. This is where you need to apply your expertise to your context and applications.
Handle stateful applications with care
In this blog post, we saw an example of a blue/green deployment with Terraform for a stateless application. If you have a stateful application with a database or another data storage solution, you must proceed carefully when implementing blue/green deployments.
If the change you are implementing involves changes to how your data is stored in your data store, you might need to deploy it in multiple steps.
Imagine your app uses a PostgreSQL relational database. You have implemented a new feature in your application that involves changes to one of your database tables. The steps you should go through for a blue/green deployment in this situation are:
- Deploy the green version of the application
- Make a backward-compatible update to the database table. If a rollback is required, the blue version must be able to use the table as before
- Switch production traffic to the green version
- Decommission the blue version once you are satisfied with how the green version works with production traffic
- Potentially apply further changes to the database schema that are compatible with the green version (but not necessarily with the blue version)
For some changes, you should go through the steps outlined above multiple times, making incremental updates.
The steps above are specific to relational database technologies. The process could be simplified slightly if you use NoSQL technology.
Spacelift supports both Terraform and Kubernetes (as well as many other IaC tools) and enables users to create stacks based on them. Leveraging Spacelift, you can build CI/CD pipelines to combine them and get the best of each tool. This way, you will use a single tool to manage your Terraform and Kubernetes resources lifecycle, allow your teams to collaborate easily, and add some necessary security controls to your workflows.
You could, for example, deploy Kubernetes clusters with Terraform stacks and then, on separate Kubernetes stacks, deploy your containerized applications to your clusters. With this approach, you can easily integrate drift detection into your Kubernetes stacks and enable your teams to manage all your stacks from a single place.
To see why using Kubernetes and Terraform with Spacelift makes the most sense, check out this article. The code is available here.
If you want to learn more about Spacelift, create a free account or book a demo with one of our engineers.
Blue/green deployments are an excellent way to safely test a new application version before it is introduced to production traffic.
The steps to implement a blue/green deployment with Terraform are:
- Provision the initial (blue) application version, including a Kubernetes namespace, service, and deployment. This requires a first terraform apply.
- Provision the new (green) application version as a separate Kubernetes deployment in the same namespace, with the same number of replicas as the initial version. This requires a second terraform apply.
- After verifying that the new (green) version of the application works as intended, update the Kubernetes service resource to target the new Kubernetes deployment. This requires a third terraform apply.
- Decommission the old (blue) application version once you have verified that the new (green) application version works with the production traffic. This requires the fourth and last terraform apply.
Successful blue/green deployments build on automation, thorough testing and continuous improvement, a rollback plan that is ready to go, and a clear definition of how to measure a successful deployment.
For stateful applications, you may have to implement blue/green deployments with additional steps beyond those outlined above.
Note: New versions of Terraform are placed under the BUSL license, but everything created before version 1.5.x stays open-source. OpenTofu is an open-source version of Terraform that expands on Terraform’s existing concepts and offerings. It is a viable alternative to HashiCorp’s Terraform, being forked from Terraform version 1.5.6.
Terraform management made easy
Spacelift effectively manages Terraform state and more complex workflows, supports policy as code, programmatic configuration, context sharing, drift detection, and resource visualization, and includes many more features.