Kubernetes (often stylized “K8s”) won the battle of container orchestration tools years ago. Nevertheless, there are still many ways to implement Kubernetes today and make it work with various infrastructures, and many tools—some better maintained than others. Perhaps the most interesting development on that front, though, is that the top cloud providers have decided to release their own managed Kubernetes versions:
- Microsoft Azure offers the Azure Kubernetes Service (AKS)
- AWS offers the Amazon Elastic Kubernetes Service (EKS)
- Google Cloud offers the Google Kubernetes Engine (GKE)
From a DevOps perspective, what do these platforms offer? Do they live up to their promises? How do their creation time and other benchmarks compare? How well do they integrate with their respective platforms, especially their CLI tools? What’s it like maintaining and working with them? Below, we’ll delve into these questions, and more.
Note: For readers who would like the concepts of a Kubernetes cluster explained before they read on, Dmitriy Kononov offers an excellent introduction.
AKS vs. EKS vs. GKE: Advertised Features
We’ve decided to group the different features available for each managed Kubernetes version into silos:
- Global Overview
- Networking
- Scalability and Performance
- Security and Monitoring
- Ecosystem
- Pricing
Note: These details may change over time as cloud providers regularly update their products.
Global Overview
| | AKS | EKS | GKE |
|---|---|---|---|
| Latest Version | 1.15.11 (default) - 1.18.2 (preview) | 1.16.8 (default) | 1.14.10 (default) - 1.16.9 |
| Specific Components | oms-agent, tunnelfront | aws-node | fluentd, fluentd-gcp-scaler, event-exporter, l7-default-backend |
| Kubernetes Control Plane Upgrade | Manual | Manual | Automated (default) or manual |
| Worker Upgrades | Manual | Yes (easy with managed node groups) | Yes: automated and manual, fine-tuning possible |
| SLA | 99.95 percent with availability zone, 99.9 percent without | 99.9 percent for EKS (master), 99.99 percent for EC2 (nodes) | 99.95 percent within a region, 99.5 percent within a zone |
| Native Knative Support | No | No | No (but native Istio install) |
| Kubernetes Control Plane Price | Free | $0.10/hour | $0.10/hour |
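To put the SLA row above in perspective, an availability percentage can be converted into a monthly downtime budget. The sketch below is only illustrative: it assumes a 30-day month, and the function name downtime_minutes is hypothetical, not part of any provider's tooling.

```shell
# Convert an SLA percentage into the allowed downtime, in minutes,
# per 30-day month (assumption: 30 * 24 * 60 = 43,200 minutes).
downtime_minutes() {
  awk -v sla="$1" 'BEGIN { printf "%.1f\n", (100 - sla) / 100 * 30 * 24 * 60 }'
}

# The four SLA levels that appear in the table.
for sla in 99.99 99.95 99.9 99.5; do
  echo "$sla percent -> $(downtime_minutes "$sla") minutes of downtime per month"
done
```

Under these assumptions, the gap between 99.95 and 99.5 percent is roughly the difference between 22 minutes and more than three and a half hours of allowed downtime per month.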
Kubernetes itself was Google’s project, so it makes sense that they were the first to propose a hosted version in 2014.
Of the three being compared here, Azure was next with AKS and has had some time to improve: If you remember acs-engine, which had been used to provision Kubernetes on Azure a few years ago, you will appreciate Microsoft’s effort on its replacement, aks-engine.
AWS was the last one to roll out its own version, EKS, so it sometimes can appear to be behind on the feature front, but it is catching up.
In terms of pricing, of course, things are always moving, and Google decided to join AWS at its price point of $0.10/hour, effective June 2020. Azure is the outlier here, offering the AKS service for free, but it’s unclear how long that may last.
Another main difference lies in the cluster upgrade feature. GKE has the most automated upgrades, and they are turned on by default. AKS and EKS, by contrast, are similar to each other in that both require manual requests to upgrade the master or worker nodes.
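As a rough sketch of what such manual upgrade requests can look like from the command line, the functions below wrap the upgrade commands of the az and aws CLIs. The resource names and target versions are placeholders chosen for illustration, and the run helper with its DRY_RUN switch is an assumption introduced here (not a provider feature) so that commands are printed rather than executed by default.

```shell
# Print commands by default; set DRY_RUN=0 to actually execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

# AKS: request an upgrade to a target Kubernetes version (placeholder).
upgrade_aks() {
  run az aks upgrade --resource-group "$1" --name "$2" --kubernetes-version "$3"
}

# EKS: upgrade the control plane only; node groups are upgraded separately.
upgrade_eks_control_plane() {
  run aws eks update-cluster-version --name "$1" --kubernetes-version "$2"
}

upgrade_aks myResourceGroup myAKSCluster 1.16.9
upgrade_eks_control_plane cluster-test 1.16
```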
Networking
| | AKS | EKS | GKE |
|---|---|---|---|
| Network Policies | Yes: Azure Network Policies or Calico | Need to install Calico | Yes: Native via Calico |
| Load Balancing | Basic or standard SKU load balancer | Classic and network load balancer | Container-native load balancer |
| Service Mesh | None out of the box | AWS App Mesh (based on Envoy) | Istio (out of the box, but beta) |
| DNS Support | CoreDNS customization | CoreDNS + Route53 inside VPC | CoreDNS + Google Cloud DNS |
On the network side of things, the three cloud providers are very close to each other. They all let customers implement network policies with Calico, for example. Concerning load balancing, they all implement their integration with their own load balancer resources and give engineers the choice of what to use.
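For illustration, a network policy is an ordinary Kubernetes manifest, so the same file works on all three providers once a policy engine such as Calico is enabled on the cluster. The sketch below writes a default-deny ingress policy and applies it with kubectl; the file location, the policy name, and the run helper with its DRY_RUN switch are assumptions made for this example.

```shell
# Print commands by default; set DRY_RUN=0 to actually execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Write a policy that denies all ingress traffic to every pod in a namespace.
# $1 = output file, $2 = namespace
write_default_deny() {
  cat > "$1" <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: $2
spec:
  podSelector: {}
  policyTypes:
    - Ingress
EOF
}

policy_file="${TMPDIR:-/tmp}/default-deny-ingress.yaml"
write_default_deny "$policy_file" default
run kubectl apply -f "$policy_file"
```

Without a policy engine on the cluster, the API server accepts the manifest but nothing enforces it, which is why the Network Policies row in the table matters.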
The main difference found here is based on the added value of the service mesh. AKS does not support any service mesh out of the box (although engineers can manually install Istio). AWS has developed its own service mesh called App Mesh. Finally, Google has released its own integration with Istio (though still in beta) that customers can add directly when creating the cluster.
Best bet: GKE
Scalability and Performance
| | AKS | EKS | GKE |
|---|---|---|---|
| Bare Metal Nodes | No | Yes | No |
| Max Nodes per Cluster | 1,000 | 1,000 | 5,000 |
| High Availability Cluster | No | Yes for control plane, manual across AZ for workers | Yes via regional cluster, master and worker are replicated |
| Auto Scaling | Yes via cluster autoscaler | Yes via cluster autoscaler | Yes via cluster autoscaler |
| Vertical Pod Autoscaler | No | Yes | Yes |
| On-prem | Available via Azure ARC (beta) | No | GKE on-prem via Anthos GKE |
Concerning GKE vs. AKS vs. EKS performance and scalability, GKE seems to be ahead. It supports the largest number of nodes (5,000) and offers extensive documentation on how to properly scale a cluster. All the features for high availability are available and are easy to fine-tune. What is more, Google recently released Anthos, a project to create an ecosystem around GKE and its functionalities; with Anthos, you can deploy GKE on-prem.
AWS does have a key advantage, though: It is the only one to allow bare-metal nodes to run your Kubernetes cluster.
As of June 2020, AKS lacks high availability for the master, which is an important aspect to consider. But, as always, that could soon change.
Best bet: GKE
Security and Monitoring
| | AKS | EKS | GKE |
|---|---|---|---|
| App Secrets Encryption | No | Yes, possible via AWS KMS | Yes, possible via Cloud KMS |
| Compliance | HIPAA, SOC, ISO, PCI DSS | HIPAA, SOC, ISO, PCI DSS | HIPAA, SOC, ISO, PCI DSS |
| RBAC | Yes | Yes, and strong integration with IAM | Yes |
| Monitoring | Azure Monitor container health feature | Kubernetes control plane monitoring connected to CloudWatch, Container Insights Metrics for nodes | Kubernetes Engine Monitoring and integration with Prometheus |
In terms of compliance, all three cloud providers are equivalent. However, EKS and GKE provide another layer of security with their embedded key management services.
As for monitoring, Azure and Google Cloud provide their own monitoring ecosystem around Kubernetes. It’s worth noting that the one from Google has been recently updated to use Kubernetes Engine Monitoring, which is specifically designed for Kubernetes.
Azure provides its own container monitoring system, which was originally made for a basic, non-Kubernetes container ecosystem. They’ve added monitoring for some Kubernetes-specific metrics and resources (cluster health, deployments)—in preview mode, as of June 2020.
AWS offers lightweight monitoring for the control plane directly in CloudWatch. To monitor the workers, you can use Kubernetes Container Insights Metrics, provided via a specific CloudWatch agent you can install in the cluster.
Best bet: GKE
Ecosystem
| | AKS | EKS | GKE |
|---|---|---|---|
| Marketplace | Azure Marketplace (but no clear AKS integration) | AWS Marketplace (250+ apps) | Google Marketplace (90+ apps) |
| Infrastructure-as-Code (IaC) Support | Terraform module | | |
| Documentation | Weak but complete and strong community (2,000+ Stack Overflow posts) | Not very thorough but strong community (1,500+ Stack Overflow posts) | Extensive official documentation and very strong community (4,000+ Stack Overflow posts) |
| CLI Support | Complete | Complete, plus special separate tool | |
In terms of ecosystems, the three providers have different strengths and assets. AKS now has very complete documentation around its platform and is second in terms of posts on Stack Overflow. EKS has the fewest posts on Stack Overflow, but benefits from the strength of the AWS Marketplace. GKE, as the oldest platform, has the most posts on Stack Overflow and a decent number of apps on its marketplace, as well as the most comprehensive documentation.
Best bets: GKE and EKS
Pricing
| | AKS | EKS | GKE |
|---|---|---|---|
| Free Usage Cap | $170 worth | Not eligible for free tier | $300 worth |
| Kubernetes Control Plane Cost | Free | $0.10/hour | $0.10/hour (June 2020) |
| Reduced Price (Spot Instance/Preemptible Nodes) | Yes | Yes | Yes |
| Example Price for One Month | $342 (3 D2 nodes) | 3 t3.large nodes | 3 n1-standard-2 nodes |
Concerning the price overall, even with GKE’s move to implement the $0.10/hour price point for any cluster, it remains by far the cheapest cloud. This is thanks to something specific to Google—sustained use discounts, which are applied whenever the monthly usage of on-demand resources meets a certain minimum.
It is important to note that the example price row doesn’t take into account the traffic to the Kubernetes cluster that the cloud provider can charge for.
The reason AWS doesn’t allow the use of their free tier to test an EKS cluster is that EKS requires bigger machines than the tX.micro tier, and EKS hourly pricing is not in the free tier.
Nevertheless, it can still be economical to test any of these managed Kubernetes options with a decent load using the spot/preemptible nodes of each cloud provider—that tactic will easily save 80 to 90 percent on the final price. (Of course, it is not recommended to run stateful production loads on such machines!)
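The arithmetic behind these figures is easy to sketch. The snippet below is a back-of-the-envelope estimate only: it assumes 730 hours in a month, applies the 80 to 90 percent discount to the $342 example figure purely for illustration, and uses hypothetical function names. Actual bills depend on region, machine type, and traffic.

```shell
# Monthly control-plane fee: hourly rate * hours in a month.
# Assumption: 730 hours per month (365 days * 24 h / 12).
control_plane_monthly() {
  awk -v rate="$1" -v hours="${2:-730}" 'BEGIN { printf "%.2f\n", rate * hours }'
}

# Price left to pay after a percentage discount (e.g. spot/preemptible nodes).
discounted() {
  awk -v price="$1" -v pct="$2" 'BEGIN { printf "%.2f\n", price * (100 - pct) / 100 }'
}

echo "Control plane at \$0.10/hour: \$$(control_plane_monthly 0.10) per month"
echo "\$342 of nodes at 80 percent off: \$$(discounted 342 80)"
echo "\$342 of nodes at 90 percent off: \$$(discounted 342 90)"
```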
Advertised Features and Google’s Advantage
When looking at the different advertised features online, it seems there is a correlation between how long the managed Kubernetes version has been on the market and the number of features. As mentioned, Google having been the initiator of the Kubernetes project seems to be an undeniable advantage, resulting in better and stronger integration with its own cloud platform.
But AKS and EKS are not to be underestimated as they mature; both can take advantage of their unique features. For example, AWS is the only one to have bare-metal node integration, and also boasts the highest number of applications in its marketplace.
Now that the advertised features for each Kubernetes offering are clear, let’s do a deeper dive with some hands-on tests.
Kubernetes: AWS vs. GCP vs. Azure in Practice
Advertising is one thing, but how do the different platforms compare when it comes to serving production loads? As a cloud engineer, I know the importance of how long it takes to spawn and to take down a cluster when enforcing infrastructure-as-code. But I also wanted to explore the possibilities of each CLI and comment on how easy (or not) each cloud provider makes it to spawn a cluster.
Cluster Creation User Experience
On AKS, spawning a cluster is similar to creating an instance in AWS. Just find the AKS menu and go through a succession of different menus. Once the config is validated, the cluster can be created, a two-step process. It’s very straightforward, and engineers can easily and quickly launch a cluster with the default settings.
Cluster creation is definitely more complex on EKS vs. AKS. By default, AWS first requires a trip to IAM to create a new role for the Kubernetes control plane and assign the engineer to it. It is important to note as well that this cluster creation does not include the creation of the nodes, so the 11 minutes I measured on average covers only the master creation. The node group creation is another step for the administrator, again requiring a role for the workers, with three necessary policies, to be created via the IAM control panel.
For me, the experience of creating a cluster manually is most pleasant on GKE. After finding the Kubernetes Engine in the Google Cloud Console, click to create a cluster. Different categories of settings appear in a menu on the left. Google will prepopulate the new cluster with an easily modifiable default node pool. Last but not least, GKE has the fastest cluster-spawning time, which brings us to the next table.
Time to Spawn a Cluster
| | AKS | EKS | GKE |
|---|---|---|---|
| Size | 3 nodes (Ds2-v2), each having 2 vCPUs, 7 GB of RAM | 3 nodes t3.large | 3 nodes n1-standard-2 |
| Time (m:ss) | Average 5:45 for a full cluster | 11:06 for master plus 2:40 for the node group (totalling 13:46 for a full cluster) | Average 2:42 for a full cluster |
I performed these tests in the same region (Frankfurt and West Europe for AKS) to remove this difference’s possible impact on spawning time. I also tried to select the same size for nodes for the cluster: Three nodes, each having two vCPUs and seven or eight GB of memory, a standard size to run a small load on Kubernetes and start experimenting. I created each cluster three times to compute an average.
In these tests, GKE remained way ahead with a spawning time always under three minutes.
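For readers who want to repeat the measurement, the m:ss bookkeeping can be sketched in a few lines of shell. The helper names below are hypothetical; the only figures used are the EKS timings from the table above.

```shell
# Convert m:ss to seconds; awk reads "06" as decimal 6.
to_seconds() {
  echo "$1" | awk -F: '{ print $1 * 60 + $2 }'
}

# Convert seconds back to m:ss.
to_mss() {
  printf '%d:%02d\n' $(( $1 / 60 )) $(( $1 % 60 ))
}

# Average any number of m:ss timings (whole seconds, rounded down).
average_mss() {
  total=0
  for t in "$@"; do
    total=$(( total + $(to_seconds "$t") ))
  done
  to_mss $(( total / $# ))
}

# EKS: control plane plus node group, as reported in the table.
to_mss $(( $(to_seconds 11:06) + $(to_seconds 2:40) ))   # prints 13:46
```

Passing the three timings of repeated runs to average_mss yields the kind of per-provider average reported in the table.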
Kubernetes: AWS vs. GCP vs. Azure CLI Overview
Not all CLIs are created equal, but in this case, all three CLIs are actually modules of a larger CLI. What’s it like to get up and running with each cloud provider’s CLI toolchain?
AKS CLI (via az)
After installing the az tooling, then kubectl (via az aks install-cli), engineers need to authorize the CLI to communicate with the project’s Azure account. This is a matter of getting the credentials to update the local kubeconfig file via a simple:
az aks get-credentials --resource-group myResourceGroup --name myAKSCluster
Similarly, to create a cluster:
az aks create --resource-group myResourceGroup --name myAKSCluster
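Put together, an end-to-end AKS session might look like the sketch below. The node count and VM size mirror the test setup described earlier; the location value, the function name, and the run helper with its DRY_RUN switch are assumptions added here so that the commands are printed rather than executed unless DRY_RUN=0 is set.

```shell
# Print commands by default; set DRY_RUN=0 to actually execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

# $1 = resource group, $2 = cluster name, $3 = Azure location
create_aks_cluster() {
  rg="$1"; cluster="$2"; location="$3"
  run az group create --name "$rg" --location "$location"
  run az aks create --resource-group "$rg" --name "$cluster" \
    --node-count 3 --node-vm-size Standard_DS2_v2 --generate-ssh-keys
  # Merge the cluster credentials into the local kubeconfig.
  run az aks get-credentials --resource-group "$rg" --name "$cluster"
  # Confirm that the nodes have registered.
  run kubectl get nodes
}

create_aks_cluster myResourceGroup myAKSCluster westeurope
```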
EKS CLI (via aws or eksctl)
On AWS, we find a different approach: there are two different official CLI tools to manage EKS clusters. As always, aws can connect to AWS resources, particularly clusters. Getting credentials into a local kubeconfig can be done via:
aws eks update-kubeconfig --name cluster-test
However, engineers can also use eksctl, developed by Weaveworks and written in Go, to easily create and manage an EKS cluster. A major boon eksctl provides for cloud engineers is that they can combine it with YAML configuration files to create infrastructure-as-code (IaC), since it works with CloudFormation. It’s definitely an asset to consider when integrating an EKS cluster into larger infrastructure on AWS.
Creating a cluster via eksctl is as easy as running eksctl create cluster, with no other parameters required.
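To illustrate the YAML-driven workflow, the sketch below writes a small eksctl ClusterConfig and passes it to eksctl create cluster -f. The cluster name, region, and node group values are illustrative choices that mirror the test setup in this article, and the run helper with its DRY_RUN switch is an assumption added so that nothing is created unless DRY_RUN=0 is set.

```shell
# Print commands by default; set DRY_RUN=0 to actually execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Write a minimal cluster definition with one managed node group.
write_eks_config() {
  cat > "$1" <<'EOF'
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: cluster-test
  region: eu-central-1
managedNodeGroups:
  - name: workers
    instanceType: t3.large
    desiredCapacity: 3
EOF
}

config_file="${TMPDIR:-/tmp}/cluster-test.yaml"
write_eks_config "$config_file"
run eksctl create cluster -f "$config_file"
run aws eks update-kubeconfig --name cluster-test --region eu-central-1
```

Keeping the configuration in a file means it can be reviewed and versioned alongside the rest of the infrastructure code.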
GKE CLI (via gcloud)
For GKE, the steps are very similar: Install gcloud, then authenticate via gcloud init. From there, engineers can create, delete, describe, get credentials for, resize, update, or upgrade a cluster, or list clusters.
The syntax to create a cluster with gcloud is straightforward:
gcloud container clusters create myGCloudCluster --num-nodes=1
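A slightly fuller GKE session, sized like the test cluster used earlier, might look like the following sketch. The zone value, the function name, and the run helper with its DRY_RUN switch are assumptions for illustration; by default the commands are only printed.

```shell
# Print commands by default; set DRY_RUN=0 to actually execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

# $1 = cluster name, $2 = compute zone
create_gke_cluster() {
  cluster="$1"; zone="$2"
  run gcloud container clusters create "$cluster" \
    --zone "$zone" --num-nodes=3 --machine-type=n1-standard-2
  # Write the cluster credentials into the local kubeconfig.
  run gcloud container clusters get-credentials "$cluster" --zone "$zone"
  # Confirm that the nodes have registered.
  run kubectl get nodes
}

create_gke_cluster myGCloudCluster europe-west3-a
```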
AKS vs. EKS vs. GKE: Test Drive Results
In practice, we can see that GKE is certainly the fastest to spin up a basic cluster, in terms of both console simplicity and cluster spawn time. UX-wise, the connect button next to each cluster also makes GKE the most straightforward to connect to.
In terms of CLI tooling, the three cloud providers have implemented similar functionalities; however, the extra tool provided by Weaveworks for EKS deserves a special mention. eksctl is the perfect tool for you to implement infrastructure-as-code on top of your preexisting AWS infrastructure, combining other services with EKS.
Managed Kubernetes Offerings Forge Ahead: AWS vs. GCP vs. Azure
For those just starting in the world of Kubernetes, the go-to implementation for me is GKE, since it’s the most straightforward. It’s easy to set up, it has a simple and fast UX for spawning, and it’s well-integrated into the Google Cloud Platform ecosystem.
Even though AWS was the last to join the race, it has a few undeniable advantages, such as bare metal nodes and the simple fact that it’s integrated with the provider with the largest mind-share.
Finally, AKS has made great progress since its creation. Tooling and feature parity likely won’t take long, and there is still room to innovate along the way. And as with any managed Kubernetes offering, for those already on the parent platform, integration will be a selling point.
Once a team has chosen a Kubernetes cloud provider, it could be interesting to look at other teams’ experiences, particularly failures. These post-mortems are a reflection of real-world cases—always a good starting point for developing one’s own cutting-edge best practices. I look forward to your comments below!
Understanding the basics
Container orchestration is the management and abstraction of all the resources revolving around running containers: configuration, resources, scaling, monitoring, networking, and tooling. Kubernetes is one of the most widely adopted container orchestration tools in the industry.
We need container orchestration to be able to efficiently manage and organize a fleet of containers running on servers. With container orchestration, we can build scalable, resilient, and powerful container-centric systems to deploy any application.
The benefit of using container orchestration with Kubernetes is to provide an abstraction layer on top of servers to run your containers. With Kubernetes, you are able to efficiently manage configuration and resources, and easily scale your infrastructure as needed.
Kubernetes is an open-source tool that has been developed based on Borg, a Google project. It is a production-grade container orchestration tool that creates a layer of abstraction on top of servers to allow the easy management of container scaling, monitoring, resource usage, networking, and configuration.