Case Study

How Calm Achieved Better Production Stability with EKS on AWS

“Chris' senior-level experience with AWS best-practices fast-tracked our company's infrastructure development. His work was a crucial milestone that enabled us to scale our engineering teams and systems in step with our rapid growth.”

Mark Marcantano, Technical Program Manager at Calm

“Calm is really taking leaps forward to ensure their customers have the best possible user experience in terms of stability and performance. Moving to AWS EKS further enabled Calm to focus on product, velocity, and user experience without concerns for the operational overhead and complexities that Kubernetes can introduce.”

Christopher Stobie, Principal DevOps Engineer at Toptal

The Challenge Calm Faced When an Unexpected Outage Brought Their System Down

Many companies are moving their IT systems from virtualization to containers, which lets them abstract away differences in OS distributions and underlying infrastructure. Kubernetes is an open-source container management system that provides mechanisms for deploying, maintaining, and scaling containerized applications. It is the system Calm had put in place for its own operations, using the standard industry tools that existed at the time.

Calm had hired Christopher Stobie, a senior engineer, through Toptal’s AWS DevOps practice to supplement its existing team, which simply didn’t have enough people with the skills needed to manage the systems already in place. On Chris’s second day on the job, what Calm subsequently called the “Great and Terrible Outage” occurred: etcd corruption in the self-managed Kubernetes (k8s) control plane rolled the system back to its legacy infrastructure, with catastrophic consequences. “Calm was running Kubernetes entirely by themselves, which is very hard to do,” notes Chris. “But on my 2nd day at Calm, there was a two-day outage, a Kubernetes failure, and the control plane was corrupted and unrecoverable.”

Despite the dire situation, Chris was able to build a new, fully automated cluster managed by AWS rather than by Calm itself. He built the system to run on EKS, defining the entire networking layer as code in Terraform and making Calm fully functional again. Because of the ease of use of the AWS solution, the migration to EKS took only about three days.

An Immediate Beneficial Outcome

Though prompted by an unexpected emergency, the migration produced results right away. Control plane stability improved immediately, and the networking overhead within the cluster was significantly reduced. In addition, the source-controlled cluster configuration allowed for quick iterations, and the IAM authorization setup was extremely easy.

The metrics for success that most companies running an IT environment would use (uptime, resiliency, and the ability to depend on production environments) saw substantial improvement after the switch to EKS. Previously, the cost of downtime alone was significant: each outage cost approximately $40K per hour because Calm was unable to subscribe users while the system was down. In the six months since the EKS deployment, networking has become much more reliable, and the server now returns responses quickly enough that the DevOps team no longer waits for auto-complete to come back with suggested deployments.

Calm and AWS

Calm is a leading global health and wellness brand with the #1 app for sleep, meditation and relaxation. The company is on a mission to make the world happier and healthier. With hundreds of hours of original audio content, the Calm app helps users cope with some of the most important mental health issues of the modern age including anxiety, stress and insomnia. Apple’s 2017 iPhone App of the Year and Inc’s 19th fastest growing company boasts over 51 million downloads to date, averaging 80,000 new users daily.

The Team

Mark Marcantano

Technical Program Manager, Calm

Mark is a Technical Program Manager at Calm with extensive experience in application development and the implementation of Scrum processes across the organization. As a seasoned DevOps engineer, Mark possesses deep expertise in AWS, Kubernetes, Data Automation, Data Engineering, Deployment Pipelines, Monitoring, Auditing and SecOps.

Christopher Stobie

Principal DevOps Engineer, Toptal

Chris is a seasoned principal DevOps engineer with more than seven years of experience in building and managing applications. He was formerly the Director of DevOps/SRE at Veritone, an AI company in Costa Mesa, CA, and a solutions architect at AWS. Chris has led numerous large-scale cloud architecture designs and implementations, including building a real-time serverless AI platform on AWS.

Our Partnership

As an Advanced Consulting Partner in the AWS Partner Network (APN), Toptal provides cloud solutions for companies and works with them at every stage of their journey.

Bold and Innovative Thinking Pays Off

While AWS EKS isn’t the only offering in the managed Kubernetes marketplace, it certainly showcases the depth and breadth of AWS technology and expertise. In selecting EKS as an early adopter, Calm displayed the forward thinking that is a hallmark of the best companies, which implement technologies that assure the most seamless client and customer experiences. In this case, Calm recognized early on that it needed extra assistance and turned to Toptal, knowing that Toptal would have the resources needed for such a monumental undertaking. A key lesson for other companies is that technology alone is not enough: while this success would not have been possible without the agility of AWS cloud and technology offerings, the “human talent cloud” with experience in implementation is imperative as well. This combination enabled Calm to institute a robust system that is in full production, a relatively rare distinction that gives it an advantage in the marketplace. Now, six months into the EKS rollout, Calm’s experience shows that its innovative path continues to pay dividends and should do so for some time to come.

Faster server responses, which save significant time previously spent waiting for auto-complete.
Reliable networking, which saves money by preventing unexpected outages.
A fully automated cluster managed by AWS, which allows Calm to focus on other priorities.