Ryan Cocks, Developer in London, United Kingdom

Ryan Cocks

Verified Expert in Engineering

Bio

Ryan is an experienced software engineer who builds reliable and scalable production cloud systems. He specializes in DevOps, microservices, application architecture, and application-level observability. He has a solid background in cloud infrastructure and back-end work, good soft skills, and experience working in teams of all sizes. Ryan has an excellent ability to understand the business needs behind requirements.

Portfolio

BCG - Gamma
Datadog, Site Reliability Engineering (SRE), Amazon EC2, DevOps, Microservices...
Toptal Project
Amazon Web Services (AWS), Terraform, Terragrunt...
Global Fashion Group
Amazon Web Services (AWS), DevOps, Docker, Amazon S3 (AWS S3), AWS CodeBuild...

Experience

Availability

Full-time

Preferred Environment

Amazon Web Services (AWS), MacOS, Google Cloud, Docker, Git, Kubernetes, Node.js, ECS

The most amazing...

...project I've worked on was the Rosetta project for Apple, a dynamic binary translator used to execute PowerPC binaries on x86.

Work Experience

Site Reliability Engineer (Datadog Specialist)

2021 - 2023
BCG - Gamma
  • Worked with multiple product teams within the organization, designing their observability (monitoring) solutions.
  • Guided teams on architectural considerations for observability. Defined observability best practices and coached the various teams.
  • Worked toward near-real-time awareness of customer-visible issues.
  • Segmented alerting into different paths for different levels of severity.
  • Developed Terraform to set up Datadog dashboards and alerting for Kubernetes clusters and for applications with a canonical front-end/back-end plus database architecture.
Technologies: Datadog, Site Reliability Engineering (SRE), Amazon EC2, DevOps, Microservices, JavaScript, Kubernetes, Terraform, Flux, Monitoring, Application Monitoring, Infrastructure Monitoring, Cloud Infrastructure, Infrastructure as Code (IaC), Containers, GitHub Actions, Amazon Web Services (AWS), Back-end Performance, Database Performance, Cloud Engineering, VPS/VDS, GitOps, SIEM, Dashboard Development, Technical Documentation, Data Visualization, Load Balancers, DNS, Unix, Performance Analysis, Team Leadership, AWS ALB, Cloud, Transport Layer Security (TLS), AWS Cloud Architecture, Amazon Aurora

Site Reliability Engineer (ECS)

2020 - 2021
Toptal Project
  • Re-architected parts of the system that were vulnerable to high load, resulting in no performance degradation during peak Black Friday traffic.
  • Launched the new version of the client's website on the new infrastructure with only 10 minutes of planned downtime. Total downtime over two years on the project was less than three hours.
  • Implemented alerting and monitoring for the new clusters.
  • Customized Fastly CDN to provide outage mitigation. Wrapped the endpoint of an unreliable third-party API in a CDN-managed endpoint that redirected to a backup when latency on the main API was high.
  • Coached the team to improve their architectural designs according to the twelve-factor app principles and SRE best practices.
  • Created Terraform-managed AWS Fargate clusters for deployed services.
Technologies: Amazon Web Services (AWS), Terraform, Terragrunt, Amazon Simple Queue Service (SQS), Datadog, Sentry, Amazon CloudWatch, Amazon Elastic Container Service (ECS), AWS Fargate, Amazon EC2, Fastly, Amazon CloudFront CDN, Site Reliability Engineering (SRE), Monitoring, Application Monitoring, Infrastructure Monitoring, CI/CD Pipelines, Cloud Infrastructure, Infrastructure as Code (IaC), Configuration Management, Containers, GitHub Actions, AWS DevOps, Amazon RDS, Amazon S3 (AWS S3), Back-end Performance, Cloud Engineering, VPS/VDS, Continuous Integration (CI), Continuous Delivery (CD), AWS Lambda, DevSecOps, GitOps, Dashboard Development, Technical Documentation, Data Visualization, APIs, Load Balancers, DNS, Web Application Firewall (WAF), Network Administration, GitHub, Unix, Performance Analysis, Cloud Architecture, AWS ALB, AWS CLI, Cloud, AWS IAM, Transport Layer Security (TLS), AWS Cloud Architecture, Amazon Virtual Private Cloud (VPC), Amazon Aurora
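
The latency-based fallback described in the Fastly bullet above can be illustrated in plain Python. This is only a sketch of the general idea, not the CDN configuration used on the project: the function name `fetch_with_fallback`, the callables passed to it, and the 0.5-second latency budget are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Tuple


def fetch_with_fallback(
    primary: Callable[[], str],
    backup: Callable[[], str],
    latency_budget_s: float = 0.5,  # hypothetical budget, not from the project
) -> Tuple[str, str]:
    """Return (source, body), using the backup if the primary is slow or fails."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(primary)
    try:
        # Wait for the primary only as long as the latency budget allows.
        return "primary", future.result(timeout=latency_budget_s)
    except Exception:
        # Covers both a timeout and an error raised by the primary call.
        return "backup", backup()
    finally:
        # Do not block the caller on a primary call that is still running.
        pool.shutdown(wait=False)
```

A caller would pass two zero-argument functions, one that calls the main API and one that returns the backup response. In a real CDN the same decision would be made at the edge rather than in application code.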

Site Reliability Engineer (EKS)

2019 - 2019
Global Fashion Group
  • Created new Terraform-managed AWS EKS Kubernetes clusters (multi-region).
  • Executed live cluster migrations to new Kubernetes clusters with zero downtime.
  • Broke up a PHP back end into microservices, which improved reliability and scalability.
  • Moved self-hosted Redis and SQL databases to AWS-managed services, improving reliability.
  • Replaced Jenkins with AWS CodePipeline, which reduced maintenance costs.
  • Replaced legacy storage with S3, resulting in improved reliability.
  • Reworked database usage, eliminating bottlenecks under high load.
Technologies: Amazon Web Services (AWS), DevOps, Docker, Amazon S3 (AWS S3), AWS CodeBuild, AWS CodePipeline, Helm, Terraform, Redis, Kubernetes, Site Reliability Engineering (SRE), Monitoring, Application Monitoring, Infrastructure Monitoring, CI/CD Pipelines, Cloud Infrastructure, Infrastructure as Code (IaC), Configuration Management, Containers, GitHub Actions, AWS DevOps, Amazon RDS, Back-end Performance, Database Performance, Cloud Engineering, VPS/VDS, Continuous Integration (CI), Continuous Delivery (CD), MySQL, AWS Lambda, DevSecOps, GitOps, Dashboard Development, Technical Documentation, Data Visualization, APIs, Load Balancers, DNS, Network Administration, NGINX, Amazon EKS, GitHub, Unix, Performance Analysis, Team Leadership, Cloud Architecture, AWS ALB, AWS CLI, Cloud, Memcached, AWS IAM, Transport Layer Security (TLS), AWS Cloud Architecture, Amazon Virtual Private Cloud (VPC), Amazon Aurora

DevOps Engineer and Release Manager

2016 - 2018
HERE Technologies
  • Designed and developed Jenkins deployment pipelines into AWS. Contributed to the programmatic generation of Jenkins pipelines using Job DSL.
  • Set up Docker in production on Amazon EC2 instances.
  • Operated AWS autoscaling, microservices, Kafka, and Flink, including windowed stream processing.
  • Developed IoT-specific testing that fed continuous test data into production. This allowed us to build real-time dashboards to identify which part of a complex microservices system was failing.
Technologies: Amazon Web Services (AWS), DevOps, Terraform, Node.js, JavaScript, Scala, Apache Kafka, Apache Flink, Microservices, Grafana, Splunk, Jenkins, Kubernetes, Docker, Monitoring, CI/CD Pipelines, Containers, Ansible, Cloud Engineering, Linux Server Administration, VPS/VDS, Continuous Integration (CI), Continuous Delivery (CD), AWS CloudFormation, Dashboard Development, Technical Documentation, Data Visualization, APIs, Cloud Architecture, Cloud, Transport Layer Security (TLS), AWS Cloud Architecture

Test Lead

2015 - 2016
HERE Technologies
  • Oversaw the analytics and A/B testing using Apptimize and Amplitude.
  • Developed test strategies for mobile devices.
Technologies: HockeyApp, Amplitude, Apptimize, iOS, Android, Containers, Ansible

Test Lead

2013 - 2014
Auckland Transport
  • Defined and executed test strategies for citywide critical infrastructure.
  • Created tooling to optimize work methods.
Technologies: Ruby on Rails (RoR), MySQL, Ruby

Test Lead

2012 - 2013
Serato, Inc.
  • Oversaw and mentored junior developers.
  • Introduced tools and processes for bug tracking, test management, peer review, crash report collection and analysis, beta test cycles, and improving the communication between customer support and product management teams.
  • Tested iOS apps.
  • Helped Scrum teams adopt best practices in their testing and quality control.
Technologies: Testing, Engineering, Ruby

Test Team Manager

2011 - 2012
IBM
  • Managed a team of 11 testers and was responsible for its technical rigor across five in-flight products from IBM's virtualization, security, operating system performance, and failover stacks.
  • Changed the way the development and QA teams interacted by focusing on rapid iterative feedback. This reduced the release cycles from 2-3 months down to 2-3 weeks.
  • Successfully oversaw two new major product launches.
Technologies: Virtual Machines, C++, Containers, Team Leadership

Project Manager

2010 - 2011
IBM
  • Managed the development and release cycle for a small software team.
Technologies: Ruby on Rails (RoR), C++, Containers

C++ Developer

2001 - 2009
Transitive
  • Developed automated testing infrastructure, including toolchains (cross-linking and bootstrapping build systems), assembly, linkers, CPU, and memory management architecture (SPARC, x86, x86_64, ARM, Itanium), and Linux kernel patching and building.
  • Developed dynamic binary translators that load binaries built for one processor and execute them on another via the UNIX kernel interface (syscalls).
  • Acted as the lead engineer on a specialist performance analysis team. Studied the principles of performance analysis and improvement and applied them to solve performance issues when clients experienced lower-than-expected on-site performance.
Technologies: Linux, C++, Containers, Back-end Performance, Software Engineering, Linux Server Administration, DNS, Network Administration, Ruby, Unix, Performance Analysis

Observability Expert

I filled the role of in-house observability expert for one of the Big 3 consulting firms. I was the main contact point in the organization for development teams looking to improve the observability of their deployments, specifically with Datadog at the client's request. I designed observability solutions for various products and projects, covering ECS and Kubernetes on AWS and Azure. Since many products ran on Kubernetes with canonical front-end/back-end architectures, I produced Terraform to install baseline standard monitoring. This covered the Kubernetes clusters, databases, load balancers, front-end and back-end services, Watchdog, SLOs, and uptime.

I was involved in setting up Kubernetes monitoring and became an expert in this area. I also developed custom dashboards for rapid situational awareness of Kubernetes clusters, bringing together monitoring and alerting on OOMs, crash-loop backoff, container restarts, resource usage vs. limits, node resources, pod desired state, and unavailable deployment replicas.
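
As a rough illustration of the container-level signals listed above, the sketch below derives them from a single container status. It is not the Datadog monitors or Terraform from the project. The input is assumed to be a dictionary shaped like an entry in a pod's `status.containerStatuses` from the Kubernetes API, and the function name `pod_signals`, the signal names, and both thresholds are hypothetical.

```python
from typing import Dict, List


def pod_signals(
    container_status: Dict,
    memory_used_bytes: int,
    memory_limit_bytes: int,
    restart_threshold: int = 3,          # hypothetical threshold
    usage_ratio_threshold: float = 0.9,  # hypothetical threshold
) -> List[str]:
    """Return the alert signals raised by one container's status."""
    signals: List[str] = []

    # A container waiting with reason CrashLoopBackOff keeps failing on start.
    waiting = container_status.get("state", {}).get("waiting") or {}
    if waiting.get("reason") == "CrashLoopBackOff":
        signals.append("crash_loop_backoff")

    # The previous termination reason shows whether the container was OOM-killed.
    terminated = container_status.get("lastState", {}).get("terminated") or {}
    if terminated.get("reason") == "OOMKilled":
        signals.append("oom_killed")

    # Repeated restarts are worth flagging even without a crash loop.
    if container_status.get("restartCount", 0) >= restart_threshold:
        signals.append("frequent_restarts")

    # Compare resource usage against the configured limit.
    if memory_limit_bytes > 0:
        if memory_used_bytes / memory_limit_bytes >= usage_ratio_threshold:
            signals.append("memory_near_limit")

    return signals
```

Memory usage is passed in separately because it comes from a metrics source rather than from the pod status itself.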

Automated Stocks and Crypto Trading Systems

I've worked extensively on personal projects in the crypto and stocks/foreign exchange trading space. I did low-frequency swing trading but used this as a personal project to keep my developer skills honed; I worked as a developer for ten years before specializing in DevOps.

I performed backtesting in Python and ran the real-time systems as Node.js microservices deployed on Kubernetes.

2014 - 2014

Scrum Master in Scrum

Clarus (Agile Coaching) - New Zealand

2012 - 2012

ISTQB Foundation Certificate in Software Testing

ISTQB - New Zealand

1998 - 2000

Bachelor of Science Degree in Computer Science

The University of Manchester - United Kingdom

JANUARY 2014 - PRESENT

Scrum Master

Clarus (scrum.org)

JANUARY 2012 - PRESENT

ISTQB

ISTQB

Libraries/APIs

Terragrunt, Node.js, Jenkins Job DSL, Amazon EC2 API, PubSubJS

Tools

Jenkins, Amazon Elastic Container Service (ECS), Terraform, Git, Fastly, GitHub, Sentry, Google Kubernetes Engine (GKE), Amazon EKS, RabbitMQ, Helm, Amazon Simple Queue Service (SQS), Amazon CloudWatch, AWS Fargate, Amazon CloudFront CDN, NGINX, Amazon Virtual Private Cloud (VPC), Splunk, Grafana, AWS CodeBuild, Amazon Simple Notification Service (SNS), Bitbucket, Ansible, AWS CloudFormation, AWS CLI, AWS IAM

Languages

Perl, Bash, C++98, JavaScript, Ruby, TypeScript, C++, Scala, Python, SQL

Paradigms

Microservices, DevOps, Agile, Continuous Integration (CI), Continuous Delivery (CD), DevSecOps, Testing

Platforms

Docker, Apache Kafka, Kubernetes, Linux, Amazon Web Services (AWS), Unix, AWS ALB, Amazon EC2, AWS Lambda, DigitalOcean, MacOS, Android, iOS, HockeyApp, Apache Flink, Google Cloud Platform (GCP)

Storage

Datadog, Amazon S3 (AWS S3), Redis, Memcached, Amazon Aurora, Google Cloud, MongoDB, PostgreSQL, JSON, Database Performance, MySQL

Frameworks

Ruby on Rails (RoR), Flux

Industry Expertise

Trading Systems

Other

Monitoring, Site Reliability Engineering (SRE), Infrastructure Monitoring, CI/CD Pipelines, Infrastructure as Code (IaC), Containers, AWS DevOps, Cloud Engineering, GitOps, Dashboard Development, Technical Documentation, APIs, Load Balancers, DNS, Performance Analysis, Cloud, AWS Cloud Architecture, Virtual Machines, Lambda Functions, Application Monitoring, Cloud Infrastructure, Configuration Management, GitHub Actions, Amazon RDS, ECS, Back-end Performance, Software Engineering, VPS/VDS, Network Administration, Team Leadership, Cloud Architecture, Transport Layer Security (TLS), Engineering, Apptimize, Amplitude, Google Cloud Functions, AWS CodePipeline, Scrum Master, Financial APIs, Stock Trading, Forex Trading, TradingView, Linux Server Administration, SIEM, Data Visualization, Web Application Firewall (WAF)
