Vaibhav Jain, Developer in Dubai, United Arab Emirates

Vaibhav Jain

Verified Expert in Engineering

Bio

Vaibhav is a seasoned DevOps engineer who holds AWS and Kubernetes certifications (CKA, CKAD, and CKS). He builds infrastructure from the ground up for businesses ranging from startups to industry giants. Proficient in AWS, GCP, Kubernetes, Docker, Jenkins, Terraform, and Linux, Vaibhav has managed large-scale operations across sectors such as media, fintech, travel, eCommerce, and digital marketing.

Portfolio

Self-employed
DigitalOcean, LAMP, Linux, Redis Queue, Python 3, Microsoft SQL Server...
INNERWORKS TECHNOLOGY LIMITED
DevOps, Load Testing, Google Cloud Platform (GCP), Kubernetes, GCP Security...
Paytm E-Commerce
Amazon Web Services (AWS), EFK Stack, Amazon CloudFront CDN, Jenkins, Ansible...

Experience

  • DevOps - 10 years
  • Amazon EKS - 10 years
  • Continuous Delivery (CD) - 10 years
  • Continuous Integration (CI) - 10 years
  • Linux - 10 years
  • Amazon Web Services (AWS) - 10 years
  • Kubernetes - 10 years
  • Prometheus - 6 years

Availability

Part-time

Preferred Environment

Prometheus, Amazon Web Services (AWS), Kubernetes, Google Cloud Platform (GCP), MongoDB, Monitoring, CI/CD Pipelines, GitOps, Service Meshes, NGINX

The most amazing...

...thing I've accomplished is crafting a scalable eCommerce infrastructure that handles high-traffic operations with zero-downtime deployments.

Work Experience

DevOps Engineer and Kubernetes Developer (via Toptal)

2020 - PRESENT
Self-employed
  • Designed and developed cloud-based SaaS solutions.
  • Architected a hosting platform on top of DigitalOcean and Kubernetes.
  • Developed and provided support for the existing MVP.
  • Wrote the Terraform modules for a number of AWS services, including AWS AppSync, Amazon Cognito, Amazon DynamoDB, and Amazon VPC.
  • Created multiple identical environments in a short period using Terraform and tore them down after use to save cost.
  • Imported the existing AWS infrastructure to Terraform to remove any manual management overhead.
  • Set up dynamic environments for developers. Used Helm and Shell to give developers the flexibility to create their own development environments on the fly and tear them down to save on costs.
  • Implemented infrastructure and application monitoring and alerts using Datadog.
  • Created a cost-saving plan and handled the execution on Google Compute Engine.
  • Migrated the staging and development environments out of the production Kubernetes environment into separate AWS accounts.
Technologies: DigitalOcean, LAMP, Linux, Redis Queue, Python 3, Microsoft SQL Server, Kubernetes, Google Kubernetes Engine (GKE), Amazon EKS, Terraform, AWS AppSync, Amazon Cognito, Amazon S3 (AWS S3), Amazon DynamoDB, Amazon Virtual Private Cloud (VPC), Cost Modeling, CircleCI, Jenkins Pipeline, Elasticsearch, Amazon Web Services (AWS), DevOps, Datadog, Service Meshes, Content Delivery Networks (CDN), DNS, Traefik, Storage, VPN, IT Operations Management (ITOM), On-premise, GitHub Actions, Amazon RDS, GitHub, Amazon MSK, Amazon ElastiCache, Software Architecture, PostgreSQL, Amazon Route 53, Git, Containers, Database Optimization, Automation, Network Automation, Server Administration, SSL Certificates, SSL, Security, Flux, Argo CD, Cloudflare, ChatGPT, Chatbots, Keycloak, Bash, Site Reliability Engineering (SRE), DevSecOps, GitLab, React, Policy as code (PaC), Service Mesh, Compliance as Code (CaC), Vault, RabbitMQ, PHP, SysOps, VPS/VDS, Rancher, ECS, Software Engineering, Cloud Engineering, Docker Swarm, AWS Fargate, Cloud Migration, Google Cloud Functions, Google Cloud SQL, Google Cloud Storage, Cloud Security, SOC 2, SecOps, NoSQL, AWS CloudFormation, Orchestration, Infrastructure as a Service (IaaS), DevOps Engineer, Web Application Firewall (WAF), Single Sign-on (SSO), Firewalls, Network Security, SIEM, JSON, APIs, Dependabot, IT Security, Event-driven Architecture, Load Testing, Scalable Web Services, Terragrunt, Cloud Monitoring, Cloudways, Google APIs, AWS IAM, Amazon Simple Queue Service (SQS), Artifactory, GitLab CI/CD, Google App Engine, Continuous Delivery (CD), Amazon Elastic Container Registry (ECR), Linux Administration, Linux Server Administration, Consul, Redis, Go, Continuous Development (CD), Continuous Integration (CI), AWS CloudTrail, Packer, Grafana, Cloud Services, Scaling, Ingress Controllers, Real-time Systems, Bash Script, Groovy Scripting, AWS CLI, Cloud Computing, Containerization, Kubernetes HorizontalPodAutoscaler (HPA), Container Orchestration, SQL, Google Cloud, Elastic, Imperva Incapsula, AWS SDK, Fluentd, Apache Kafka, MySQL/MariaDB, Bitbucket, Istio, Kubeflow, Kubernetes Security, Platform Engineering, Apache Airflow, Data Pipelines, Performance, Troubleshooting, Networking, Clustering, Canary Deployment, Crossplane, Scalable Architecture, System Architecture, Large Language Models (LLMs), CTO, Artificial Intelligence (AI), Language Models, AI Automation, Back-end APIs, Kafka Streams, Microservices, Redis Streams, Machine Learning Operations (MLOps)

GCP Security Consultant (via Toptal)

2024 - 2024
INNERWORKS TECHNOLOGY LIMITED
  • Scaled Kubernetes application pods on custom scaling parameters using KEDA.
  • Set up custom metrics for a Node.js application using the Prometheus client library.
  • Ran load tests with Gatling and tuned the application based on the results.
Technologies: DevOps, Load Testing, Google Cloud Platform (GCP), Kubernetes, GCP Security, Website Performance, Linux Administration, Linux Server Administration, Continuous Development (CD), Continuous Integration (CI), Grafana, Cloud Services, Scaling, Ingress Controllers, Real-time Systems, Bash Script, AWS CLI, SaaS, Cloud Computing, Containerization, Kubernetes HorizontalPodAutoscaler (HPA), Container Orchestration, Google Cloud, Apache Kafka, Bitbucket, Troubleshooting, Scalable Architecture, ClickHouse

DevOps Engineer

2017 - 2018
Paytm E-Commerce
  • Deployed and managed applications on self-hosted Kubernetes clusters on AWS EC2 instances.
  • Formulated SaltStack formulas for infrastructure automation.
  • Wrote Terraform modules for AWS services automation.
  • Implemented an EFK pipeline that aggregates around 1 TB of logs daily without any lag.
  • Developed a Jenkins global library for unified build and release across hundreds of microservices.
  • Reduced service onboarding time to minutes using Jenkins, Groovy, and Salt automation.
  • Designed and implemented a caching layer on Aerospike, a flash-optimized, in-memory, open-source NoSQL database.
  • Handled daily operations and continuous debugging on the AWS infrastructure. Peak traffic reached around 1 million requests per minute (RPM) during the sales season.
  • Led most of the infrastructure integration projects across teams.
Technologies: Amazon Web Services (AWS), EFK Stack, Amazon CloudFront CDN, Jenkins, Ansible, SaltStack, Kubernetes, Docker, Node.js, Java, CI/CD Pipelines, Infrastructure, Scalability, Amazon EC2, AWS DevOps, Amazon Simple Email Service (SES), AWS Cloud Architecture, Amazon Elastic Container Service (ECS), Configuration Management, Data Centers, AWS CloudFormation, Infrastructure as a Service (IaaS), HIPAA Compliance, AWS Cloud Computing Services, Serverless, Amazon Aurora, AWS ALB, Kibana, AWS HA, Amazon Elastic Container Registry (ECR), Linux Server Administration, AWS Lambda, Continuous Development (CD), Continuous Integration (CI), AWS CloudTrail, Packer, eCommerce, Grafana, Cloud Services, Scaling, Ingress Controllers, Real-time Systems, Bash Script, Groovy Scripting, AWS CLI, SaaS, Cloud Computing, Containerization, Kubernetes HorizontalPodAutoscaler (HPA), Serverless Architecture, Container Orchestration, Elastic, AWS SDK, Aerospike, Fluentd, Apache Kafka, Troubleshooting, Scalable Architecture, ClickHouse, OVH

DevOps Consultant

2016 - 2017
HCL Technologies
  • Implemented dynamic CI and CD using GitHub Enterprise, Jenkins multibranch jobs, XL-Release, XL-Deploy, and Nexus for Java and Microsoft technologies.
  • Built and deployed Microsoft libraries on Linux machines using open-source build tools (candle and light).
  • Migrated manual deployments to automated pipelines with workflows.
  • Introduced and implemented log aggregation using ELK stack.
  • Participated in server patching activities and network monitoring.
Technologies: Amazon Web Services (AWS), ELK (Elastic Stack), Jenkins, Windows, Linux, Docker Hub, High Availability Disaster Recovery (HADR), Python, System Administration, Virtual Machines, Windows Server, Identity & Access Management (IAM), Linux Server Administration, Continuous Development (CD), Continuous Integration (CI), Grafana, Cloud Services, Scaling, Ingress Controllers, Bash Script, Groovy Scripting, AWS CLI, SaaS, Cloud Computing, Containerization, Container Orchestration, AWS SDK, Insurance, Fluentd, Apache Kafka, XL Release REST API, Troubleshooting, Scalable Architecture

Software Engineer

2014 - 2016
Cybage Software
  • Migrated infrastructure from legacy management to infrastructure as code, enabling developers to understand and modify infrastructure according to their application needs.
  • Scheduled infrastructure and apps for early morning hours to save infrastructure costs; costs were reduced by around 40%.
  • Set up SonarQube across the organization for hundreds of projects to improve code quality and provide developers with better visibility of potential issues in the code.
  • Set up Jenkins automated deployments to hybrid infrastructure consisting of virtual machines and Docker containers.
Technologies: Amazon Web Services (AWS), SonarQube, Jenkins, Node.js, Linux, Terraform, Beanstalk, Angular, JavaScript, Scripting, Containers, Amazon Virtual Private Cloud (VPC), Amazon Elastic Block Store (EBS), Linux Server Administration, Continuous Development (CD), Continuous Integration (CI), Packer, Grafana, Cloud Services, Scaling, Ingress Controllers, Real-time Systems, Bash Script, AWS CLI, SaaS, Cloud Computing, Containerization, Container Orchestration, Apache Maven, AWS SDK, Apache Kafka, Bitbucket, Troubleshooting, Scalable Architecture

Experience

Dynamic Environments

I set up on-demand dynamic environments so that developers can spin up their own environment in minutes without any DevOps involvement, do their development, and tear it down once done. Each developer gets a dedicated environment, and the approach is cost-effective and scalable. This has increased the release frequency from a few builds per day to tens of builds per day to production.

Migration to Kubernetes

I migrated a client's entire infrastructure to Kubernetes from data centers and public cloud instances, where more than 300 microservices had been running in an ad hoc way without any visibility. I also set up a unified build and deployment pipeline with Jenkins.

Environment Segregation | AWS

The client had been running one Kubernetes cluster for all environments: development, staging, and production. To address this, I replicated the entire environment into new AWS accounts, wrote Terraform modules from scratch, and imported the Terraform state for the existing resources to bring everything in sync. Everything is hosted on AWS services, and the application stack is Clojure.

Education

2014 - 2014

Postgraduate Diploma in Advanced Computing

C-DAC: Centre for Development of Advanced Computing - India

2009 - 2013

Bachelor of Technology Degree in Electrical and Electronics Engineering

BM Institute of Engineering and Technology - India

Certifications

JULY 2021 - JULY 2023

Certified Kubernetes Security Specialist (CKS)

CNCF [Cloud Native Computing Foundation]

FEBRUARY 2021 - FEBRUARY 2023

HashiCorp Certified: Terraform Associate

HashiCorp

JANUARY 2020 - JANUARY 2022

Certified Kubernetes Application Developer (CKAD)

CNCF [Cloud Native Computing Foundation]

DECEMBER 2019 - DECEMBER 2022

Certified Kubernetes Administrator (CKA)

CNCF [Cloud Native Computing Foundation]

OCTOBER 2019 - PRESENT

Certified Solution Architect Associate

Amazon Web Services

Skills

Libraries/APIs

Jenkins Pipeline, REST APIs, Back-end APIs, Node.js, XL Release REST API, React, Terragrunt, Google APIs, Redis Queue

Tools

GitHub, CircleCI, AWS IAM, NGINX, Amazon CloudFront CDN, Amazon Elastic Container Registry (ECR), Amazon Elastic Block Store (EBS), AWS CloudTrail, Docker Hub, Google Kubernetes Engine (GKE), Amazon Virtual Private Cloud (VPC), Amazon EKS, Terraform, Grafana, ELK (Elastic Stack), Packer, SonarQube, Apache Maven, Helm, Istio, Jenkins, Kubernetes HorizontalPodAutoscaler (HPA), Docker Compose, AWS CLI, Amazon Simple Queue Service (SQS), Git, Vault, RabbitMQ, AWS CloudFormation, Kafka Streams, GitLab, Bitbucket, AWS SDK, Ansible, Kibana, GitLab CI/CD, SaltStack, Fluentd, Amazon CloudWatch, Amazon Elastic Container Service (ECS), Artifactory, Traefik, AWS Fargate, VPN, Amazon ElastiCache, Amazon Simple Email Service (SES), ChatGPT, Keycloak, Docker Swarm, Apache Airflow, Elastic, EFK Stack, Shell, AWS AppSync, Amazon Cognito, Beanstalk, Gradle, GCP Security

Languages

SQL, TypeScript, Bash, Bash Script, JavaScript, Python, PHP, Java, Go, Python 3, GraphQL

Frameworks

AWS HA, Flux, Angular, Crossplane, Express.js

Paradigms

Continuous Development (CD), Continuous Integration (CI), DevOps, Continuous Delivery (CD), Serverless Architecture, DevSecOps, Automation, Event-driven Architecture, Load Testing, Microservices, Real-time Systems, HIPAA Compliance

Platforms

Google Cloud Platform (GCP), AWS ALB, Amazon Web Services (AWS), Kubernetes, Linux, Docker, Apache Kafka, Amazon EC2, Windows Server, Imperva Incapsula, Windows, Google App Engine, DigitalOcean, AWS Cloud Computing Services, Kubeflow, AWS Lambda, LAMP, Rancher, Heroku

Storage

PostgreSQL, Datadog, Google Cloud, SQL Performance, Data Centers, Google Cloud SQL, Google Cloud Storage, NoSQL, Amazon S3 (AWS S3), Elasticsearch, Amazon DynamoDB, MongoDB, MySQL/MariaDB, Aerospike, MySQL, Redis, Databases, Amazon Aurora, On-premise, JSON, Cloudways, Data Pipelines, ClickHouse, OVH, Microsoft SQL Server

Industry Expertise

Insurance

Other

Cloudflare, Identity & Access Management (IAM), Site Reliability Engineering (SRE), Ingress Controllers, Scaling, Autoscaling, Website Performance, AWS DevOps, Kubernetes Operations (kOps), Migration, Cloud Infrastructure, Infrastructure as a Service (IaaS), CI/CD Pipelines, eCommerce, Cloud Services, AWS Certified Developer, Infrastructure as Code (IaC), Container Orchestration, GitHub Actions, Cloud, Amazon RDS, Cloud Security, Containerization, Cloud Computing, Content Delivery Networks (CDN), Cloud Architecture, DevOps Engineer, Infrastructure, Architecture, Scalability, High Availability Disaster Recovery (HADR), Monitoring, GitOps, IT Operations Management (ITOM), AWS Cloud Architecture, Software Architecture, Amazon Route 53, IT Systems Engineering, Containers, IOPS, Database Optimization, Network Automation, Server Administration, Virtual Machines, AWS Certified Solution Architect, SSL Certificates, SSL, Security, Argo CD, Chatbots, Service Mesh, Software Engineering, Cloud Engineering, Cloud Migration, Google Cloud Functions, SecOps, Load Balancers, Orchestration, Web Application Firewall (WAF), Network Security, SIEM, IT Security, Serverless, Scalable Web Services, Cloud Monitoring, Kubernetes Security, Platform Engineering, Performance, Troubleshooting, Networking, Clustering, Certified Kubernetes Administrator (CKA), Canary Deployment, Scalable Architecture, System Architecture, CTO, Artificial Intelligence (AI), Language Models, AI Automation, Redis Streams, System Administration, Linux Administration, Linux Server Administration, Scripting, Prometheus, Consul, Groovy Scripting, Web Security, SaaS, Service Meshes, DNS, API Gateways, Storage, Monorepos, Amazon MSK, Configuration Management, Policy as code (PaC), Compliance as Code (CaC), SysOps, ECS, SOC 2, Single Sign-on (SSO), Firewalls, APIs, Dependabot, Disaster Recovery Plans (DRP), Machine Learning Operations (MLOps), Shell Scripting, Cost Modeling, Electrical Engineering, Electronics, Software Development, Operating Systems, Data Structures, VPS/VDS, Large Language Models (LLMs)
