Deepak Agrawal
Verified Expert in Engineering
DevSecOps and DevOps Engineer and Developer
Gurugram, Haryana, India
Toptal member since October 12, 2022
Deepak is a cloud architect, specialist, consultant, SRE, and observability engineer with over a decade of intense professional experience. He has architected and built multiple platform-agnostic infrastructures from scratch for modern cloud systems. Deepak has a proven track record of hands-on operations in high-scale environments and is proficient in cost optimization, IaC, automation, cloud security, migrations, deployment pipelines, and observability.
Experience
- DevOps - 10 years
- Python 3 - 8 years
- Terraform - 7 years
- Kubernetes - 5 years
- Ansible - 5 years
- Cloud Security - 5 years
- Jenkins - 4 years
- ELK (Elastic Stack) - 3 years
Preferred Environment
Kubernetes, Amazon Web Services (AWS), Python 3, Terraform, Jenkins, Ansible, Cloud Security, ELK (Elastic Stack), PostgreSQL, Cost Reduction & Optimization (Cost-down)
The most amazing...
...projects I've done are designing and architecting a cloud infrastructure with Terraform and optimizing AWS cost by 50% for multiple companies.
Work Experience
Senior Azure DevOps Engineer
Coverself
- Integrated JumpCloud with Azure AD, managed users in Azure AD, and connected them to resources via JumpCloud.
- Created an AWS directory and workspace. Restricted outgoing traffic except for approved domains using security groups. Attached policy-based integrations manually or through Terraform automation.
- Deployed and tested managed identity-based access with Terraform (automation). Set up a cluster and resources with Terraform. Deployed a demo Spring Boot application on the Azure cluster. Tested application access using the assigned managed identity.
- Handled one-click infrastructure and application deployment with Terraform. Utilized Terraform and Terragrunt for automated deployment. Deployed the necessary resources: MySQL, storage, ACR, and Cosmos DB.
- Leveraged managed identity for authentication during deployment. Modified the pipeline to build GitLab modules to work according to the infrastructure and use those modules in Terragrunt.
DevOps Consultant
Monsoon CreditTech Pvt
- Set up SSO with Zoho Directory for AWS, GCP, and Azure portals and helped the company with various CISA audit requirements.
- Set up Teleport VPN for SSH access and to access other internal tools, e.g., Jenkins, PyPI, Jupyter Notebook, HashiCorp Vault, and Grafana.
- Automated developer VM creation on GCP and Azure using Terraform and Ansible.
DevOps Consultant
Vodex.ai: AI-Powered Voice Outbound Calls Solution Provider
- Architected, designed, and built the Kubernetes platform from scratch in AWS using Terraform for a distributed microservices architecture.
- Created Helm charts for the applications and integrated them with the GitOps CD pipeline using Argo CD. Configured the CI pipeline using GitHub Actions.
- Set up all DevOps internal tools, e.g., Argo CD, EFK cluster, Thanos, Prometheus, Grafana, and the Alertmanager, in a centralized EKS cluster using Helm charts.
- Set up monitoring in a centralized EKS cluster using Thanos and ingested metrics from environment-specific Prometheus setups.
- Set up logging in the centralized EKS cluster using EFK and ingested logs from environment-specific EKS clusters. Set up application Load Balancer/Ingress Controller using Helm charts.
- Built numerous Grafana dashboards to visualize key metrics and configured the Alertmanager for alerting.
- Created detailed architecture diagrams and documentation for infrastructure, CI/CD (GitHub and Argo CD), EFK, Ingress, and the Thanos HA setup with Prometheus, Grafana, and the Alertmanager.
- Provisioned VPC, ECR, EKS, RDS, Amazon ElastiCache, etc., using Terraform.
DevOps Consultant
Infra360 (Cloud Consulting Company)
- Optimized the AWS cost by 40% for two clients by taking several measures like removing cloud waste, right-sizing, DB parameter tuning, etc.
- Migrated the complete infrastructure and PostgreSQL database from one AWS account to another in the same region using AWS DMS for PostgreSQL for the Housing.com marketing team.
- Built multiple custom CI/CD Bitbucket pipelines for deploying infrastructure using Terraform. Set up SSO in AWS for accessing internal tools.
- Set up a Redash cluster with complete automation using Terraform and Ansible for an ERP-based enterprise company.
- Built a single-click deployment of an ELK cluster for version 8.x with complete automation using Ansible and Terraform with SSL encryption.
- Built a WordPress website from scratch using a pre-built theme.
DevOps Consultant
SwymCorp: US-Based Leading E-Commerce Platform
- Architected, designed, and built the Kubernetes platform from scratch for the organization in Azure Kubernetes Service using Terraform for a distributed microservices architecture with Azure CNI networking and deployed five microservices in production.
- Built CI using Declarative Jenkins pipelines and set up an ECR repository using Terraform. Created Helm charts for the applications and integrated them with the GitOps CD pipeline using Argo CD.
- Configured Azure CNI networking and ACR integration in the AKS cluster and configured RBAC to access the AKS cluster using Azure AD authentication.
- Deployed three microservices in Azure App Service in production and staging environments using deployment slots.
- Set up Application Gateway Ingress Controller using Helm charts. Set up monitoring in a centralized AKS cluster using Thanos and ingested metrics from environment-specific AKS Cluster Prometheus setup.
- Set up the Alertmanager for alerting and built numerous Grafana dashboards to visualize key metrics. Set up Argo CD using Helm charts and integrated Argo CD with multiple AKS clusters for multiple environment deployments.
- Managed secrets using the Azure Key Vault (AKV) Provider for Secrets Store CSI Driver in an AKS cluster. Worked on Azure Redis Service and Azure Cosmos DB service in production in a microservices environment.
- Created detailed architecture diagrams for infrastructure and CI/CD pipelines. Also created the documentation for every process and component of the infrastructure with commands and instructions.
DevOps Consultant
Uptycs: US-Based Cyber Security Company
- Built a feature in Python 3/Boto3 related to effective permissions for a user/role in AWS Identity and Access Management (IAM) for a well-known US-based cybersecurity company.
- Worked on this module, which summarizes, for a given user or role in a large enterprise, which actions on which services are allowed or denied for a list of resources. The module also takes resource policies into account.
- Contributed to the module's summaries at both the identity level (user/role) and the resource level, a view that AWS IAM does not provide natively.
Manager | Global Infrastructure
Sequoia Capital
- Built an in-house CLI tool for tagging AWS resources in Python 3 and made all 17 AWS accounts 100% tagging compliant.
- Evaluated the CloudHealth control tower by VMware and integrated it to manage all AWS accounts and GCP projects. Configured policy-based governance for zombie resources, tagging governance, cost, security, container management, and operations.
- Migrated the complete infrastructure and PostgreSQL database from one AWS account to another in the same region using AWS DMS for PostgreSQL. Worked on shared VPCs and NACLs. Set up the CodeBuild and CodeDeploy pipelines from scratch using Terraform.
- Set up containerized Kubernetes EKS infrastructure to deploy new apps for development, test, stage, and production environments with 100% IaC. Used Terraform with cost and security best practices, including identity and access management (IAM).
- Shared cost and security best practices with teams in different GEOs in Sequoia and reduced the AWS bill by $23,000 per month in two months.
- Performed Well-Architected Reviews (WAR) for all Sequoia AWS accounts and multiple portfolio companies. Configured security services, including Macie, Inspector, GuardDuty, AWS Config, IAM, WAF, SCP, Security Hub, and Cloudflare.
- Provided recommendations on best practices for cloud infrastructure architecture, cost optimization, and cloud security for multiple Asia-based portfolio companies, including Pentester Academy, Checkbox, Enterpret, and FlowAccount.
DevOps Consultant
epiFi
- Built fully automated, federated, role-based cross-account IAM access through SAML for all employees, based on their designation, namely developer, lead, and DevOps. Used Terraform, GitHub, Jenkins, and Groovy.
- Created a DevOps dashboard from scratch in Python and Flask to manage the blue-green deployment flow and provide metadata around services.
- Worked on one-click deployment using Packer, Terraform, and Jenkins declarative pipeline for immutable infrastructure for non-production environments.
- Configured blue-green deployment for production and non-production environments using Jenkins and Groovy.
- Automated the deployment of the DevOps dashboard in ECS using Terraform, Jenkins, and Groovy.
Engineering Manager | DevOps
Housing.com
- Worked as head of DevOps for Housing.com, PropTiger, and Makaan.com, reducing the AWS bill by 45% for all three platforms. Improved average uptime from 99.86% to 99.99% in a year and led the Amazon Aurora migration using AWS DMS with a rollback strategy.
- Managed Kubernetes migration, logging, monitoring, alerting, security, cost, CI/CD, automation, uptime of all platforms, and beta and production environment issues.
- Planned and scaled the Housing.com infrastructure within one week to handle a sudden tenfold traffic increase from marketing campaigns at minimal cost. Implemented observability through ELK stack, Jaeger, and OpenTelemetry.
- Built centralized logging using ELK with ElastAlert and Search Guard. Set up ELK APM for Java-based APIs in beta and production environments. This work re-architected how logging had previously been managed.
- Set up disaster recovery (DR) for Housing.com and implemented AWS WAF and Cloudflare WAF for different platforms.
- Migrated PropTiger and Makaan.com's 40 APIs from EC2 to the Kubernetes platform with Kubernetes Operations (kOps), Prometheus, and Grafana in AWS.
- Worked with key tech stakeholders to obtain their requirements related to dependencies on DevOps, prioritize them, and implement them so that all the tech teams could complete the project smoothly. Worked on a proof of concept for HashiCorp Vault implementation for secret management.
- Implemented Istio with Kiali and Jaeger for microservice management. Also implemented Crossplane, a cloud-agnostic solution that centralized our operations and simplified resource provisioning.
- Resolved all security group issues reported by AWS Trusted Advisor for the three platforms. Troubleshot various production issues related to infrastructure and provided root cause analysis (RCA) and resolution.
Senior DevOps Consultant
LambdaTest
- Prepared the VMware vSphere environment. Set up networking to enable communication between Kubernetes nodes and the outside world. Installed a compatible Linux distribution (e.g., Ubuntu, CentOS) on each VM serving as a Kubernetes node.
- Installed a networking plugin compatible with the Kubernetes version and deployment requirements (e.g., Calico, Flannel, Weave). Used kOps for a Kubernetes cluster on VMware vSphere.
- Created standardized VM templates or golden images with preconfigured operating systems, applications, and settings to accelerate VM provisioning and ensure consistency across deployments.
- Defined customization specifications or guest OS customization settings to automate the customization of VMs during the provisioning process, such as hostname, IP address, domain membership, and security settings.
- Implemented a self-service portal or catalog where users can request and provision VMs based on predefined templates and resource allocation policies, reducing administrative overhead and improving user satisfaction.
- Configured resource reservation policies and quotas to allocate and limit the amount of CPU, memory, storage, and network resources available to individual VMs or user groups, ensuring fair resource distribution and preventing resource contention.
- Managed the lifecycle of VMs, including provisioning, deployment, monitoring, scaling, migration, retirement, and decommissioning, to optimize resource usage and minimize costs throughout the VM lifecycle.
- Configured high availability (HA) and fault tolerance (FT) settings to ensure VM availability and resilience against hardware failures or host outages by automatically restarting VMs on healthy hosts or maintaining duplicate VM instances.
- Provisioned and managed storage resources for VMs, including creating and allocating virtual disks, configuring storage policies and profiles, and integrating with storage management tools for data protection and replication.
- Configured network settings for VMs, including virtual network interfaces, VLANs, IP addresses, DNS settings, firewall rules, and network security policies, to ensure secure and efficient communication within the private cloud environment.
Senior DevOps Engineer
Delhivery Pvt
- Designed, standardized, and implemented the VPC architecture, directory structure for IaC with Ansible and Terraform, and DevOps best practices for cost, security, and architecture across projects and organizations.
- Built a reference project based on the new design for other projects to follow. Managed the infrastructure and deployment automation for 15 microservices, including new and old services.
- Designed, standardized, and managed S3 bucket and CloudFront infrastructure automation using Terraform and deployment using Jenkins for 30 front-end dashboards. Used Jenkins to deploy almost 100 Lambda functions using Serverless.
- Reduced AWS costs by 30% using a combination of reserved instances (RIs), spot servers using Spot.io, cleanup of unused resources, and right-sizing of EC2, ElastiCache, and Relational Database Service (RDS).
- Defined policies for onboarding new joiners and access management for DevOps tools, including AWS, CloudAMQP, Cloud MongoDB, New Relic, Sentry, Bitbucket, and Jenkins.
- Designed and managed the Jira integration for all projects and handled URL monitoring, including SLA and response time, using Zabbix for internal and external URLs.
- Troubleshot various production issues related to Lambda and EC2 and provided RCA and resolution for each.
DevOps Engineer
1mg
- Migrated the production infrastructure from a Java-based monolith application to a microservices-based architecture. Set up and managed staging, QA, and development environments for 1mg.com and 1mglabs.com.
- Redesigned the infrastructure orchestration using Ansible to deploy over 50 microservices on different environments and set up load balancing using ELB and Autoscaling in just three days.
- Monitored the complete app infrastructure using CloudWatch and set up notifications using SNS. Served as the single point of contact for day-to-day tasks regarding automation, Nginx web server, SSL, staging, QA, and development environment issues.
- Wrote many Shell and Python scripts to automate day-to-day tasks using the AWS SDK for Python (Boto3). Used the Python library Troposphere to automatically generate CloudFormation templates for different environments.
Experience
FinOps | Inform, Optimize, and Operate
https://aws.amazon.com/blogs/aws-cloud-financial-management/tag/finops/
I have over eight years of experience optimizing AWS cloud costs by up to 50% for multiple companies. I am an expert in servers, managed databases, and right-sizing.
I work through the three FinOps phases:
1. The information phase gives the business complete visibility.
2. The optimization phase kick-starts the savings.
3. The operation phase makes cost optimization part of the business culture.
Infrastructure as Code (IaC)
https://www.delhivery.com/
I built a fully automated federated role-based cross-account IAM access through SAML for all employees based on their designation. Worked on one-click deployment using Packer, Terraform, and Jenkins declarative pipeline for immutable infrastructure for non-production environments.
Cloud Security Guardrails and Best Practices
https://www.sequoiacap.com/
1. Managing open security groups.
2. Managing public S3 buckets.
3. Leading WAF setup, including Cloudflare, Akamai, AWS WAF, and Shield.
4. Creating disaster recovery and business continuity plans.
5. Setting up AWS SSO and Cognito and enabling SAML authentication.
6. Building attribute-based access control (ABAC).
7. Managing secrets using HashiCorp Vault and AWS Secrets Manager.
8. Migrating from IAM users to IAM roles with the least privileges.
9. Defining service control policies (SCPs) for organization units and accounts.
10. Defining boundary policies for IAM users and roles.
Managed the setup and configuration of multiple AWS security services, including vulnerability management with Inspector, security alerts, threat detection using GuardDuty, evaluation configurations, and incident response with Detective.
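As an illustration of the first guardrail (open security groups), here is a minimal sketch in Python. It assumes input shaped like the `SecurityGroups` list returned by Boto3's `describe_security_groups`; the function name `find_open_ingress` and the sample data are hypothetical and do not represent the tooling used on these projects.

```python
from typing import Iterable

# CIDR blocks that mean "reachable from anywhere".
OPEN_CIDRS = {"0.0.0.0/0", "::/0"}


def find_open_ingress(security_groups: Iterable[dict]) -> list:
    """Return (group_id, cidr, from_port) for every ingress rule open to the world."""
    findings = []
    for group in security_groups:
        for rule in group.get("IpPermissions", []):
            # Collect IPv4 and IPv6 ranges attached to this rule.
            cidrs = [r.get("CidrIp") for r in rule.get("IpRanges", [])]
            cidrs += [r.get("CidrIpv6") for r in rule.get("Ipv6Ranges", [])]
            for cidr in cidrs:
                if cidr in OPEN_CIDRS:
                    findings.append((group["GroupId"], cidr, rule.get("FromPort")))
    return findings


# Hypothetical sample data, not taken from any real account.
sample_groups = [
    {"GroupId": "sg-web", "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}], "Ipv6Ranges": []}]},
    {"GroupId": "sg-db", "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 5432, "ToPort": 5432,
         "IpRanges": [{"CidrIp": "10.0.0.0/16"}], "Ipv6Ranges": []}]},
]

print(find_open_ingress(sample_groups))
```

Keeping the check as a pure function over dictionaries means it can be tested without AWS credentials; in practice the input would come from an API call or an exported inventory.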
Containerization | Migration from EC2 to EKS for a Microservices Architecture
https://kubernetes.io/
I migrated 40 back-end APIs for PropTiger and Makaan.com from EC2-based deployment to the Kubernetes platform with Kubernetes Operations (kOps), Prometheus, and Grafana in AWS.
Cloud Migrations | Applications, Databases, and Containerization
https://www.sequoiacap.com/
I have managed the following types of migrations:
• VM and server-based deployments to containerized deployments
• Applications and databases from on-premises to the cloud
• Applications and databases from one cloud to another
• Applications and databases from one cloud account to another
• Applications and databases from one region to another
• Self-hosted applications and databases to managed services
• Databases using AWS DMS
Observability and SRE | Centralized Logging and Monitoring Systems
https://housing.com
I used the Prometheus stack, including Grafana, Alertmanager, and Loki, for monitoring in PropTiger. I also built on-call rotation and PagerDuty processes for multiple companies and helped them improve the uptime and reliability of their applications.
I defined SLIs, SLOs, SLAs, and error budgets for applications and architected end-to-end traceability into the systems to implement them with the respective development teams. Finally, I performed PostgreSQL and MySQL database parameter tuning for multiple companies.
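To make the error-budget idea concrete, here is a minimal sketch of the arithmetic, assuming a purely time-based availability SLO. The function names are hypothetical, and the two sample targets reuse the 99.86% and 99.99% uptime figures mentioned earlier in this profile only as convenient inputs.

```python
MINUTES_PER_DAY = 24 * 60


def error_budget_minutes(slo: float, period_days: int = 30) -> float:
    """Allowed downtime in minutes for an availability SLO over a period."""
    if not 0.0 < slo <= 1.0:
        raise ValueError("slo must be a fraction in (0, 1]")
    return (1.0 - slo) * period_days * MINUTES_PER_DAY


def budget_remaining(slo: float, downtime_minutes: float, period_days: int = 30) -> float:
    """Fraction of the error budget still unspent (negative once overspent)."""
    budget = error_budget_minutes(slo, period_days)
    if budget == 0:
        raise ValueError("a 100% SLO leaves no error budget")
    return 1.0 - downtime_minutes / budget


# Compare the yearly downtime allowance of two availability targets.
for slo in (0.9986, 0.9999):
    minutes = error_budget_minutes(slo, period_days=365)
    print(f"{slo:.2%} over 365 days allows about {minutes:.1f} minutes of downtime")
```

Request-based SLIs (good requests divided by total requests) follow the same pattern, with the budget counted in failed requests rather than minutes.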
Sticker | Python CLI Tool AWS Tagging Governance
https://www.sequoiacap.com/
Helped with tagging governance, cloud infrastructure architecture, cost optimization, and cloud security best practices for Asia-based Sequoia Capital portfolio companies, including Pentester Academy, Checkbox, Enterpret, and FlowAccount.
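A tagging-governance check of the kind described here can be sketched as follows. This is not the Sticker tool itself: the required tag keys, function names, and inventory are all hypothetical, and the resources are plain dictionaries rather than live AWS API responses.

```python
# Hypothetical tagging policy: every resource must carry these keys.
REQUIRED_TAGS = ("Owner", "Environment", "CostCenter")


def missing_tags(tags: dict, required=REQUIRED_TAGS) -> list:
    """Return required keys that are absent or have an empty value."""
    return [key for key in required if not tags.get(key, "").strip()]


def compliance_report(resources: dict) -> dict:
    """Map non-compliant resource IDs to their missing keys and compute a compliance %."""
    gaps = {}
    for resource_id, tags in resources.items():
        missing = missing_tags(tags)
        if missing:
            gaps[resource_id] = missing
    total = len(resources)
    percent = 100.0 * (total - len(gaps)) / total if total else 100.0
    return {"non_compliant": gaps, "percent_compliant": percent}


# Hypothetical inventory: resource ID -> tag dictionary.
inventory = {
    "i-001": {"Owner": "platform", "Environment": "prod", "CostCenter": "42"},
    "i-002": {"Owner": "data", "Environment": ""},
}

print(compliance_report(inventory))
```

A real CLI would typically load the inventory from the cloud provider's APIs and could apply default tags to close the gaps it reports.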
Education
Bachelor's Degree in Information Technology
College of Engineering (COER), Roorkee - Uttarakhand, India
High School Degree in Physics, Chemistry, and Mathematics
Janta Inter College Rudrapur - Uttarakhand, India
Certifications
Certified Platform Administrator Associate | CloudHealth
VMware
Skills
Libraries/APIs
Terragrunt, Node.js, Thanos
Tools
Terraform, Jenkins, ELK (Elastic Stack), AWS IAM, Amazon CloudWatch, Git, HashiCorp, Amazon Virtual Private Cloud (VPC), Boto 3, Amazon CloudFront CDN, PyCharm, Amazon Elastic Container Registry (ECR), Amazon Firewall, Observability Tools, Google Kubernetes Engine (GKE), VPN, Azure Key Vault, Ansible, Jira, Kubernetes HorizontalPodAutoscaler (HPA), AWS CodeCommit, Chef, Helm, Shell, Apache Maven, AWS CodeBuild, NGINX, Amazon Elastic MapReduce (EMR), AWS ELB, RabbitMQ, Apache Solr, Grafana, Kibana, SaltStack, Vault, GitHub, Packer, Amazon Cognito, Amazon Elastic Container Service (ECS), Amazon EKS, Amazon Elastic Block Store (EBS), Amazon Simple Notification Service (SNS), Amazon Simple Queue Service (SQS), Fluentd, AWS CloudFormation, Bitbucket, Istio, SonarQube, GitLab CI/CD, Docker Hub, GitLab, AWS CodeDeploy, Azure Kubernetes Service (AKS), Azure App Service, Closure Compiler, Amazon ElastiCache, PyPI, AWS Glue, VMware
Languages
Python 3, Java, Python, Bash, Python 2, Bash Script, Groovy, JavaScript
Paradigms
DevOps, Management, DevSecOps, Serverless Architecture, Automation, Microservices, DDoS, Event-driven Architecture, API/Services Architecture, Role-based Access Control (RBAC), Lambda Architecture, Continuous Integration (CI), Continuous Deployment, Unit Testing, Enterprise Application Architecture, Continuous Delivery (CD), Azure DevOps, Microservices Architecture
Platforms
Kubernetes, Amazon Web Services (AWS), Amazon EC2, AWS ALB, Linux, Docker, CentOS, Kubeflow, AWS Lambda, OpenStack, Google Cloud Platform (GCP), WordPress, Apache Kafka, Azure, Jupyter Notebook
Industry Expertise
Project Management, Network Security
Frameworks
Flask, Ruby on Rails (RoR), Django, Crossplane
Storage
PostgreSQL, Amazon Aurora, Redis Cache, MongoDB, MySQL, Database Architecture, Amazon S3 (AWS S3), Elasticsearch, AWS Elastic File System, Amazon DynamoDB, Redis, Azure Cosmos DB
Other
Cost Management, Governance, Amazon RDS, Cost Reduction & Optimization (Cost-down), Cloud, Containerization, Infrastructure as Code (IaC), Security, Identity & Access Management (IAM), Container Orchestration, Relational Database Services (RDS), IT Project Management, IT Projects, Agile DevOps, Networking, Cloud Computing, Architecture, CTO, Web Applications, Elastic Load Balancers, Linux Server Administration, DevOps Engineer, Monitoring, Lambda Functions, System Administration, Amazon Route 53, Content Delivery Networks (CDN), SSL, Scaling, System Architecture, Scalability, Growth, Document Management Systems (DMS), HTTP, Argo CD, Solution Architecture, Cloud Infrastructure, Compliance, Cloud Security, CI/CD Pipelines, Team Leadership, GitHub Actions, DNS Debugging, Load Balancers, Webhooks, Telemetry, Shell Scripting, Web Security, Enterprise Architecture, DNS, API Gateways, Large Scale Distributed Systems, APIs, Machine Learning Operations (MLOps), Google BigQuery, Azure Cloud Security, Site Reliability Engineering (SRE), Cloud Architecture, Jira Administration, Serverless, Cloudflare, Prometheus, Kubernetes Operations (kOps), OpenTelemetry, AWS Cloud Architecture, Cloud Migration, Migration, Server Migration, Elastic APM, Grafana 2, Single Sign-on (SSO), Amazon GuardDuty, Amazon Inspector, AWS Security Hub, AWS DevOps, Pulumi, AWS CodePipeline, Amazon API Gateway, AWS Transit Gateway, Service Meshes, Gunicorn, Data Analytics, Azure Virtual Machines, Root Cause Analysis, GitOps, Ingress, Teleport, Zoho, Firewalls, AWS Cloud Security, Jaeger, Kiali