Saad Ali, Developer in Lahore, Punjab, Pakistan

Saad Ali

Verified Expert in Engineering

DevOps Engineer and Developer

Location
Lahore, Punjab, Pakistan
Toptal Member Since
October 3, 2022

With 11+ years of experience in multiple roles, Saad is passionate about designing, deploying, and managing scalable, reliable, and secure cloud infrastructure for web applications and services. He has extensive experience with AWS, CloudFormation, Terraform, Ansible, Jenkins, ArgoCD, Docker, and Kubernetes, as well as various tools and frameworks for monitoring, logging, security, and automation. Saad is also CKAD certified and consistently follows established best practices in his work.

Availability

Part-time

Preferred Environment

Terraform, Linux, Amazon Web Services (AWS), NGINX, Bash, Ansible, AWS CloudFormation, Kubernetes, Amazon EKS, Argo CD

The most amazing...

...thing I've contributed to is the open-source Argo CD Helm chart, with a ConfigMap that allows you to add the ConfigManagementPlugin configuration.

Work Experience

Senior Site Reliability Engineer

2024 - PRESENT
Arbisoft
  • Provided DevOps best practices and recommendations to Edly projects.
  • Used Tutor for deploying the Open edX platform in a containerized environment.
Technologies: Amazon CloudWatch, Amazon EC2, Amazon EKS, Amazon Route 53, Amazon S3 (AWS S3), Amazon Web Services (AWS), Ansible, Apache2, AppArmor, Terraform, AWS CloudFormation, Kubernetes, Argo CD, Argo Workflows, Jenkins, Jenkins Job DSL, Jenkins Pipeline, GoCD, Bash, Python, Docker, ELK (Elastic Stack), Fluentd, NGINX

DevOps Engineer and Team Lead | Site Reliability Engineer | Senior Site Reliability Engineer

2018 - PRESENT
Arbisoft (Pvt) Ltd
  • Built and led a team of 20 DevOps engineers between December 2018 and May 2021 before being promoted to site reliability engineer at edX, 2U.
  • Decreased the Open edX deployment time from two weeks to two days through automation using Jenkins, Ansible, and Bash scripts.
  • Enforced a highly available, scalable, and self-healing infrastructure through CloudFormation for Open edX deployments.
  • Monitored Open edX deployments using CloudWatch, Datadog, and New Relic.
  • Reduced the dependency on Splunk licenses by utilizing the Elasticsearch, Fluentd or Fluent Bit, and Kibana stack for the log aggregation of Open edX and other Python applications.
  • Used AWS Lambda functions for scheduled jobs on EC2 instances and routine cleanups of AMIs.
  • Implemented a Jenkins master with an EKS slave cluster.
  • Incorporated various automation tools to facilitate a CI/CD process for multiple projects at Arbisoft.
Technologies: Shell Scripting, Python, Bash, Amazon Web Services (AWS), AWS CloudFormation, Terraform, Elasticsearch, Kibana, Fluentd, Jenkins, Open edX, AppArmor, Auto-scaling Cloud Infrastructure, High Availability Disaster Recovery (HADR), Jenkins Pipeline, AWS Lambda, AWS CodeBuild, GitHub Actions, Kubernetes, Amazon EKS, Iptables, Django, Flask, Argo CD, Ansible, MongoDB, Amazon CloudWatch, Linux, MySQL, NGINX, DevOps, CI/CD Pipelines, Amazon S3 (AWS S3), Site Reliability Engineering (SRE), Amazon RDS, Amazon EC2, Amazon Route 53, AWS DevOps, Infrastructure as Code (IaC), Orchestration, Scalability, Load Balancers, Continuous Integration (CI), Continuous Delivery (CD), Docker, Ubuntu

Senior DevOps Engineer

2024 - 2024
Unifonic
  • Created a new RabbitMQ Helm chart that uses the RabbitMQ Operator to launch a new cluster in Kubernetes and manage custom RabbitMQ resources.
  • Completed an analysis for upgrading EKS from 1.24 to 1.29.
  • Utilized Terraform to launch new Kubernetes cluster node pools in the environment as needed.
Technologies: Amazon EC2, Amazon EKS, Amazon RDS, Amazon Route 53, Ansible, Amazon Web Services (AWS), Argo CD, Amazon S3 (AWS S3), Vault, Oracle Cloud, Python, Bash, Kubernetes, Prometheus, Amazon OpenSearch, Apache Kafka, Terraform, Drone CI, Apache Cassandra, Linux, RabbitMQ, Helm

Site Reliability Engineer

2021 - 2023
edX
  • Updated application code in open-source Open edX Git repositories to make the Django user management command more generic for other Open edX Django applications.
  • Containerized the in-house ChatOps application previously hosted on AWS Lambda to run it in a Kubernetes cluster.
  • Wrote an Argo CD deployment pipeline for ChatOps and other internal applications in GoCD.
  • Used the Jenkins Job DSL plugin with Groovy to define Jenkins jobs as code.
  • Managed the infrastructure at scale using Terraform.
  • Reduced manual toil by automating recurring tasks as Jenkins or GoCD jobs that run scripts.
Technologies: Shell Scripting, Python, Bash, Amazon Web Services (AWS), Terraform, Splunk, Jenkins, Open edX, AppArmor, Auto-scaling Cloud Infrastructure, High Availability Disaster Recovery (HADR), Jenkins Job DSL, GitHub Actions, Kubernetes, Amazon EKS, Docker, Django, Argo CD, GoCD, Ansible, MongoDB, Amazon CloudWatch, Linux, MySQL, NGINX, Jenkins Pipeline, DevOps, CI/CD Pipelines, Amazon S3 (AWS S3), Site Reliability Engineering (SRE), Amazon RDS, Amazon EC2, Amazon Route 53, AWS DevOps, Infrastructure as Code (IaC), Orchestration, Scalability, Load Balancers, Continuous Integration (CI), Continuous Delivery (CD), Ubuntu

Senior Network Systems Engineer | Associate NOC Manager

2014 - 2016
Nextbridge (Pvt) Ltd
  • Led the team as an associate network operations center (NOC) manager in 2016.
  • Improved the inbound email reception of the company's self-hosted email service by implementing a backup mail exchange (MX).
  • Replicated the email storage to a secondary IMAP server in the same email system, removing a single point of failure.
  • Implemented anti-spoofing measures in the email system, effectively preventing email spoofing.
  • Automated FreePBX outbound call reports via Python to keep track of phone calls.
  • Deployed web applications in an auto-scaling environment using AWS Developer Tools.
  • Incorporated OpenVAS for external vulnerability assessment.
  • Used MySQL Galera replication in AWS for an eCommerce business required to keep data in the US, Germany, and Australia.
  • Increased the security of the eCommerce legacy application built in PHP through AppArmor.
Technologies: SMTP, IMAP, High Availability Disaster Recovery (HADR), FreePBX, Python, Bash, Auto-scaling Cloud Infrastructure, Vulnerability Assessment, MySQL, Database Replication, AppArmor, Ansible, Amazon CloudWatch, Docker, Linux, Amazon Web Services (AWS), Apache2, NGINX, Postfix, Dovecot, Iptables, Amazon S3 (AWS S3), Amazon RDS, AWS DevOps, Scalability, Load Balancers

System Administrator

2013 - 2013
Happy Hosts
  • Managed, maintained, and troubleshot servers for shared and dedicated web and email hosting.
  • Utilized Puppet for automating administrative tasks.
  • Implemented MySQL replication for ISP services that depended on it, such as PowerDNS.
Technologies: Parallels Plesk Panel, Web Servers, PowerDNS, MySQL, Database Replication, Puppet, Linux, Apache2, NGINX, Iptables, Amazon S3 (AWS S3)

Associate Network Systems Engineer

2012 - 2013
Nextbridge (Pvt) Ltd
  • Implemented cloud infrastructure for various customers on AWS and Rackspace Cloud.
  • Enforced disaster recovery plans for the company and many of its customers.
  • Managed Subversion, Git, VMware ESXi and Proxmox hypervisors, VPN, LAMP or LEMP stacks, Ruby on Rails (RoR), and email servers.
  • Utilized Nagios for monitoring company and customer nodes.
  • Executed MySQL Master-Master replication for a set of applications deployed in multiple physical locations.
Technologies: Amazon Web Services (AWS), MySQL, Apache2, LAMP Server, NGINX, VPN, Nagios, VMware ESXi, Proxmox, Rackspace Cloud, Database Replication, Postfix, Dovecot, Disaster Recovery Plans (DRP), Linux, Iptables, Amazon S3 (AWS S3), Amazon RDS, AWS DevOps

System Administrator

2012 - 2012
Self-employed
  • Increased a boot server's availability using Oracle Enterprise Linux 5.7 and Oracle Clusterware for diskless client machines.
  • Used Debian Linux kernel patched with the Kerrighed single-system image for the client machines.
  • Booted the diskless systems as part of a cluster, a single system leveraging the combined processing power of all machines to do various tasks. The setup was used in the research for breaking RSA encryption.
Technologies: Oracle Linux, Kerrighed SSI, High Availability Disaster Recovery (HADR), Linux, Amazon S3 (AWS S3)

System Administrator

2010 - 2011
CoreZee (Pvt) Ltd
  • Assisted development teams with resolving package dependencies on Linux and BSD systems.
  • Tested software built for Linux- or BSD-based embedded system boards from Cavium Networks.
  • Built a testing scenario for an intrusion prevention system, a network security product.
  • Utilized the Scapy packet manipulation framework to build and test exploits on the Snort rule set.
  • Reduced the time it takes to package a release on a compact-flash card through Bash scripts.
Technologies: Shell Scripting, BSD, Intrusion Detection Systems (IDS), Intrusion Prevention Systems (IPS), Network Exploitation, IT Automation, Linux, Amazon S3 (AWS S3)

MIT Open Learning Library

https://openlearninglibrary.mit.edu/about
Deployed Open edX for the MIT Open Learning Library while working at Arbisoft. I wrote the entire infrastructure in CloudFormation and monitored it using CloudWatch. I deployed and scheduled jobs using the github.com/edx/configuration Ansible repository and Jenkins declarative pipelines.

Philanthropy University

https://www.philanthropyu.org/
Deployed Open edX with custom NodeBB integration for Philanthropy University while working at Arbisoft. I wrote the entire infrastructure in CloudFormation and monitored it using CloudWatch. I deployed and scheduled jobs using the github.com/edx/configuration Ansible repository and Jenkins declarative pipelines.

UC San Diego Online

https://online.ucsd.edu
Deployed Open edX for UC San Diego while working at Arbisoft. I wrote the entire infrastructure in CloudFormation and deployed and scheduled jobs using the github.com/edx/configuration Ansible repository and Jenkins declarative pipelines.

Education

2009 - 2010

Master of Engineering in Communication Systems and Networks

Mehran University of Engineering and Technology - Jamshoro, Sindh, Pakistan

2005 - 2008

Bachelor of Engineering in Computer Systems

Mehran University of Engineering and Technology - Jamshoro, Sindh, Pakistan

Certifications

MAY 2023 - MAY 2026

CKAD: Certified Kubernetes Application Developer

The Linux Foundation

Libraries/APIs

Jenkins Pipeline, Jenkins Job DSL

Tools

Jenkins, NGINX, Postfix, Ansible, AWS CloudFormation, Terraform, AWS CodeBuild, Iptables, VPN, Nagios, Parallels Plesk Panel, Puppet, FreePBX, AppArmor, Kibana, Fluentd, Amazon EKS, Splunk, Amazon CloudWatch, Vault, Amazon OpenSearch, RabbitMQ, Helm, ELK (Elastic Stack)

Paradigms

DevOps, Continuous Integration (CI), Continuous Delivery (CD)

Languages

Bash, Python

Platforms

Linux, Amazon Web Services (AWS), Open edX, Kubernetes, Docker, Amazon EC2, BSD, Oracle Linux, Apache2, Proxmox, Rackspace Cloud, AWS Lambda, Ubuntu, Apache Kafka, Drone CI

Frameworks

Django, Flask

Storage

MongoDB, Amazon S3 (AWS S3), MySQL, LAMP Server, Database Replication, Auto-scaling Cloud Infrastructure, Elasticsearch, Oracle Cloud

Other

CI/CD Pipelines, Scalability, Load Balancers, Networks, Shell Scripting, High Availability Disaster Recovery (HADR), Dovecot, GitHub Actions, Argo CD, GoCD, Site Reliability Engineering (SRE), Amazon RDS, Amazon Route 53, Infrastructure as Code (IaC), Orchestration, Programming, Intrusion Detection Systems (IDS), Intrusion Prevention Systems (IPS), Network Exploitation, IT Automation, Kerrighed SSI, VMware ESXi, Disaster Recovery Plans (DRP), Web Servers, PowerDNS, SMTP, IMAP, Vulnerability Assessment, AWS DevOps, Prometheus, Apache Cassandra, Argo Workflows
