
Alexandre Souza
Verified Expert in Engineering
DevOps Engineer and Developer
Campinas - State of São Paulo, Brazil
Toptal member since June 7, 2019
Alexandre is a senior site reliability engineer (SRE) and AWS Solutions Architect with 15+ years of experience. He leads large-scale cloud transformations that improve operational efficiency and reliability and reduce costs. Specializing in AWS, Kubernetes, Terraform, and DevOps, he excels at implementing robust security measures for SOC 2 compliance and shortening deployment times. Alexandre is a leader committed to architecting scalable cloud solutions and fostering team growth and success.
Experience
- DevOps - 15 years
- Site Reliability Engineering (SRE) - 5 years
- AWS CloudFormation - 5 years
- Terraform - 5 years
- AWS Certified Solutions Architect - 3 years
- Amazon EKS - 1 year
Preferred Environment
Cloud9, PyCharm, IntelliJ IDEA, Visual Studio Code (VS Code), Git, Linux, Amazon Web Services (AWS)
The most amazing...
...project I've implemented was an infrastructure-as-code (IaC) solution that decreased the environment creation time from one week to 45 minutes.
Work Experience
AWS Platform Engineer
SumerSports
- Architected end-to-end Databricks deployment on AWS using Terraform, automating workspace provisioning, implementing custom authentication, and reducing setup time by 80%.
- Designed and implemented dynamic infrastructure generation using Terramate, enabling consistent deployment across 50+ environments/stacks while reducing code duplication by 70%.
- Enhanced GitHub Actions CI/CD pipelines for application and infrastructure deployment, achieving a 99.9% automation success rate and reducing deployment time from hours to minutes.
- Developed automated Terraform drift detection system using GitHub Actions, ensuring 100% infrastructure compliance and reducing configuration drift incidents by 95%.
- Created a comprehensive IaC framework using Terramate stacks for complete AWS organization management.
- Implemented multi-account architecture with uniform security controls.
- Configured enterprise networking utilizing virtual private cloud (VPC), transit gateway, and VPC peering.
- Implemented database infrastructure (RDS) with automated backup and failover mechanisms.
- Configured Kubernetes clusters with automated scaling and security policies.
- Implemented centralized logging and audit infrastructure utilizing CloudTrail.
DevSecOps Lead
Stephen M Childers
- Led implementation of Google Cloud Speech AI integration for mobile applications, enabling user voice recognition capabilities and improving user engagement.
- Developed proof-of-concept solution for Google Cloud Speech APIs, reducing integration complexity and validating feasibility for enterprise-scale voice processing requirements.
- Created comprehensive technical documentation and training program for the development team, reducing API integration time and enabling successful deployment of voice features across multiple app versions.
Cloud Engineer
Syngenta
- Authored comprehensive technical analysis of Aurora MySQL Serverless performance bottlenecks, delivering optimization strategies that improved database performance and reduced costs.
- Led security assessment of EKS infrastructure and application deployments, implementing recommendations that enhanced pod security posture by 80% and achieved zero critical vulnerabilities in production.
- Designed innovative API rate limiting solution combining AWS WAFv2, CloudFront functions, and JWT token validation, securing high-traffic applications of 100,000+ daily requests while maintaining sub-100ms latency.
- Architected scalable integration between API Gateway, Lambda functions, and Kubernetes services using WAFv2, reducing unauthorized access attempts by 95% while enabling granular traffic control.
SRE | Tour/Activity Tool Service Provider
FareHarbor
- Architected and implemented enterprise-wide cloud infrastructure using Terraform and Python, reducing operational overhead by 60% and achieving 99.99% infrastructure reliability.
- Developed comprehensive security-focused Terraform modules ecosystem for AWS services (WAFv2, Config, Backup, GuardDuty, and Firehose), enabling standardized security controls across 100+ applications.
- Established centralized security and audit infrastructure, including CloudTrail, password management, event notification, and Athena-based security analytics, achieving SOC 2 compliance requirements.
- Optimized a containerization strategy by reducing Docker image sizes by 70% and implementing multi-architecture support, improving deployment efficiency and reducing cloud costs by $150,000 annually.
- Managed mission-critical Kubernetes clusters serving 1+ million daily users, maintaining 99.999% uptime through advanced automation and monitoring.
- Implemented robust WAFv2 security architecture, including custom rule groups and IP management, blocking 99.9% of malicious traffic while maintaining zero false positives.
- Led observability infrastructure managing TeamCity, New Relic, Splunk, ELK Stack, and Sentry, reducing MTTR by 75% through centralized logging and monitoring.
DevOps Lead | Migration Expert
Veea (via Toptal)
- Led large-scale containerization initiative, migrating 50+ applications from Mesos/Marathon and EC2 to Amazon EKS (Kubernetes), resulting in 60% improved resource utilization and 40% reduced deployment time.
- Directed team of SRE and DevOps engineers in implementing a comprehensive monitoring strategy, achieving 90% faster incident detection and 99.99% service availability through enhanced metrics and intelligent alerting.
- Architected enterprise-wide secrets management migration to AWS Secrets Manager, eliminating security risks of Git-stored configs and achieving SOC 2 compliance requirements.
- Developed standardized Jenkins Pipeline Libraries using convention over configuration principles, reducing pipeline maintenance by 70% and enabling self-service deployments across all environments.
- Drove cloud cost optimization initiatives resulting in $200,000+ annual savings through strategic instance right-sizing and service optimization.
- Implemented automated security scanning and compliance monitoring, reducing security vulnerabilities by 80% and achieving zero critical findings in production.
- Enhanced application availability by designing multi-region Kubernetes deployment strategies with automated failover, achieving 99.999% uptime across critical services.
- Served as a principal technical consultant for DevSecOps, Kubernetes, and AWS, enabling the successful delivery of 30+ new applications and services across multiple teams.
AWS CloudFormation Expert
BAZZE & COMPANY (via Toptal)
- Managed critical CloudFormation infrastructure powering 100+ AWS resources, ensuring 99.9% uptime and consistent configuration management across production environments.
- Spearheaded migration to AWS CDK, automating infrastructure provisioning for 50+ services and reducing deployment time by 70% through TypeScript-based infrastructure definitions.
- Led comprehensive infrastructure modernization initiative by migrating legacy resources to CDK, achieving 100% infrastructure as code coverage and reducing configuration drift to near zero.
- Implemented centralized IaC repository and resource inventory system, improving resource tracking efficiency by 80% and enabling automated compliance monitoring across all AWS accounts.
Senior CI/CD Engineer
Code Particle (via Toptal)
- Architected and maintained enterprise CloudFormation templates managing $1+ million annual cloud infrastructure, achieving a 99.99% successful deployment rate and 60% faster resource provisioning.
- Modernized an application deployment strategy by implementing rolling updates and blue/green deployments, reducing deployment downtime by 90% and enabling four times more frequent releases with zero customer impact.
- Led migration of legacy infrastructure to Terraform, successfully importing and managing 200+ AWS resources while implementing standardized IaC practices across the organization.
- Designed and implemented automated MongoDB Atlas cluster provisioning using Terraform, reducing database deployment time from days to minutes and ensuring consistent configuration across 20+ environments.
Systems Engineer
Benetech (via Toptal)
- Managed and optimized Chef infrastructure automation across 100+ EC2 instances, implementing best practices that reduced configuration drift by 85% and improved system reliability.
- Streamlined Nginx configurations for high-traffic web applications, achieving 40% improved response times and 99.99% uptime across production environments.
- Redesigned CI/CD pipelines and implemented automated testing and deployment strategies, reducing deployment failures by 75% and cutting deployment time from hours to minutes.
- Enhanced AWS infrastructure observability by developing a comprehensive CloudWatch monitoring system, including custom alarms and dashboards that reduced mean time to recovery (MTTR) by 60% and improved incident response.
Systems Architect
Daitan
- Designed and implemented enterprise-grade CI/CD pipeline for front-end applications, reducing deployment time by 70% and enabling 200+ successful deployments monthly.
- Architected multi-cloud infrastructure provisioning system supporting AWS and GCP, leveraging Python, AWS CloudFormation, and custom deployment tools to ensure 99.9% infrastructure reliability.
- Led the development of scalable IaC solutions using Terraform, resulting in 80% faster environment provisioning and standardized infrastructure across 50+ services.
- Provided technical leadership and mentorship to a team of five SRE and DevOps engineers, implementing performance metrics and coaching programs that reduced incident response time by 40% and improved feature delivery velocity.
- Created a comprehensive technical documentation framework and training program for SRE and DevOps teams, creating structured learning paths that reduced onboarding time from six weeks to two weeks and increased team expertise across all skill levels.
Systems Specialist
iFood
- Maintained critical systems, fixing performance and stability problems.
- Created a blue/green deployment process and application for the AWS-hosted company systems with configuration options for canary deployment.
- Led the API Gateway implementation initiative based on Kong. The initiative involved all development teams, and I was responsible for communicating its benefits and managing each team's delivery schedule.
- Planned and executed load and stress tests on company applications to identify bottlenecks and verify performance improvements.
- Maintained Terraform-based infrastructure as code (IaC) solutions.
- Maintained Chef-based configuration management solutions.
Systems Architect
Daitan
- Led cross-functional teams in adopting DevOps culture and implementing best practices that improved infrastructure-as-code (IaC) development quality and decreased deployment time.
- Automated DevOps procedures by creating applications for business rules handling, AWS CloudFormation and Deployment Manager (GCP) template generation, and cloud environment orchestration.
- Cut environment creation time from five days of manually invoked scripts to 45 minutes with a single script that orchestrates the entire process.
- Spearheaded the creation of Jenkins pipelines for unit and integration tests of both the environment creation procedures and the infrastructure of the environments, enabling infrastructure-as-code acceptance tests.
- Implemented Docker environments to enable more parallelized infrastructure testing.
- Executed infrastructure migration projects from a high-availability setup (all services ran on all instances) to a clustered environment (each service runs on its own set of machines), using SaltStack as the configuration manager.
- Fixed problems in legacy infrastructure scripts/procedures.
- Architected the use of AWS and GCP cloud resources in projects for new environment features.
- Implemented changes in cloud environments to improve performance and optimize costs.
Senior Performance Analyst
Inmetrics S/A
- Oversaw software incident analysis and created root-cause reports.
- Developed and installed monitoring solutions based on Zabbix, customizing scripts and plugins to provide custom monitoring.
- Worked on operating systems: Windows Server, Linux, HP-UX, and IBM AIX.
- Worked on application servers: JBoss EAP 5 and 6, WebSphere 5 and 6, WebLogic 10 and 11.
Systems Architect
Lumis EIP
- Designed and implemented infrastructure solutions for network administrators (Windows Server 2003 and 2008, Linux, and Solaris) and DBAs (SQL Server 2000, 2005, and 2008; Oracle 9i, 10g, and 11g; DB2; and MySQL 5).
- Designed and implemented infrastructure solutions for web administrators (Java/Tomcat, JBoss, WebSphere, WebLogic, and IIS).
- Developed web solutions, automated procedures, and statistical reports using Java, Groovy, Python, Ruby, JavaScript, SQL, Shell, and .NET Core.
- Improved batch and web solutions for high availability and performance requirements.
- Administered Lumis Portal CMS. Built and improved software solutions on Windows and Linux operating systems, Java servers, HTTP servers, cache servers, and other back-end applications for the r7.com website.
- Built provisioning, orchestration, and deployment solutions for Linux and Windows servers, using build and configuration management software like Chef, Puppet, Capistrano, Ant, Maven, Jenkins, and Nexus.
- Designed highly available and scalable cloud solutions on AWS and Azure.
- Administered Lumis Portal CMS. Handled upgrades, code bugs, and WebLogic and Oracle 11 problems.
- Administered Lumis Portal CMS. Established the development process. Handled upgrades, code bugs, and WebSphere, JBoss, networking, Oracle, and PostgreSQL problems. Improved build and deployment management at SulAmérica.
Experience
Pod Provision
Monitoring Project
• Servers (using auto-deploy)
• Networking (using network devices auto-discovery)
• VoIP
• Links
• CloudWatch integration
API Gateway
Blue/Green Deployment
Education
Master's Degree in Computer Engineering
Universidade Estácio de Sá - Rio de Janeiro, Brazil
Certifications
AWS Certified Solutions Architect – Professional
Amazon Web Services Training and Certification
AWS Certified Solutions Architect – Associate
Amazon Web Services Training and Certification
Skills
Libraries/APIs
jQuery, Hystrix, AWS Amplify
Tools
Amazon Elastic Container Service (ECS), Jenkins, Apache HTTP Server, NGINX, Boto, Boto 3, AWS CLI, AWS ELB, AWS SDK, Amazon Elastic Container Registry (ECR), Google Compute Engine (GCE), Google Kubernetes Engine (GKE), Amazon EKS, Amazon Simple Email Service (SES), AWS CloudFormation, Amazon CloudFront CDN, Terraform, SaltStack, Amazon Elastic Block Store (EBS), GitLab CI/CD, MongoDB Atlas, Grafana, Git, IntelliJ IDEA, PyCharm, Chef, Apache Maven, Zabbix, Kong, Bitbucket, Amazon Virtual Private Cloud (VPC), AWS IAM, AWS Cloud Development Kit (CDK), AWS Glue, AWS Batch, AWS Fargate, AWS AppSync, CircleCI, TeamCity, Istio
Languages
Python 3, Java, JavaScript, TypeScript, SQL, Python, Bash, Perl
Frameworks
Flask, AWS HA, Django, Spring, Spring Core, Spring Boot, Ant Design, JSON Web Tokens (JWT)
Paradigms
DevOps, Automated Testing, Continuous Integration (CI), Agile, Scrum, Kanban, DevSecOps
Platforms
Amazon EC2, Amazon Web Services (AWS), Kubernetes, Java EE, AWS Lambda, Linux, Visual Studio Code (VS Code), Windows Server, Nexus, AIX, JBoss EAP, WebSphere, Google Cloud Platform (GCP), Docker, AWS ALB, AWS NLB, Opsgenie, HP-UX, AWS Security Token Service (STS), AWS IoT, New Relic
Storage
Amazon S3 (AWS S3), Amazon EFS, PostgreSQL, Amazon DynamoDB, Datadog, SQL Server 2000, SQL Server 2005, SQL Server 2010, SQL Server 2008 R2, Amazon Simple Workflow Service (SWF), AWS Snowball
Other
CI/CD Pipelines, Sanic Web Server, Cloud9, AWS DevOps, Site Reliability Engineering (SRE), Cloud, AWS Certified Solutions Architect, Identity & Access Management (IAM), Cloud Services, Cloud Architecture, Cloud Infrastructure, Containers, GitHub Actions, Spring Cloud, WebLogic, Cloud Security, GRC, SOC 2, Networking, Content Management, Telephony, ECS, Amazon Route 53, Slackbot, AWS SSH Keys, AWS Secrets Manager, AWS VPN, Relational Database Services (RDS), Web Application Firewall (WAF), Amazon RDS, APM, Application Security, Web Security, Software Testing Lifecycle (STLC), Data Feeds, Web Platforms, Terramate