Guido is available for hire

Guido Iaquinti

Verified Expert in Engineering

Software Developer

Location

Barcelona, Spain

Toptal Member Since

September 5, 2018

Guido is a system engineer with an academic background and experience in high-volume/high-availability internet architectures. He's a technology enthusiast excited about open-source software. His passion is to develop, scale, and automate complex systems.

Linux, Cloud Architecture, Prometheus, Git, Python, Amazon Web Services (AWS), MySQL, Terraform, Memcached, Redis, Elasticsearch, Docker, DevOps, Grafana, Google Cloud Platform (GCP)

Portfolio

PostHog

Amazon Web Services (AWS), Python, Kubernetes, Google Cloud Platform (GCP)...

Slack Technologies

Amazon Web Services (AWS), Python, Redis, Memcached, MySQL, Go, Linux, Git...

Slack Technologies

Amazon Web Services (AWS), Redis, Memcached, MySQL, Python, Go, Linux...

Experience

Linux - 20 years, Python - 15 years, Git - 15 years, Amazon Web Services (AWS) - 10 years, MySQL - 10 years, Terraform - 8 years, Google Cloud Platform (GCP) - 5 years, Memcached - 5 years

Availability

Full-time

Preferred Environment

Linux, Git, Cloud

The most amazing...

...thing I've done was to build Slack's engineering presence in EMEA. In the first year, I effectively operated Slack by myself for nine hours a day.

Work Experience

Software Engineer

2021 - 2023

PostHog

Redesigned and rebuilt the infrastructure from the ground up to adhere to operational and security best practices.
Designed and built PostHog Cloud EU, our GDPR-compliant SaaS.
Consolidated our self-hosted solution and PostHog Cloud deployments, migrating from AWS ECS to Kubernetes without downtime.
Helped to set the company's technical direction by advocating and implementing industry best practices regarding code standards, observability, security, and performance. Mentored and sponsored cross-team initiatives.

Technologies: Amazon Web Services (AWS), Python, Kubernetes, Google Cloud Platform (GCP), Azure, PostgreSQL, Redis, Terraform, Linux, Git, Prometheus, Team Leadership, Cloud Architecture, CI/CD Pipelines, Docker, DevOps, Site Reliability Engineering (SRE), Monitoring, Grafana, Cloud

Senior Staff Engineer | Engineer (Contract)

2019 - 2021

Slack Technologies

Worked as a datastore team member and tech lead for the distributed database system and caching tier.
Co-led the architectural design to build and deploy GovSlack datastore systems. GovSlack runs in GovCloud-certified data centers and complies with the following security standards: FedRAMP High, ITAR, FIPS 140-2, and DoD IL4.
Participated as the tech lead of the International Data Residency program. The architecture was designed, built, and deployed in record time. Data residency gives global teams more control over where customer data is stored.
Spoke at international events and conferences (KubeCon + CloudNativeCon NA 2019, PerconaLive Europe 2019, KubeCon + CloudNativeCon NA 2020, recruiting events, and meetups).

Technologies: Amazon Web Services (AWS), Python, Redis, Memcached, MySQL, Go, Linux, Git, Terraform, Prometheus, Team Leadership, Cloud Architecture, CI/CD Pipelines, Docker, DevOps, Site Reliability Engineering (SRE), Monitoring, Grafana, Cloud

Senior Engineer | Staff Engineer

2016 - 2018

Slack Technologies

Built Slack's engineering presence in EMEA.
Operated Slack nine hours a day in the first year and handled CE escalations, on-call pages for the entire infrastructure during EMEA hours, and incident command.
Tasked with the storage infrastructure: databases, caching systems, queues, and search.
Worked as a lead infrastructure engineer for Slack's next-generation database system based on Vitess (now a CNCF project). I drove the operational work needed to turn an innovative, poorly documented open source project into a system we rely on in production.
Experimented with a brand new EC2 hardware platform that provided fast NVMe storage. I dove into bugs that crashed servers and corrupted data and engaged AWS support directly to resolve these bugs and tune performance.
Worked as a key advocate for stronger data consistency in MySQL. This involved getting consensus and deploying strict SQL mode, semi-synchronous and row-based replication.
Improved database replication visibility by developing and deploying custom tools across our database fleet; our replication-related pages decreased by 58% in two quarters.
Worked as the lead infrastructure engineer for Slack's distributed service discovery, lock system, and KV store.
Worked as the lead infrastructure engineer for an internal distributed system powering features like search suggestions, team statistics, billing, etc.
Spoke at international events and conferences (PerconaLive 2018, DevOpsDays Tel Aviv 2017, recruiting events, and meetups).

Technologies: Amazon Web Services (AWS), Redis, Memcached, MySQL, Python, Go, Linux, Google Cloud Platform (GCP), Git, Terraform, Prometheus, Team Leadership, Cloud Architecture, CI/CD Pipelines, Docker, DevOps, Site Reliability Engineering (SRE), Monitoring, Grafana, Cloud

Site Reliability Engineer

2014 - 2016

Microsoft

Worked in the cloud and enterprise (C+E) division, ensuring that complex internet-facing systems were healthy, monitored, automated, and designed to scale.
Managed the overall health, performance, and capacity of one of Microsoft's biggest open source environments.
Applied deep knowledge of the application and product to solve problems across the entire application stack.

Technologies: Redis, Elasticsearch, Python, Azure, Linux, Git, Terraform, Team Leadership, Cloud Architecture, CI/CD Pipelines, Docker, DevOps, Site Reliability Engineering (SRE), Monitoring, Grafana, Cloud

Experience

Scaling Datastores at Slack with Vitess

https://slack.engineering/scaling-datastores-at-slack-with-vitess/

From the very beginning of Slack, MySQL was used as the storage engine for all our data. Slack operated MySQL servers in an active-active configuration. This is the story of how we changed our data storage architecture.

Codename VIFL | How to Migrate MySQL Database Clusters to Vitess

https://github.com/guidoiaquinti/guidoiaquinti/tree/main/presentations

KubeCon 2020, virtual conference.

Have you ever considered migrating a database system at scale with no downtime? Many of us who have tried find it an insurmountable challenge for developers and database engineers. In this talk, Rafael and Guido will discuss how they designed and built a migration framework and then executed it to move petabytes of data to Vitess with zero downtime.

Scaling Resilient Systems | A Journey Into Slack's Database Service

https://github.com/guidoiaquinti/guidoiaquinti/tree/main/presentations

KubeCon 2019, San Diego, CA, USA.

Monitoring and observability are essential concepts, especially in complex and distributed systems. Redundancy and defensive programming are also necessary, but sometimes they are not enough. Designing systems to minimize the blast radius when the unexpected happens is often the key.

In this talk, Rafael and I will share an overview of how Slack designed, built, scaled, and then iterated on its distributed database service built on top of Vitess, now a CNCF project. The databases team at Slack grew a Vitess cluster from 0 to spikes of 2.7 million queries per second. This journey has taught us how to operate a database cluster with more than 2,000 nodes, and we expect growth to more than 3,500 in the next 12 months.

Strength in Numbers | Slack’s Database Architecture

https://github.com/guidoiaquinti/guidoiaquinti/tree/main/presentations

PerconaLive Europe 2019, Amsterdam, The Netherlands.

Traditionally, database reliability has focused heavily on the stability of a single server or a small number of servers. At Slack, however, we've built a database architecture that instead focuses on an approach of strength in numbers. This talk will review the architecture we've made, specifically focusing on our Vitess infrastructure.

Designing and Launching the Next-generation Database System at Slack | From Whiteboard to Production

https://github.com/guidoiaquinti/guidoiaquinti/tree/main/presentations

PerconaLive 2018, Santa Clara, USA.

Slack is a messaging platform for teams that brings all types of communication together, creating a single unified archive accessible through powerful search.

MySQL is the primary storage for all our customer data, and we currently execute billions of transactions per hour. As more users join the service and Slack becomes a more critical part of their workflow, the system becomes more complicated to manage. What started as a simple MySQL database was only the starting point for redesigning our entire database infrastructure.

This talk analyzes how our operations team took Vitess, a bleeding-edge, poorly documented open-source project developed by Google, and then hardened, tested, and shaped it for our infrastructure to host all our mission-critical data. This presentation will consider the technical challenges we faced to deploy this project successfully, the key decisions we took, what went well, what didn't, and the course corrections we made along the way.

Attendees can expect to hear details about how we took some whiteboard conversations and turned them into battle-tested, production-caliber systems.

Distributed Teams | Scaling Operations Around the World

https://github.com/guidoiaquinti/guidoiaquinti/tree/main/presentations

DevOpsDays 2017, Tel Aviv, Israel.

The journey of a small operations team in a fast-growing environment is always challenging. This talk describes Slack's solutions to many of our technical and cultural challenges as we scale our technical operations team worldwide.

We analyze key decisions, then dig into what went well and the lessons we learned along the way. Attendees can expect to hear details about issues such as handoff between time zones, partnering with software engineers for significant projects, and dealing with long-running incidents.

Skills

Paradigms

DevOps

Platforms

Linux, Docker, Amazon Web Services (AWS), Azure, Google Cloud Platform (GCP), Kubernetes

Other

CI/CD Pipelines, Site Reliability Engineering (SRE), Monitoring, Prometheus, Vitess, Team Leadership, Cloud Architecture, Cloud

Languages

Python, Go

Tools

Git, Terraform, Grafana

Storage

MySQL, Elasticsearch, Memcached, Redis, PostgreSQL, Databases

Education

2009 - 2012

Bachelor's Degree in Telecommunications Engineering

Università degli Studi di Genova - Genoa, Italy
