Antonio Osorio, Developer in San Francisco, CA, United States

Antonio Osorio

Verified Expert in Engineering

Engineering Management Developer

Location
San Francisco, CA, United States
Toptal Member Since
November 8, 2016

As a tech leader at Netflix, Amplitude, and Schrodinger, Antonio has driven key projects enhancing data processing, developer tools, and cloud infrastructure. His work includes revolutionizing Netflix's event handling, initiating Kubernetes at Amplitude, and scaling scientific simulations at Schrodinger. Focused on innovation, efficiency, and scalability, Antonio empowers teams to achieve technological excellence.

Portfolio

Netflix
Go, Java, Apache Kafka, Spring Boot, AWS IoT, GraphQL, Productivity, Big Data...
Amplitude
Productivity, Engineering Management, DevOps, AWS IoT, Terraform...
Amplitude
DevOps, Spinnaker, Continuous Integration (CI), Continuous Deployment, Git...

Experience

Availability

Part-time

Preferred Environment

Python, Java, Apache Kafka, Big Data, Continuous Integration (CI), Spring Boot, Go

The most amazing...

...system I've contributed to is Netflix's event telemetry ingestion pipeline. This system handles 12 million events and 100 GB of event data per second.

Work Experience

Senior Software Engineer

2019 - PRESENT
Netflix
  • Contributed to Netflix's event telemetry ingestion pipeline, which processes 12 million events and 100 GB of event data per second.
  • Took charge of migrating the playback-related event telemetry from a legacy service based on synchronous gRPC calls to a new service based on asynchronous Kafka messages.
  • Created the GraphQL back end that exposed developer tooling to our federated graph edge. The new back end powers the "console" as a single pane of glass and entry point for all developer tools at Netflix.
  • Revamped the internal tool used to create new software projects. The tool generates projects that follow the "paved path" best practices, including continuous integration and deployment, observability, logging, and production readiness.
Technologies: Go, Java, Apache Kafka, Spring Boot, AWS IoT, GraphQL, Productivity, Big Data, Stream Processing, Jira, Data Pipelines, Docker, Amazon S3 (AWS S3), Test-driven Development (TDD), GitHub, Amazon EC2

Head of Cloud Engineering

2018 - 2019
Amplitude
  • Led the cloud engineering team. The team's key responsibilities included increasing developer efficiency and looking after the scalability, stability, and security of our production infrastructure.
  • Scaled the team from one to three full-time engineers and two contractors.
  • Refactored Terraform (Infrastructure as Code) code to make it easier for developer teams to use, improve modularity, and reduce the blast radius of unintended changes.
  • Introduced a log aggregation tool (Datadog) and drove developer education and usability, resulting in log aggregation becoming an integral part of our monitoring strategy and incident response procedure.
  • Introduced self-serve developer staging environments, where developers could deploy development versions of our application to test and share before merging into master.
  • Introduced Kubernetes as a container orchestrator; our first production service was launched in March 2019. Drove the operationalization, monitoring, and stability strategy for Kubernetes deployments.
  • Contributed to the ingestion system, which handles around 200,000 events per second, and our query engine, which queries 2 PB of data in under 12 seconds (P95).
  • Introduced the concept of SLOs and drove the adoption of SLAs, SLOs, and error budgets as a common language to balance development velocity with stability.
  • Negotiated contracts with Threat Stack and Datadog, which resulted in a 40% and 30% reduction in estimated costs, respectively, while keeping full functionality.
  • Led the technical aspects of getting SOC2 certified and becoming an AWS Competency Partner.
Technologies: Productivity, Engineering Management, DevOps, AWS IoT, Terraform, Infrastructure as Code (IaC), Datadog, Contract Negotiation, Big Data, Site Reliability Engineering (SRE), Jira, CI/CD Pipelines, Docker, Kubernetes, Amazon S3 (AWS S3), Test-driven Development (TDD), Agile Software Development, GitHub, Amazon EC2
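
The SLO and error-budget language mentioned in the bullets above can be illustrated with a minimal sketch. This is not Amplitude's actual tooling; the function names, the 30-day window, and the 99.9% target are hypothetical values chosen for illustration. It assumes an availability SLO, where the error budget is simply the share of the window the service is allowed to be unavailable.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Return the allowed downtime, in minutes, for an availability SLO.

    slo_target is a fraction such as 0.999 (hypothetical example value).
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be between 0 and 1, exclusive")
    window_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * window_minutes


def budget_remaining(slo_target: float, downtime_minutes: float,
                     window_days: int = 30) -> float:
    """Return the unspent fraction of the error budget.

    A negative result means the budget has been overspent.
    """
    budget = error_budget_minutes(slo_target, window_days)
    return 1.0 - downtime_minutes / budget


if __name__ == "__main__":
    budget = error_budget_minutes(0.999)
    print(f"Error budget: {budget:.1f} minutes per 30 days")
    print(f"Remaining after 10.8 minutes of downtime: "
          f"{budget_remaining(0.999, 10.8):.0%}")
```

Under these assumptions, a team with most of its budget left can ship faster, while a team that has spent its budget would prioritize stability work. That is the trade-off the shared vocabulary is meant to make explicit.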

Senior Software Engineer

2018 - 2018
Amplitude
  • Utilized Spinnaker to improve our delivery strategy and replace custom scripts orchestrating Salt deployments.
  • Introduced Vault for secret management, developed Python and Java clients, and migrated key services.
  • Implemented automated testing for pull requests, so test breakages could be detected before merging code into master.
  • Created a framework for running tests in isolated environments using Docker Compose. Previously, tests were run using external resources, resulting in flaky tests and tests that had to be run serially.
Technologies: DevOps, Spinnaker, Continuous Integration (CI), Continuous Deployment, Git, Jira, CI/CD Pipelines, Docker, Kubernetes, Amazon S3 (AWS S3), Test-driven Development (TDD), Agile Software Development, GitHub, Amazon EC2

Senior Software Engineer

2013 - 2017
Schrodinger, Inc.
  • Led the development of a scalable remote execution server, TaskEngine, used by our flagship LiveDesign and FEP products to run scientific simulations asynchronously (Django and Celery-based).
  • Developed a cloud-agnostic deployment tool, Spinner. Enabled the configuration and deployment of full LiveDesign stacks in less than 10 minutes.
  • Led the development of a data analysis and configuration tool, LD Admin. This tool is used to perform advanced configuration and to get usage statistics of LiveDesign servers.
  • Maintained and supported continuous integration, testing, and deployment infrastructure.
  • Tested, packaged, published, and deployed enterprise Python products.
  • Developed cookbooks to automate deployment tasks using Chef.
Technologies: Amazon Web Services (AWS), Django REST Framework, Jenkins, Chef, MongoDB, PostgreSQL, RabbitMQ, Celery, Django, Python, Jira, CI/CD Pipelines, Docker, Amazon S3 (AWS S3), Test-driven Development (TDD), Agile Software Development, GitHub, Amazon EC2, JavaScript

Office of Technology Transfer - Fellow

2011 - 2013
University of Michigan
  • Reviewed technologies presented for evaluation and determined potential applications and markets.
  • Identified technological and legal challenges for the commercialization of these new technologies.
  • Completed this fellowship as a work-study position during the doctoral program at the University of Michigan Materials Science and Engineering Department.
Technologies: Technology Transfer

Graduate Research Assistant

2008 - 2013
University of Michigan
  • Implemented statistical mechanics-based computational methods to simulate systems far from equilibrium.
  • Designed and developed simulation software for high-performance computers (MPI and OpenMP) and general-purpose graphics processing units (GPU, CUDA).
  • Designed the user interface for our simulation software, HOOMD, using C++ and Python.
Technologies: OpenMP, MPI, NVIDIA CUDA, C++, Python, Amazon S3 (AWS S3)

Storage Support Senior Analyst

2003 - 2008
Dell, Inc.
  • Troubleshot, deployed, and validated advanced EMC CX-series solutions on switched fabrics and AX-series solutions on Fibre Channel and iSCSI configurations.
  • Corrected storage configurations for EMC and PowerVault solutions in SAN and NAS environments.
  • Troubleshot and deployed Windows and Linux-based PowerEdge servers.
  • Resolved network-related issues in PowerConnect switches.
Technologies: Storage Area Networks (SAN), Dell EMC, Technical Support, Networking, Linux

Amplitude's DevOps Journey

https://www.youtube.com/watch?v=ID5Qtk6TTSw
DevOps at Amplitude

This is a talk about the lessons we learned building a healthy DevOps culture, the role of the DevOps team, the tools we lean on, and the challenges ahead. It is a snapshot of the evolution of software development at Amplitude during my tenure as head of cloud engineering.

LiveDesign

https://www.schrodinger.com/livedesign
The Advantages of Real-time Collaborative Design

A project team can generate new ideas far more quickly than it can record and analyze each idea; also, great ideas can happen anytime, not just during regularly scheduled meetings. Every dropped idea is a missed opportunity to find a path forward through the complex maze of drug discovery.

LiveDesign allows every idea to be captured, shared, analyzed, and prioritized—leading to a fuller exploration of the chemical space while facilitating better communication across the different functional groups. Any project team member can enter an idea and instantly get feedback using computational models to help further refine the design—resulting in real-time collaboration across time and location barriers.

My primary responsibility was designing, developing, and maintaining the infrastructure that runs the scientific simulations to inform design decisions.

FEP+ on the Cloud

https://www.schrodinger.com/science-articles/free-energy-methods-fep
Achieving highly potent binding—while maintaining a host of other ligand properties required for safety and biological efficacy—is a primary objective of small molecule drug discovery.

Seeing this unmet need, we embarked on a multiyear research project to develop a new free energy calculation technology (FEP+). Our objective was to provide a rigorous approach for computing binding free energies that offers significant value to industrial drug discovery efforts. We are pleased to report that, after utilizing the FEP+ technology on seven different active drug discovery collaborations over the past year, we now have firm evidence that the free energy approach developed in FEP+ can facilitate better synthesis decisions during lead optimization.

My responsibility was to build the initial cloud proof of concept, which automatically scaled FEP+ simulations in AWS while minimizing infrastructure costs.

Languages

Python, Java, Go, GraphQL, C++, SQL, JavaScript

Frameworks

Spring Boot, Django, Django REST Framework

Paradigms

DevOps, Test-driven Development (TDD), Agile Software Development, Continuous Integration (CI), Continuous Deployment

Platforms

Amazon EC2, Spinnaker, Linux, Amazon, Apache Kafka, Docker, macOS, Amazon Web Services (AWS), NVIDIA CUDA, AWS IoT, Kubernetes

Other

CI/CD Pipelines, Stream Processing, Technology Transfer, Networking, Technical Support, Productivity, Engineering Management, Infrastructure as Code (IaC), Contract Negotiation, Big Data, Site Reliability Engineering (SRE), Materials Science, GPU Computing, Statistical Analysis, Physics, Simulations, Markov Chain Monte Carlo (MCMC) Algorithms, Medical Devices, Electronics, Microprocessors, Physics Simulations

Tools

Chef, Jira, PyCharm, Dell EMC, Amazon Virtual Private Cloud (VPC), Celery, Jenkins, RabbitMQ, Boto, GitHub, Git, Terraform

Storage

Data Pipelines, Storage Area Networks (SAN), MongoDB, PostgreSQL, Amazon S3 (AWS S3), Datadog

Libraries/APIs

MPI, OpenMP, Tastypie

Education

2008 - 2013

Ph.D. in Materials Science and Engineering, Computational Emphasis

University of Michigan - Ann Arbor, Michigan, USA

2005 - 2008

Bachelor of Science in Electrical Engineering

University of Texas - Austin, Texas, USA

Certifications

August 2016 - August 2018

AWS Certified Solutions Architect

Amazon Web Services
