Alan Sammarone, Developer in Amsterdam, Netherlands

Alan Sammarone

Machine Learning Developer

Amsterdam, Netherlands

Toptal member since March 28, 2017

Bio

Alan is an innovative software, research, and machine learning (ML) engineer with over a decade of experience conceiving, researching, building, and deploying machine learning applications. He excels in fast-paced startup environments and drives cutting-edge AI/ML solutions from concept to deployment.

Portfolio

Enza Zaden
Python, Azure, PyTorch, JAX, Computer Vision, Spectroscopy, Terraform, Linux...
Nav
Python, Kubernetes, JAX, PyTorch, Apache Kafka, Linux, Machine Learning...
Tillful
Python, PyTorch, Kubernetes, Machine Learning, PostgreSQL, Apache Kafka, Linux...

Experience

  • Python - 12 years
  • Linux - 10 years
  • PostgreSQL - 10 years
  • Machine Learning - 6 years
  • PyTorch - 5 years
  • Kubernetes - 4 years
  • Apache Kafka - 3 years
  • Terraform - 2 years

Preferred Environment

Python

The most amazing...

...thing I've built and deployed was an ML system combining several cutting-edge techniques, such as weak supervision and latent space anchoring.

Work Experience

Lead AI Tech Lead and Architect

2024 - PRESENT
Enza Zaden
  • Built a team of ML engineers and data scientists to serve as the core group guiding the company through a data-driven transformation phase.
  • Collaborated with the data science, biology, bioinformatics, and robotics teams to improve the ML solution lifecycle management.
  • Worked with business stakeholders to define the company's machine learning strategy for the next 3-5 years and build the core infrastructure and tooling used by multiple R&D and operations teams within the company.
Technologies: Python, Azure, PyTorch, JAX, Computer Vision, Spectroscopy, Terraform, Linux, Machine Learning, Git, Mathematical Analysis, Neural Networks, NumPy, Docker, FastAPI, Streamlit, Python 3, Large Language Models (LLMs), OpenAI, Technical Leadership, Software Architecture, Web Development, AI Agents, Azure Blob Storage, Artificial Intelligence (AI), Cloud Architecture, Biology, Retrieval-augmented Generation (RAG), LangChain, Machine Learning Operations (MLOps), Genomics, Real-time Systems, Statistical Modeling, Google Cloud Platform (GCP), CI/CD Pipelines, Helm, DevOps, Grafana, Cloud Monitoring, Test-driven Development (TDD), SciPy, Pytest, AI Automation

Principal AI + Full Stack Engineer

2023 - 2024
Nav
  • Designed the company-wide software infrastructure for serving machine learning models at scale with real-time inference, built on an event-based architecture with Kafka, and led the team implementing it.
  • Collaborated with technical and product stakeholders to create and implement a migration from nightly batch jobs to real-time processing with Kafka. This ultimately led to features being available to end users much faster and reduced customer churn.
  • Migrated the acquisition IP to the company's infrastructure.
Technologies: Python, Kubernetes, JAX, PyTorch, Apache Kafka, Linux, Machine Learning, Amazon Web Services (AWS), Git, Neural Networks, PostgreSQL, NumPy, Docker, APIs, Algorithms, Data Science, FastAPI, Flask, SQLAlchemy, SQL, Python 3, Large Language Models (LLMs), Data Visualization, Data Analysis, Technical Leadership, Software Architecture, Web Development, Artificial Intelligence (AI), Distributed Systems, Redis, Event-driven Design (EDD), Cloud Architecture, Google Cloud, Retrieval-augmented Generation (RAG), Machine Learning Operations (MLOps), Real-time Systems, Statistical Modeling, CI/CD Pipelines, Helm, DevOps, Grafana, Cloud Monitoring, Test-driven Development (TDD), SciPy, Pytest, GitLab CI/CD, Cloud Run

Senior ML Full Stack Engineer

2018 - 2023
Tillful
  • Transformed a proof-of-concept into a fully functional, production-ready financial transaction categorization engine, employing natural language processing, time series analysis, and weak supervision techniques.
  • Designed and implemented the production-ready machine learning pipeline for a pre-incident model used by one of Europe's largest banks. The pipeline handles close to 1 TB per run and uses Spark, Kubeflow Pipelines, XGBoost, and Kubernetes.
  • Collaborated closely with research scientists, software engineers, architects, and stakeholders to design and implement multiple machine learning solutions aimed at serving machine learning models at scale to Fortune 500 companies.
Technologies: Python, PyTorch, Kubernetes, Machine Learning, PostgreSQL, Apache Kafka, Linux, Amazon Web Services (AWS), Git, Mathematical Analysis, Neural Networks, NumPy, Docker, Trading Systems, Trading, APIs, Algorithms, Data Science, Flask, SQLAlchemy, Pandas, SQL, Twilio, Python 3, Large Language Models (LLMs), Data Visualization, Data Analysis, Software Architecture, Web Development, Celery, RabbitMQ, Distributed Systems, Event-driven Design (EDD), Cloud Architecture, Machine Learning Operations (MLOps), Real-time Systems, CI/CD Pipelines, DevOps, Grafana, Cloud Monitoring, Test-driven Development (TDD), SciPy, Pytest, GitLab CI/CD

Senior Developer

2014 - 2017
Simbiose Ventures
  • Created a machine learning pipeline that categorized websites by their content, along with a REST API to access it.
  • Optimized various parts of the company's system, yielding a 5- to 10-fold performance improvement and lower costs.
  • Migrated the company's storage architecture to a hybrid of Amazon Glacier and Amazon S3, decreasing storage costs by 30%.
  • Conceived and oversaw the creation of a system for extracting product prices from any given URL representing a product listing.
Technologies: Amazon Web Services (AWS), Amazon, C, Aerospike, Elasticsearch, Python, Linux, Git, Mathematical Analysis, PostgreSQL, JavaScript, APIs, MySQL, Data Scraping, Algorithms, Web Scraping, SQLAlchemy, Pandas, API Integration, Django, SQL, React, Python 3, Cython, Web Development, Distributed Systems, Real-time Systems, Test-driven Development (TDD), Pytest

Software Developer

2012 - 2014
Positivo Informática
  • Coded highly optimized, browser-based mathematical and physical simulations aimed at helping high school students visualize concepts.
  • Wrote an automation tool used to update and deploy thousands of applications efficiently.
  • Optimized many legacy JavaScript projects to make them run on low-end tablets.
Technologies: Python, Git, Linux, JavaScript, MySQL, SQL, Python 3, Web Development

Junior Developer

2010 - 2012
Aymará Editora
  • Coded and supported a social network aimed at children.
  • Wrote JavaScript animations and games for tablets.
  • Created a framework to sync information from different databases.
Technologies: PHP, Git, Linux, SQL

Junior Developer

2009 - 2010
Totalize Internet Studio
  • Created websites for small and medium-sized businesses using a proprietary framework.
  • Customized administrative tools and worked closely with engineers focused on securing the platform.
  • Communicated with clients to gather requirements and triage bug reports.
Technologies: PHP, Git, Linux

Experience

Scalable and Weakly Supervised Bank Transaction Classification

https://arxiv.org/abs/2305.18430
We developed an end-to-end system for bank transaction classification using weak supervision and deep learning.

I was involved in the ideation and research phases and led the productionization effort. My work included designing and implementing the real-time inference architecture, integrating the system with Kafka and KSQL for event batching, and deploying scalable models via Kubernetes and Kubeflow Pipelines.

The system uses heuristics and unsupervised embeddings to generate weak labels, which are then used to train GRU-based discriminative models. Our pipeline outperformed the Plaid API on several tasks, achieving over 90% accuracy across nine financial categories. The architecture was optimized for rapidly onboarding new classification tasks with minimal manual effort.
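
As a rough illustration of the heuristic side of weak labeling described above, the sketch below combines a few keyword-based labeling functions by majority vote. The labeling functions, category names, and the rule of abstaining on ties are assumptions made for this example; they are not the rules from the paper or the production system, and the embedding-based labels and GRU training are left out.

```python
from collections import Counter
from typing import Callable, List, Optional

ABSTAIN = None  # a labeling function returns None when it has no opinion


# Hypothetical keyword heuristics, invented for illustration only.
def lf_groceries(text: str) -> Optional[str]:
    return "groceries" if "supermarket" in text.lower() else ABSTAIN


def lf_transport(text: str) -> Optional[str]:
    lowered = text.lower()
    return "transport" if "uber" in lowered or "metro" in lowered else ABSTAIN


def lf_food_delivery(text: str) -> Optional[str]:
    return "restaurants" if "uber eats" in text.lower() else ABSTAIN


LABELING_FUNCTIONS: List[Callable[[str], Optional[str]]] = [
    lf_groceries,
    lf_transport,
    lf_food_delivery,
]


def weak_label(text: str, lfs=LABELING_FUNCTIONS) -> Optional[str]:
    """Return the majority label among non-abstaining heuristics, or None."""
    votes = [label for lf in lfs if (label := lf(text)) is not ABSTAIN]
    if not votes:
        return None  # no heuristic fired, so the transaction stays unlabeled
    ranked = Counter(votes).most_common()
    # Assumption: conflicting heuristics with equal support yield no label.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


if __name__ == "__main__":
    for description in ["CITY SUPERMARKET 123", "METRO TICKET", "UBER EATS AMSTERDAM"]:
        print(description, "->", weak_label(description))
```

In a setup like this, transactions that receive a weak label would form the training set for the downstream discriminative model, while unlabeled or conflicting ones are simply skipped.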

Voice-controlled Customer Service Agent

The project is a Python back end for an MVP that delivers an interactive voice interface capable of handling end-to-end customer support tasks without human intervention. It leverages three OpenAI capabilities in sequence: gpt-4o-transcribe to convert speech to text, gpt-4o (with tool-calling support) to process the input and generate a response, and gpt-4o-mini-tts to convert the response back to speech.
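
The three-stage sequence described above can be sketched as a small pipeline whose stages are injected as plain callables. The class name, method name, and stub stages here are hypothetical; in a real back end each callable would wrap the corresponding OpenAI model call (gpt-4o-transcribe, gpt-4o with tool calling, gpt-4o-mini-tts), which this sketch deliberately does not attempt to reproduce.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceAgent:
    """Chains speech-to-text, response generation, and text-to-speech."""

    transcribe: Callable[[bytes], str]  # would wrap gpt-4o-transcribe
    respond: Callable[[str], str]       # would wrap gpt-4o with tool calling
    synthesize: Callable[[str], bytes]  # would wrap gpt-4o-mini-tts

    def handle_turn(self, audio_in: bytes) -> bytes:
        text = self.transcribe(audio_in)  # 1. speech to text
        reply = self.respond(text)        # 2. generate the reply
        return self.synthesize(reply)     # 3. text back to speech


# Stub stages so the flow can be exercised without network access.
agent = VoiceAgent(
    transcribe=lambda audio: audio.decode("utf-8"),
    respond=lambda text: f"You said: {text}",
    synthesize=lambda reply: reply.encode("utf-8"),
)

if __name__ == "__main__":
    print(agent.handle_turn(b"hello"))
```

Keeping the stages as injected callables makes each one replaceable and lets the turn-handling logic be tested in isolation from the speech and language models.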

Education

2022 - 2024

Master's Degree in Theoretical Physics

University of Amsterdam - Amsterdam

2020 - 2022

Master's Degree in Computational Science

University of Amsterdam - Amsterdam

2016 - 2019

Bachelor's Degree in Physics

Universität Leipzig - Leipzig, Germany

Skills

Libraries/APIs

NumPy, SciPy, SQLAlchemy, Pandas, Flask API, PyTorch, JAX, React, ArcGIS

Tools

Pytest, Terraform, Apache Airflow, Celery, RabbitMQ, Helm, Grafana, GitLab CI/CD, Git

Languages

Python, SQL, Python 3, JavaScript, PHP, C

Frameworks

Flask, Django, Streamlit

Paradigms

Real-time Systems, DevOps, Test-driven Development (TDD), Event-driven Design (EDD)

Storage

PostgreSQL, MySQL, Redis, Elasticsearch, Aerospike, Google Cloud

Platforms

Amazon Web Services (AWS), Kubernetes, Apache Kafka, Docker, Raspberry Pi, Twilio, Google Cloud Platform (GCP), Cloud Run, Amazon, Linux, Azure

Industry Expertise

Trading Systems

Other

Machine Learning, APIs, Large Language Models (LLMs), Web Development, Artificial Intelligence (AI), Machine Learning Operations (MLOps), CI/CD Pipelines, Data Scraping, Algorithms, Web Scraping, Data Science, FastAPI, API Integration, Cython, OpenAI, Data Visualization, Data Analysis, Technical Leadership, Software Architecture, AI Agents, Azure Blob Storage, Distributed Systems, Cloud Architecture, Retrieval-augmented Generation (RAG), LangChain, Prompt Engineering, Statistical Modeling, Voice AI, DSP, Cloud Monitoring, AI Automation, Mathematical Analysis, Apache Cassandra, Neural Networks, Physics, Computer Vision, Spectroscopy, Chrome Extensions, Trading, Leadership, Ideation, Delivery, Biology, Genomics
