
Alan Sammarone
Verified Expert in Engineering
Machine Learning Developer
Amsterdam, Netherlands
Toptal member since March 28, 2017
Alan is an innovative software, research, and machine learning (ML) engineer with over a decade of experience ideating, researching, building, and deploying machine learning applications. He excels in fast-paced startup environments and drives cutting-edge AI/ML solutions from concept to deployment.
Portfolio
Experience
- Python - 12 years
- Linux - 10 years
- PostgreSQL - 10 years
- Machine Learning - 6 years
- PyTorch - 5 years
- Kubernetes - 4 years
- Apache Kafka - 3 years
- Terraform - 2 years
Availability
Preferred Environment
Python
The most amazing...
...thing I've built and deployed was an ML system combining several cutting-edge techniques, such as weak supervision and latent space anchoring.
Work Experience
Lead Machine Learning Engineer
Enza Zaden
- Built a team of ML engineers and data scientists to serve as the core group guiding the company through a data-driven transformation phase.
- Collaborated with the data science, biology, bioinformatics, and robotics teams to improve the ML solution lifecycle management.
- Worked with business stakeholders to define the company's machine learning strategy for the next 3-5 years and build the core infrastructure and tooling used by multiple R&D and operations teams within the company.
Principal Machine Learning Engineer
Nav
- Designed the company-wide software infrastructure for serving various machine learning models at scale with real-time inference on an event-based architecture with Kafka, and led the team implementing it.
- Collaborated with technical and product stakeholders to create and implement a migration from nightly batch jobs to real-time processing with Kafka. This ultimately led to features being available to end users much faster and reduced customer churn.
- Migrated the acquisition IP to the company's infrastructure.
Senior Machine Learning Engineer
Tillful
- Transformed a proof-of-concept into a fully functional, production-ready financial transaction categorization engine, employing natural language processing, time series analysis, and weak supervision techniques.
- Designed and implemented the production-ready machine learning pipeline for a pre-incident model used by one of Europe's largest banks. The pipeline can handle close to 1 TB per run and uses Spark, Kubeflow Pipelines, XGBoost, and Kubernetes.
- Collaborated closely with research scientists, software engineers, architects, and stakeholders to design and implement multiple machine learning solutions aimed at serving machine learning models at scale to Fortune 500 companies.
Senior Developer
Simbiose Ventures
- Created a machine-learning pipeline as well as a REST API to access it, which was able to categorize websites according to their contents.
- Optimized various parts of the company's system, yielding a 5- to 10-fold performance improvement and lower costs.
- Migrated the company's storage architecture to a hybrid of Amazon Glacier and Amazon S3, decreasing storage costs by 30%.
- Conceived and oversaw the creation of a system for extracting product prices from any given URL representing a product listing.
Software Developer
Positivo Informática
- Coded highly optimized, browser-based mathematical and physical simulations to help high school students visualize concepts.
- Wrote an automation tool used to update and deploy thousands of applications efficiently.
- Optimized many legacy JavaScript projects to make them run on low-end tablets.
Junior Developer
Aymará Editora
- Coded and supported a social network aimed at children.
- Wrote JavaScript animations and games for tablets.
- Created a framework to sync information from different databases.
Junior Developer
Totalize Internet Studio
- Created websites for small and medium-sized businesses using a proprietary framework.
- Customized administrative tools and worked closely with engineers focused on securing the platform.
- Communicated with clients to gather requirements and triage bug reports.
Experience
Scalable and Weakly Supervised Bank Transaction Classification
https://arxiv.org/abs/2305.18430
I was involved in the ideation and research phases and led the productionization effort. My work included designing and implementing the real-time inference architecture, integrating the system with Kafka and KSQL for event batching, and deploying scalable models via Kubernetes and Kubeflow Pipelines.
The system uses heuristics and unsupervised embeddings to generate weak labels, which are then used to train GRU-based discriminative models. Our pipeline outperformed the Plaid API on several tasks, achieving over 90% accuracy across nine financial categories. The architecture was optimized for rapidly onboarding new classification tasks with minimal manual effort.
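To illustrate the weak-labeling idea in general terms, here is a minimal Python sketch. It is not the production system or the method from the paper: the category names, keywords, the `keyword_lf` helper, and the majority-vote rule are all hypothetical choices made for illustration. It only shows how a few heuristics can vote on a label for a transaction description and abstain when they disagree.

```python
from collections import Counter
from typing import Callable, Optional, Sequence

ABSTAIN = None  # a heuristic returns this when it has no opinion

LabelingFunction = Callable[[str], Optional[str]]


def keyword_lf(label: str, keywords: Sequence[str]) -> LabelingFunction:
    """Build a heuristic that votes for `label` if any keyword appears."""
    def lf(description: str) -> Optional[str]:
        text = description.lower()
        return label if any(k in text for k in keywords) else ABSTAIN
    return lf


# Hypothetical heuristics; real ones would be task-specific.
LABELING_FUNCTIONS = [
    keyword_lf("groceries", ("supermarket", "grocery")),
    keyword_lf("transport", ("taxi", "metro", "fuel")),
    keyword_lf("groceries", ("market",)),
]


def weak_label(description: str,
               lfs: Sequence[LabelingFunction]) -> Optional[str]:
    """Majority vote over non-abstaining heuristics; None on no vote or a tie."""
    votes = [v for lf in lfs if (v := lf(description)) is not ABSTAIN]
    if not votes:
        return None
    (top, top_count), *rest = Counter(votes).most_common()
    if rest and rest[0][1] == top_count:
        return None  # heuristics disagree evenly, so leave unlabeled
    return top


if __name__ == "__main__":
    for desc in ("ACME SUPERMARKET 0423", "CITY TAXI MARKET ST", "ATM WITHDRAWAL"):
        print(desc, "->", weak_label(desc, LABELING_FUNCTIONS))
```

Descriptions that receive a weak label could then serve as noisy training data for a downstream discriminative model, while unlabeled ones are simply skipped.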
Voice-controlled customer service agent
Education
Master's Degree in Theoretical Physics
University of Amsterdam - Amsterdam
Master's Degree in Computational Science
University of Amsterdam - Amsterdam
Bachelor's Degree in Physics
Universität Leipzig - Leipzig, Germany
Skills
Libraries/APIs
SQLAlchemy, Pandas, Flask API, PyTorch, NumPy, SciPy, JAX, React, ArcGIS
Tools
Terraform, Apache Airflow, Celery, RabbitMQ, Helm, Git
Languages
Python, SQL, Python 3, JavaScript, PHP, C
Frameworks
Flask, Django, Streamlit
Paradigms
Real-time Systems, DevOps, Event-driven Design (EDD)
Storage
PostgreSQL, MySQL, Redis, Elasticsearch, Aerospike, Google Cloud
Platforms
Amazon Web Services (AWS), Kubernetes, Apache Kafka, Docker, Raspberry Pi, Twilio, Google Cloud Platform (GCP), Amazon, Linux, Azure
Industry Expertise
Trading Systems
Other
Machine Learning, APIs, Large Language Models (LLMs), Web Development, Artificial Intelligence (AI), Machine Learning Operations (MLOps), CI/CD Pipelines, Data Scraping, Algorithms, Web Scraping, Data Science, FastAPI, API Integration, Cython, OpenAI, Data Visualization, Data Analysis, Technical Leadership, Software Architecture, AI Agents, Azure Blob Storage, Distributed Systems, Cloud Architecture, Retrieval-augmented Generation (RAG), LangChain, Prompt Engineering, Statistical Modeling, Voice AI, DSP, Mathematical Analysis, Apache Cassandra, Neural Networks, Physics, Computer Vision, Spectroscopy, Chrome Extensions, Trading, Leadership, Ideation, Delivery, Biology, Genomics