
Dawid Smoleń
Verified Expert in Engineering
Machine Learning Engineer and Software Developer
Kraków, Poland
Toptal member since September 15, 2021
Dawid has successfully delivered 30+ machine learning projects, building scalable systems grounded in MLOps best practices. With deep expertise in cloud-native technologies, he automates the entire ML lifecycle, from data gathering to CI/CD pipelines and continuous training. Recently, Dawid has been applying this expertise to LLM-based solutions at scale, helping clients unlock cutting-edge AI capabilities.
Portfolio
Experience
- Scikit-learn - 10 years
- Machine Learning Automation - 6 years
- PyTorch - 6 years
- Generative Pre-trained Transformers (GPT) - 6 years
- CI/CD Pipelines - 5 years
- Azure - 4 years
- Kubeflow - 2 years
- Agentic AI - 1 year
Preferred Environment
Python, Scikit-learn, PyTorch, Kubernetes, MongoDB, Cloud, OpenAI, Temporal Cloud
The most amazing...
...professional achievement was engineering a high-density model deployment platform that runs over 1,000 ML models in production.
Work Experience
ML Consultant | MLOps Engineer
Freelance
- Deployed modeling services to Kubernetes clusters, Amazon EKS, and Google Kubernetes Engine (GKE).
- Introduced tracking servers to existing projects to improve model observability and understanding of the underlying problem.
- Developed an end-to-end solution from data investigation to a deployed model that monitors daily statistics and business metrics regarding user experience in eCommerce.
- Consulted for an ECG-focused company in Latin America, helping design and implement crucial Holter analysis steps.
- Prepared NFT market analysis tools based on machine learning traits valuation.
- Prepared a deduplication service for a real estate website scraper.
- Acted as a data science trainer for two training companies and conducted training for around seven teams from various enterprises.
MLOps Engineer
Sinch
- Introduced MLOps best practices at Chatlayer, managing and maintaining thousands of models in production while significantly optimizing cost and speed.
- Drove the adoption of AI solutions across multiple departments at Sinch, including automated campaign analytics, anomaly detection in messaging systems (email and SMS), system integration, and the development of unified standards.
- Participated in a few LLM and agentic projects running at a large scale, transforming industries.
- Worked with cutting-edge technologies, including GitOps and event-based architecture, as well as workflow automation using Temporal and Argo Workflows.
- Migrated massive projects between popular cloud providers.
- Created an anomaly detection and prediction system for the mailing industry. It processes 2-3 billion emails daily.
- Enhanced observability by adding tools at multiple levels.
- Implemented a custom high-density model deployment platform (a Kubernetes engine for 2,000 models).
Machine Learning Consultant
Toptal
- Introduced MLOps design patterns, including pipelines, observability tools, monitoring solutions, and deployment to a Kubernetes cluster.
- Built a PoC for an automatic real estate valuation system. Set up the codebase and pipelines, conducted research, and developed a few competitive prototypes.
- Participated in establishing a team by hiring and training members who later took over the project.
Machine Learning Engineer
Grape Up
- Developed an end-to-end deep learning automotive project together with full automation (CI, CD, and CT) and infrastructure. Worked on machine learning best practices using modern tools and solutions.
- Created PoCs and demos in machine learning and data science, together with simple UI demos and an API-first approach.
- Built a VIN recognition system: https://medium.com/grapeup/leveraging-ai-to-improve-vin-recognition-how-to-accelerate-and-automate-operations-in-the-12eac5286b1d
- Wrote a blog post, "Building Intelligent Document Processing Systems": https://grapeup.com/blog/introduction-to-building-intelligent-document-processing-systems/
Deep Learning Engineer
Lekta
- Created a library for users' intent classification that employs industry best practices to make predictions millions of times a month in a real-time, demanding environment.
- Developed a novel speech recognition system based on state-of-the-art papers that outperformed existing commercial solutions in some areas in terms of accuracy or performance.
- Researched numerous topics in the areas of speech recognition, voice-based gender recognition, intent classification, sentence representation, and text representation.
- Developed machine learning algorithms for both voice bots and chatbots.
Machine Learning Engineer
Aspel SA
- Created a brand new QRS detector tested on many benchmarks and real-world monitoring tests.
- Developed clustering algorithms that can efficiently cluster long Holter monitor tests, focusing on user experience.
- Developed embedded resampling algorithms for ECG devices.
- Contributed to QRS morphology classifiers that greatly improved the work of doctors and met AMA standards.
- Helped develop user experience-related algorithms that simplify the work of the doctors and technicians.
NLP Engineer
WitKom – Virtual Translator of Sign Communication
- Developed the first Polish to Polish Sign Language translation system on the language level.
- Built the first Polish Sign Language to Polish translation system on the language level using Seq2Seq models.
- Created huge artificial datasets for sign languages based on heuristics, rules, and DL technology.
Experience
Gomrade — Play Go Against AI on a Real, Physical Board
https://github.com/smolendawid/Gomrade
Speech Representation and Exploration Notebook
https://www.kaggle.com/davids1992/speech-representation-and-data-exploration
The Simplest Python Cache for Data Scientists
https://github.com/smolendawid/cacha
Contrary to many other tools, cacha boasts the following features:
• It is used at the function call, not the definition. Many packages implement a @cache decorator that has to be applied where the function is defined, which is less convenient.
• It stores the cache on disk, which means you can use the cache between runs. This is convenient in data science work.
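The two features above can be illustrated with a minimal sketch of a call-site, disk-backed cache. This is not cacha's actual API or implementation: the `cached_call` helper, the `CACHE_DIR` location, and the pickle-based key scheme are all assumptions made up for illustration.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path

# Hypothetical cache location chosen for this sketch (not cacha's default).
CACHE_DIR = Path(tempfile.gettempdir()) / "call_site_cache_demo"


def cached_call(func, *args, cache_dir=CACHE_DIR, **kwargs):
    """Call func(*args, **kwargs), reusing a result stored on disk if present."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)

    # Build the cache key from the function's name and its pickled arguments.
    # Assumption: the arguments are picklable and pickle deterministically.
    payload = pickle.dumps(
        (func.__module__, func.__qualname__, args, sorted(kwargs.items()))
    )
    path = cache_dir / f"{hashlib.sha256(payload).hexdigest()}.pkl"

    # Cache hit: load the stored result instead of calling the function.
    if path.exists():
        with path.open("rb") as fh:
            return pickle.load(fh)

    # Cache miss: compute, persist to disk, and return.
    result = func(*args, **kwargs)
    with path.open("wb") as fh:
        pickle.dump(result, fh)
    return result
```

With a helper like this, an unmodified function is wrapped only where it is called, for example `cached_call(load_data, "data.csv")` with a hypothetical `load_data`, and the stored result survives across interpreter runs because it lives on disk. The sketch does not invalidate entries when the function body changes.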
Drifting – The Most Flexible Drift Detection Server
https://github.com/sign-ai/drifting
PYTHON-FIRST
Communicate with the Drift Detection server using a super simple Python client. No additional management needed!
EASY INTEGRATIONS
Using drifting is simple thanks to standardized, MLServer-based integrations like Kafka, OpenAPI, and gRPC.
FLEXIBLE
One server for managing many models, projects, versions, and features without any further tools.
STATE-OF-THE-ART
An open-source project built upon top-tier libraries: Alibi Detect, MLServer, and more!
My blogging
https://signai.substack.com/
- https://medium.com/@smolendawid - migration of a few blog posts from my old personal blog
- https://signai.substack.com/ - more up-to-date blog posts.
Education
PhD in Electrical and Electronics Engineering
AGH University of Science and Technology - Cracow, Poland
Master's Degree in Acoustical Engineering
AGH University of Science and Technology - Cracow, Poland
Certifications
ML Practitioner
Dataiku
Core Designer
Dataiku
Machine Learning
Coursera
Neural Networks and Deep Learning
Coursera
Structuring Machine Learning Projects
Coursera
Improving Deep Neural Networks: Hyperparameter Tuning, Regularization and Optimization
Coursera
Skills
Libraries/APIs
Scikit-learn, PyTorch, TensorFlow, REST APIs, Node.js, React, OpenCV, Keras, SciPy, Natural Language Toolkit (NLTK), Pandas
Tools
Observability Tools, Google Kubernetes Engine (GKE), MATLAB, Helm, Amazon SageMaker, Terraform
Languages
Python, C++, SQL
Paradigms
Continuous Integration (CI), DevOps, Anomaly Detection, ETL
Platforms
Azure, Kubeflow, Jupyter Notebook, Amazon Web Services (AWS), Kubernetes, Temporal Cloud, Dataiku, Docker, Cloud Native, Databricks, Amazon EC2
Storage
Google Cloud, Amazon S3 (AWS S3), MongoDB
Frameworks
Metaflow, LangGraph
Other
Machine Learning Automation, Audio Processing, Natural Language Processing (NLP), Deep Neural Networks (DNNs), Deep Learning, Machine Learning, Sequence Models, Machine Learning Operations (MLOps), ECG, Data Science, Artificial Intelligence (AI), CI/CD Pipelines, Generative Pre-trained Transformers (GPT), Large Language Model Operations (LLMOps), APIs, FastAPI, ML Pipelines, Model Deployment, Monitoring, Temporal, Workflows Orchestration, GitOps, Data Scraping, Data Engineering, Data Analysis, Large Language Models (LLMs), Prompt Engineering, OpenAI, Argo Workflows, Infrastructure as Code (IaC), Agentic AI, Acoustics, Digital Signal Processing, Speech Recognition, Convolutional Neural Networks (CNNs), Training, Chatbots, Predictive Modeling, Regression Modeling, Classification Algorithms, Large-scale Projects, Front-end, Non-fungible Tokens (NFT), Acoustical Engineering, Retrieval-augmented Generation (RAG), IT Project Management, Lecturing, LangChain, Vector Databases, Generative Artificial Intelligence (GenAI), AI Agents, Machine Learning Algorithms, Events, PDF Scraping, Data Extraction, DSP, Cloud