Dawid Smoleń, Developer in New York, NY, United States

Dawid Smoleń

Verified Expert in Engineering

Bio

Dawid has delivered more than 30 successful projects in data science and machine learning. He has worked with both classical and deep learning solutions across several industries. Dawid focuses on building systems that follow MLOps best practices and design patterns. With experience across many cloud providers, he can automate the whole ML process, from data gathering to automated deployments and continuous training.

Portfolio

Freelance
Python, Scikit-learn, Docker, Amazon Web Services (AWS)...
Sinch
Kubernetes, Google Kubernetes Engine (GKE), Docker, Helm, GitOps...
Grape Up
Azure, Convolutional Neural Networks (CNNs), Deep Learning, Dataiku, Python...

Experience

Availability

Part-time

Preferred Environment

Ubuntu, Azure, Amazon Web Services (AWS), Google Cloud, Python, Scikit-learn, PyTorch, TensorFlow

The most amazing...

...things I've created are systems that are reliable not only in their data modeling methodology but also in their implementation of the best MLOps design patterns.

Work Experience

ML Consultant | MLOps Engineer

2018 - PRESENT
Freelance
  • Acted as a data science trainer for two training companies and conducted training for around seven teams from various enterprises.
  • Deployed modeling services to Kubernetes clusters, Amazon EKS and Google Kubernetes Engine (GKE).
  • Introduced tracking servers to existing projects to improve model observability and understanding of the underlying problem.
  • Developed an end-to-end solution from data investigation to a deployed model that monitors daily statistics and business metrics regarding user experience in eCommerce.
  • Consulted for an ECG-related company in Latin America, helping design and implement crucial Holter analysis steps.
  • Built NFT market analysis tools based on machine learning valuation of traits.
  • Built a deduplication service for a real-estate website scraper.
Technologies: Python, Scikit-learn, Docker, Amazon Web Services (AWS), Digital Signal Processing, ECG, Training, Data Science, Jupyter Notebook, Artificial Intelligence (AI), Machine Learning, Regression Modeling, Classification Algorithms, Kubeflow, Kubernetes, CI/CD Pipelines, Machine Learning Operations (MLOps), Data Scraping, Data Engineering, Front-end, Data Analysis, Non-fungible Tokens (NFT)

MLOps Engineer

2022 - 2023
Sinch
  • Managed and maintained thousands of models in production, significantly reducing their costs and improving their speed.
  • Added observability tools at multiple levels.
  • Worked with modern technologies, including GitOps and event-based architecture.
  • Migrated massive projects between popular cloud providers.
Technologies: Kubernetes, Google Kubernetes Engine (GKE), Docker, Helm, GitOps, Large-scale Projects, Artificial Intelligence (AI), Data Engineering, Python

Machine Learning Engineer

2020 - 2021
Grape Up
  • Developed an end-to-end deep learning automotive project together with full automation (CI, CD, and CT) and infrastructure.
  • Created POCs and demos in machine learning and data science, along with simple UI demos, using an API-first approach.
  • Contributed to the company's entry into the AI market, working on papers, blog posts, offers, and creating POCs.
  • Worked on machine learning best practices using modern tools and solutions.
Technologies: Azure, Convolutional Neural Networks (CNNs), Deep Learning, Dataiku, Python, Digital Signal Processing, DevOps, Machine Learning Operations (MLOps), REST APIs, React, Deep Neural Networks (DNNs), Metaflow, Data Science, Jupyter Notebook, Artificial Intelligence (AI), Predictive Modeling, Machine Learning, Amazon Web Services (AWS), Data Scraping

Deep Learning Engineer

2017 - 2018
Lekta
  • Created a library for users' intent classification that employs industry best practices to make predictions millions of times a month in a real-time, demanding environment.
  • Developed a novel speech recognition system based on state-of-the-art papers that beat the current market in some areas in terms of accuracy or performance.
  • Researched numerous topics in the areas of speech recognition, voice-based gender recognition, intent classification, sentence representation, and text representation.
  • Developed machine learning algorithms for both voice bots and chatbots.
Technologies: C++, Audio Processing, Digital Signal Processing, Python, PyTorch, TensorFlow, Deep Learning, Speech Recognition, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Scikit-learn, Data Science, Jupyter Notebook, Chatbots, Artificial Intelligence (AI), Predictive Modeling, Amazon Web Services (AWS), Machine Learning, Data Scraping, Data Engineering

Machine Learning Engineer

2016 - 2017
Aspel SA
  • Created a brand new QRS detector tested on many benchmarks and real-world monitoring tests.
  • Developed clustering algorithms that can efficiently cluster long Holter monitor tests, focusing on user experience.
  • Developed embedded resampling algorithms for ECG devices.
  • Contributed to QRS morphology classifiers that greatly improved the work of doctors and met AMA standards.
  • Helped develop user experience-related algorithms that simplify the work of the doctors and technicians.
Technologies: C++, Python, MATLAB, Scikit-learn, SciPy, Artificial Intelligence (AI), Predictive Modeling, Classification Algorithms, Regression Modeling, Data Analysis

NLP Engineer

2015 - 2016
WitKom – Virtual Translator of Sign Communication
  • Developed the first Polish to Polish Sign Language translation system on the language level.
  • Built the first Polish Sign Language to Polish translation system on the language level using Seq2Seq models.
  • Created huge artificial datasets for sign languages based on heuristics, rules, and DL technology.
Technologies: Python, Natural Language Toolkit (NLTK), Deep Learning, Sequence Models, TensorFlow, Data Analysis

Gomrade — Play Go Against AI on a Real, Physical Board

https://github.com/smolendawid/Gomrade
This repository allows you to play Go against a strong AI on a real board. Gomrade analyzes the board state from an image captured by a computer camera and announces the AI's moves using a synthesized voice. The example video of Gomrade in action is under development.

Speech Representation and Exploration Notebook

https://www.kaggle.com/davids1992/speech-representation-and-data-exploration
This is one of the top 15 Kaggle notebooks ever, with more than 100,000 views. I introduced a few basic concepts about speech representation and performed data analysis looking for the most interesting examples from the dataset.

The Simplest Python Cache for Data Scientists

https://github.com/smolendawid/cacha
The simplest Python cache for data scientists.

Contrary to many other tools, cacha boasts the following features:

• It is used at the function call, not the definition. Many packages implement a @cache decorator that has to be placed before the definition of a function, which is less convenient to use.
• It stores the cache on disk, which means you can use the cache between runs. This is convenient in data science work.
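The two features above can be illustrated with a minimal sketch. This is not cacha's actual API or implementation; `cached_call`, `CACHE_DIR`, and `slow_square` are hypothetical names chosen for illustration, and the sketch assumes that the arguments and the result can be pickled.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path

# Hypothetical cache location; any writable directory would do.
CACHE_DIR = Path(tempfile.gettempdir()) / "call_site_cache_demo"


def cached_call(func, *args, **kwargs):
    """Call func(*args, **kwargs), reusing a result stored on disk if present."""
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    # Build the cache key from the function's name and its pickled arguments.
    payload = pickle.dumps(
        (func.__module__, func.__qualname__, args, sorted(kwargs.items()))
    )
    path = CACHE_DIR / (hashlib.sha256(payload).hexdigest() + ".pkl")
    if path.exists():
        # Cache hit: load the stored result instead of recomputing it.
        with path.open("rb") as fh:
            return pickle.load(fh)
    result = func(*args, **kwargs)
    with path.open("wb") as fh:
        pickle.dump(result, fh)
    return result


def slow_square(x):
    # Stand-in for an expensive computation; note that it needs no decorator.
    return x * x


first = cached_call(slow_square, 12)   # computed, then written to disk
second = cached_call(slow_square, 12)  # read back from disk
```

Because the result lives in a file, a later run of the same script can reuse it as long as the arguments pickle to the same bytes. The sketch does not detect changes to the function body, so a stale result would be returned after editing `slow_square` unless the cache directory is cleared.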

Drifting – The Most Flexible Drift Detection Server

https://github.com/sign-ai/drifting
The most flexible Drift Detection framework for everyone! Python-first, API-first, user-friendly, and open-source!

PYTHON-FIRST
Communicate with the Drift Detection server using a super simple Python client. No additional management needed!

EASY INTEGRATIONS
Using drifting is simple thanks to standardized, ML server-based integrations like Kafka, OpenAPI, and gRPC.

FLEXIBLE
One server for managing many models, projects, versions, and features without any further tools.

STATE-OF-THE-ART
An open-source project built upon the top-tier libraries—Alibi Detect, ML server, and more!

2019 - 2021

PhD in Electrical and Electronics Engineering

AGH University of Science and Technology - Cracow, Poland

2011 - 2016

Master's Degree in Acoustical Engineering

AGH University of Science and Technology - Cracow, Poland

AUGUST 2021 - PRESENT

ML Practitioner

Dataiku

AUGUST 2021 - PRESENT

Core Designer

Dataiku

SEPTEMBER 2017 - PRESENT

Machine Learning

Coursera

SEPTEMBER 2017 - PRESENT

Neural Networks and Deep Learning

Coursera

SEPTEMBER 2017 - PRESENT

Structuring Machine Learning Projects

Coursera

SEPTEMBER 2017 - PRESENT

Improving Deep Neural Networks: Hyperparameter Tuning, Regularization and Optimization

Coursera

Libraries/APIs

Scikit-learn, PyTorch, TensorFlow, REST APIs, Node.js, React, OpenCV, Keras, SciPy, Natural Language Toolkit (NLTK)

Tools

Google Kubernetes Engine (GKE), MATLAB, Helm

Languages

Python, C++

Paradigms

Continuous Integration (CI), DevOps

Platforms

Azure, Kubeflow, Jupyter Notebook, Ubuntu, Dataiku, Docker, Amazon Web Services (AWS), Kubernetes

Storage

Google Cloud

Frameworks

Metaflow

Other

Machine Learning Automation, Audio Processing, Natural Language Processing (NLP), Deep Neural Networks (DNNs), Deep Learning, Machine Learning, Sequence Models, Machine Learning Operations (MLOps), ECG, Data Science, Artificial Intelligence (AI), CI/CD Pipelines, Generative Pre-trained Transformers (GPT), Data Scraping, Data Engineering, Data Analysis, Acoustics, Digital Signal Processing, Speech Recognition, Convolutional Neural Networks (CNNs), Training, Chatbots, Predictive Modeling, Regression Modeling, Classification Algorithms, GitOps, Large-scale Projects, Front-end, Non-fungible Tokens (NFT), Acoustical Engineering
