Michał Bieroński, Developer in Kraków, Poland

Michał Bieroński

Verified Expert in Engineering

Machine Learning Engineer and Software Developer

Location
Kraków, Poland
Toptal Member Since
January 21, 2022

Michał has almost nine years of professional experience in data science, machine learning, and software development. With a computer science background, he is equally at home in data scientist and machine learning engineer roles. Michał has tackled problems in conversational AI, NLP, computer vision, time series forecasting, social media analysis, learning from graph-structured data, supply chain analysis, data visualization, and production deployment.

Portfolio

Skillz
Snowflake, MLflow, Apache Airflow, Apache Spark, Amazon Web Services (AWS)...
BP
Python, Scikit-learn, Azure, Azure Machine Learning, Azure Data Factory...
Infosys
BERT, Streamlit, Dash, Azure, Scikit-learn, GitLab, GitLab CI/CD...

Experience

Availability

Part-time

Preferred Environment

Linux, PyCharm, Git, Zsh

The most amazing...

...thing I've developed is the Matchmaking Simulator, an AI-powered engine that led to a new player D30 retention lift of 33% and a D30 revenue lift of 11%.

Work Experience

Senior Data Scientist

2021 - PRESENT
Skillz
  • Built the Matchmaking Simulator engine by combining multiple behavioral ML models into an AI player model and replicating the production matchmaking engine in Python to allow for fast experimentation. The engine is still in use.
  • Improved the matchmaking engine and developed new algorithms using the simulator, leading to a D30 retention lift of 33% and a D30 revenue lift of 11%. Led inter-team efforts to deploy, monitor impact, and scale developed features.
  • Built a conversion ML model and integrated it as a component of the Matchmaking Simulator, allowing for experimentation on how the company can drive the conversion rate.
  • Operationalized the churn prediction model using AWS, Docker, Kubernetes, GitHub Actions, MLflow, Airflow, and Snowflake.
  • Researched industry-standard solutions, built a simulation framework, selected the most viable solution, and implemented production-ready code, allowing for efficient rating updates for multiplayer games.
  • Owned matchmaking, offering advice and answering questions of leadership and business stakeholders. Led experimentation and analytics matchmaking efforts. Cooperated with engineering, analytics, and product organizations.
  • Planned and supervised the work of a junior team member.
Technologies: Snowflake, MLflow, Apache Airflow, Apache Spark, Amazon Web Services (AWS), Java, Groovy, Python, Scikit-learn, XGBoost, Streamlit, Amazon S3 (AWS S3), Amazon EC2, Amazon Elastic Container Service (Amazon ECS), Amazon SageMaker, Algorithms, Machine Learning, Data Science, Artificial Intelligence (AI), SQL, NumPy, Pandas, DVC, Pytest, Continuous Integration (CI)
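The rating-update work above mentions researching industry-standard solutions for multiplayer games without naming the chosen algorithm. As a minimal illustrative sketch (not the actual production algorithm), an Elo-style update is a common baseline for multiplayer rating systems; the K-factor and function names here are assumptions:

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that player A beats player B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def elo_update(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return updated (rating_a, rating_b) after a match.

    score_a is 1.0 for a win, 0.5 for a draw, 0.0 for a loss.
    """
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b
```

Rating points are conserved between the two players, which makes the update cheap to run inside a simulation loop.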

Senior Machine Learning Engineer

2020 - 2021
BP
  • Designed and built the cloud architecture for the data analysis platform for development and production in the Azure cloud.
  • Used platforms and tools like ADF, Azure Data Lake, Azure Databricks, Azure DevOps, Azure Key Vault, Delta Lake, MLflow, Azure Machine Learning, Azure Kubernetes Service, Azure Synapse Analytics, and Power BI.
  • Preprocessed and cleaned the training data and identified outliers for further examination with the business, using techniques such as DBSCAN and Isolation Forest.
  • Built a DNN regression model for forecasting the cost of oil drilling-related activities based on historical data, using PyTorch and PyTorch Lightning.
  • Conducted experiments and statistical tests like analysis of variance (ANOVA) and applied AI solutions.
  • Managed cloud automation of the labor and overhead involved in the cost forecasting process, previously done manually in Excel sheets by analysts.
  • Administered the Databricks cloud data analysis platform. Built and optimized existing ETL pipelines using PySpark. Introduced production monitoring with Azure Application Insights into the project.
  • Maintained and developed the time series forecasting library. Managed the production submission of the quarterly cost forecasts, cutting the running time of the quarterly forecast 14-fold by using multiprocessing.
  • Built and deployed the web API using FastAPI, Docker, and Azure Web App to implement time series models to make them accessible for non-data scientists within the company. Built the Power BI report showcasing the usage of the API.
  • Created the auto-deployment pipeline for the newest version of the internal time series forecasting library into the Spark cluster. Built the CI pipeline on Azure DevOps, running code quality checks and unit tests with training for the team.
Technologies: Python, Scikit-learn, Azure, Azure Machine Learning, Azure Data Factory, Azure Data Lake, Databricks, Azure Key Vault, Delta Lake, MLflow, Azure Kubernetes Service (AKS), Azure Container Instances, Azure Functions, Azure Synapse, Microsoft Power BI, PySpark, Spark, Machine Learning Operations (MLOps), Azure DevOps, ETL, Web Applications, FastAPI, Time Series Analysis, Azure Application Insights, Docker, Azure Container Registry, PyTorch, DNN, Pytest, Plotly, Statistics, SQL Server 2017, SQL, Apache Spark, Containerization, DevOps, NumPy, Pandas, Neural Networks, Time Series, Algorithms, Machine Learning, Deep Learning
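The outlier-identification bullet above names Isolation Forest as one of the techniques used. A minimal scikit-learn sketch of that side of the workflow (the function name, contamination rate, and synthetic data are illustrative assumptions, not the project's actual setup):

```python
import numpy as np
from sklearn.ensemble import IsolationForest


def flag_outliers(X: np.ndarray, contamination: float = 0.05) -> np.ndarray:
    """Return a boolean mask marking rows flagged as outliers."""
    model = IsolationForest(contamination=contamination, random_state=0)
    # fit_predict returns -1 for outliers, 1 for inliers
    return model.fit_predict(X) == -1
```

The flagged rows would then be reviewed with business stakeholders rather than dropped automatically.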

Senior Data Scientist

2020 - 2021
Infosys
  • Developed and deployed a business intelligence dashboard, built with Dash, for an FMCG industry recommendation system, helping marketing teams better understand and target their customers.
  • Managed a banking supply chain analysis: given internal bank transaction data, developed a solution for analyzing the impact of the default of some business entities on other businesses.
  • Led a PoC project for an academic institution that turned into a long-term engagement, aiming to answer natural-language questions using a knowledge graph. The solution used custom-built intent classification and named entity recognition (NER) models.
  • Spearheaded an unstructured data insights project. Its goal was to extract insights from unstructured raw text data using techniques like topic modeling (LDA) and sentiment analysis (BERT).
  • Conducted interviews for positions in the data science industry, such as data analysts, data scientists, and data engineers.
Technologies: BERT, Streamlit, Dash, Azure, Scikit-learn, GitLab, GitLab CI/CD, Microsoft PowerPoint, SpaCy, NetworkX, Docker, Docker Compose, Code Review, Plotly, Data Science, Artificial Intelligence (AI), DevOps, Containerization, Machine Learning, Deep Learning, PyTorch, SQL, NoSQL

Machine Learning Engineer

2019 - 2020
IamBot
  • Researched and developed state-of-the-art NLP, conversational engine, and image representation solutions for product recommendation in chatbots.
  • Ran distributed GPU training of deep learning models on high-volume data.
  • Handled full-stack development and maintenance of the core product: the back end in the Play Framework (Scala) and Spring (Java), the front end in React with TypeScript, and machine learning models deployed as gRPC microservices in Python and Scala.
  • Led a sub-project for an external startup called Cypherdog. Built core application features like encryption, private key backup, and chat with Java and gRPC.
Technologies: Python, gRPC, Docker, Kubernetes, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), GPT, Chatbots, Scala, Java, Spring, Play Framework, React, TypeScript, NumPy, Pandas, Machine Learning, Deep Learning, Amazon Web Services (AWS), NoSQL, PyTorch
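The product-recommendation bullet above rests on learned image and text representations. A common way to serve such representations is nearest-neighbor lookup by cosine similarity; the following NumPy sketch is an illustrative assumption, not the product's actual engine:

```python
import numpy as np


def recommend(query_vec: np.ndarray, product_vecs: np.ndarray, k: int = 3):
    """Return indices of the k products whose embeddings are most
    cosine-similar to the query embedding."""
    q = query_vec / np.linalg.norm(query_vec)
    P = product_vecs / np.linalg.norm(product_vecs, axis=1, keepdims=True)
    sims = P @ q                      # cosine similarity per product
    return np.argsort(sims)[::-1][:k]  # best matches first
```

A production version would typically swap the brute-force dot product for an approximate nearest-neighbor index once the catalog grows.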

Software Developer

2016 - 2018
Nokia
  • Maintained and developed various internal projects with project-dependent tech stacks such as Python, Django, Angular, Vue.js, and Backbone.js, along with comprehensive server, client, and e2e testing.
  • Introduced Docker into the team and developed a method for e2e testing against a dockerized environment instead of mocking the server side.
  • Managed server-side administration: Nginx, repository hooks, build automation and CI, Docker Registry, Sentry, and Celery jobs.
Technologies: Python, Django, Django REST Framework, Angular, Vue, Backbone.js, E2E Testing, Docker, Jenkins, CI/CD Pipelines, Sentry, PostgreSQL, Celery, Bash, Linux, JavaScript, TypeScript, Test-driven Development (TDD), Scrum, DevOps, Containerization

Java Summer Trainee

2016 - 2016
Nokia
  • Developed a CI plugin automating the linking of C++ compilation errors, via the version control system, to the person responsible for breaking the code.
  • Deployed the plugin on Jenkins CI/CD and handled its maintenance.
  • Developed a custom IDE, based on the IntelliJ platform, for the TTCN-3 programming language.
Technologies: Java, Jenkins, Bash, IntelliJ IDEA
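The CI plugin above was written in Java, but its core idea, parsing a compiler error location and asking the version control system who last touched that line, can be sketched compactly. This Python sketch is an assumption about the mechanism; the regex, function names, and `git blame` usage are illustrative, and the blame call requires a real git checkout:

```python
import re
import subprocess

# gcc/clang-style diagnostic: "path/to/file.cpp:42:10: error: ..."
ERROR_RE = re.compile(r"^(?P<file>[^:]+):(?P<line>\d+):(?:\d+:)?\s*error:")


def parse_compile_error(line: str):
    """Extract (file, line_number) from a compiler error line, or None."""
    m = ERROR_RE.match(line)
    if not m:
        return None
    return m.group("file"), int(m.group("line"))


def blame_author(path: str, line_no: int) -> str:
    """Ask git who last touched the offending line (needs a git checkout)."""
    out = subprocess.check_output(
        ["git", "blame", "-L", f"{line_no},{line_no}", "--porcelain", path],
        text=True,
    )
    for ln in out.splitlines():
        if ln.startswith("author "):
            return ln[len("author "):]
    return "unknown"
```

The CI job would then notify the returned author directly instead of broadcasting build failures to the whole team.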

Meme Learning

https://gitlab.com/meme2vec
A meme recommendation system based on the proposed Meme2vec method: meme vectorization built from social, visual, and textual embeddings, along with a text-based meme classification model.

The model was further used to create a Slack and Discord bot that chooses the best matching meme template for a given text, then creates a meme and sends it back to the user. The project won an internal university poster session for data science projects and a poster session at the Polish Alliance for the Development of Artificial Intelligence (PP-RAI) conference.
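Meme2vec builds one meme vector from social, visual, and textual embeddings. A minimal sketch of the fusion step, assuming simple concatenation of per-modality L2-normalized vectors (the actual combination method is not detailed here):

```python
import numpy as np


def meme2vec(social, visual, textual) -> np.ndarray:
    """Concatenate L2-normalized modality embeddings into one meme vector.

    Normalizing each modality first keeps any single embedding from
    dominating downstream similarity comparisons.
    """
    parts = []
    for v in (social, visual, textual):
        v = np.asarray(v, dtype=float)
        parts.append(v / np.linalg.norm(v))
    return np.concatenate(parts)
```

The resulting vectors can be compared with cosine similarity to pick the best-matching meme template for a given text.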

License Plate Recognition

https://gitlab.com/bieruskate/license-plate-recognition
The project combined classic image processing methods for license plate and character segmentation with a custom deep learning module for character classification, trained on a self-built dataset.

Its goal was to detect and recognize license plate characters in different lighting conditions with comprehensive evaluation and comparison to other available solutions.
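One classic character-segmentation method consistent with the project description is the vertical projection profile: sum ink pixels per column of the binarized plate and split on empty columns. This NumPy sketch is an illustrative assumption, not the project's exact pipeline:

```python
import numpy as np


def segment_characters(binary: np.ndarray):
    """Split a binarized plate image (1 = ink, 0 = background) into
    character column spans using a vertical projection profile."""
    profile = binary.sum(axis=0)   # ink pixels per column
    inked = profile > 0
    spans, start = [], None
    for x, on in enumerate(inked):
        if on and start is None:
            start = x              # a character begins
        elif not on and start is not None:
            spans.append((start, x))  # a character ends
            start = None
    if start is not None:
        spans.append((start, len(inked)))
    return spans
```

Each returned span can then be cropped and fed to the character classifier; under uneven lighting, the binarization step (e.g., adaptive thresholding) does most of the heavy lifting before this projection.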

Languages

Python, Scala, Java, TypeScript, Bash, TTCN, JavaScript, SQL, Snowflake, Groovy

Libraries/APIs

Pandas, PySpark, PyTorch, NumPy, Beautiful Soup, React, Vue, Backbone.js, Scikit-learn, SpaCy, NetworkX, Keras, TensorFlow, OpenCV, Matplotlib, XGBoost

Paradigms

Data Science, DevOps, Azure DevOps, E2E Testing, ETL, Test-driven Development (TDD), Scrum, Continuous Integration (CI)

Other

Machine Learning, Artificial Intelligence (AI), Machine Learning Operations (MLOps), Natural Language Processing (NLP), GPT, Generative Pre-trained Transformers (GPT), Chatbots, MLflow, CI/CD Pipelines, BERT, Dash, Code Review, Azure Data Factory, Azure Data Lake, Delta Lake, Azure Container Instances, FastAPI, Time Series Analysis, Azure Container Registry, Statistics, Computer Networking, Data Scraping, Scraping, Web Scraping, Containerization, Neural Networks, Time Series, Algorithms, Deep Learning, Web Applications, DVC

Frameworks

Apache Spark, Spark, gRPC, Spring, Play Framework, Django, Django REST Framework, Angular, Streamlit, Scrapy

Tools

PyCharm, Git, Zsh, Docker Compose, Azure Machine Learning, Jenkins, IntelliJ IDEA, Sentry, Celery, GitLab, GitLab CI/CD, Microsoft PowerPoint, Azure Key Vault, Azure Kubernetes Service (AKS), Microsoft Power BI, Azure Application Insights, Pytest, Seaborn, Plotly, Apache Airflow, Amazon Elastic Container Service (Amazon ECS), Amazon SageMaker

Platforms

Linux, Docker, Databricks, Azure, Kubernetes, Azure Functions, Azure Synapse, DNN, Amazon Web Services (AWS), Amazon EC2

Storage

PostgreSQL, SQL Server 2017, NoSQL, Amazon S3 (AWS S3)

2018 - 2019

Master's Degree in Data Science

Wrocław University of Science and Technology - Wrocław, Poland

2014 - 2018

Bachelor's Degree in Computer Science

Wrocław University of Science and Technology - Wrocław, Poland

NOVEMBER 2021 - PRESENT

Azure Data Scientist Associate | Microsoft Certified

Microsoft

APRIL 2021 - PRESENT

Azure Fundamentals | Microsoft Certified

Microsoft

APRIL 2016 - PRESENT

CCNA Routing and Switching | Introduction to Networks

Cisco
