Przemysław Przybyszewski, Developer in Warsaw, Poland

Przemysław Przybyszewski

Verified Expert in Engineering

Bio

Przemysław has a Ph.D. in economics and a master's degree in data science. He enjoys developing AI projects and has an exceptional understanding of how to use data to generate profitable solutions to some of the industry's toughest problems. He co-authored a research paper presented at the MICCAI conference (the premier international conference in information processing, machine learning, and computational modeling in medical image computing and computer-assisted interventions) and developed an anti-fraud user behavior anomaly detection algorithm.

Portfolio

Cherrypick Games
Data Engineering, Google Cloud Platform (GCP), Deep Learning, Data Science...
BJ's Wholesale Club - Marketing/Analytics
SQL, Data Engineering, Amazon Web Services (AWS)...
SmarterDiagnostics
AI Design, Python, Generative Artificial Intelligence (GenAI), AWS Lambda...

Experience

  • Artificial Intelligence (AI) - 5 years
  • Machine Learning - 5 years
  • Statistics - 4 years
  • Deep Learning - 3 years
  • Data Engineering - 3 years
  • Natural Language Processing (NLP) - 2 years
  • Speech Recognition - 2 years
  • Speech to Text - 1 year

Availability

Part-time

Preferred Environment

Windows, Vim Text Editor, Visual Studio Code (VS Code), MacOS, IntelliJ IDEA, PyCharm, Linux

The most amazing...

...architecture I designed, with a team of developers that I led, was for a software product that enabled auditing the flow of data science projects.

Work Experience

Software Developer

2017 - PRESENT
Cherrypick Games
  • Developed a deep learning model to analyze the sequence of in-game events to predict the chances of a given user being a potential spender in the game.
  • Prepared the architecture and implementation of the entire data infrastructure, including a data lake built in BigQuery from different sources through Dataflow, scheduled queries for data management, and a BI board for administration.
  • Designed multiple ad hoc queries to support management's strategic decisions regarding mobile game development.
Technologies: Data Engineering, Google Cloud Platform (GCP), Deep Learning, Data Science, Machine Learning, Python, Predictive Modeling, AI Design, Google BigQuery, Big Data, Data Analysis

Data Architect/Engineer

2023 - 2024
BJ's Wholesale Club - Marketing/Analytics
  • Designed and implemented the initial version of a refactored MLOps pipeline, enabling model deployment, retraining, and inference triggered by code changes or data events.
  • Managed a roadmap for development features in acquisition and personalization engines, overseeing bug fixes, feature ticket preparation, PR reviews, and ensuring compliance with functional and non-functional requirements.
  • Led a team of 4 developers, providing guidance, removing blockers, and reviewing their work to ensure high-quality implementation.
  • Optimized ETL processes by migrating to AWS Glue and EMR Serverless for cost efficiency and by fine-tuning Spark jobs based on their execution plans to improve performance.
Technologies: SQL, Data Engineering, Amazon Web Services (AWS), Amazon Elastic MapReduce (EMR), Spark, AI Design, Amazon DynamoDB, Python, Amazon SageMaker, Machine Learning Operations (MLOps), Data Analysis, Forecasting

Software Architect | Back-end and DevOps Engineer

2022 - 2024
SmarterDiagnostics
  • Built a platform with customized OHIF as the front end. Radiologists can run AI-assisted evaluations of different parameters and metrics of MRI scans of the Achilles tendon.
  • Built a generative AI tool that collects data from radiologists' reports and chats to gather the information needed to train models for evaluating the Achilles tendon recovery process.
  • Created a CI/CD pipeline in GitLab for the platform and model management on AWS. Models are retrained on a predefined schedule or when triggered by a new dataset. Training runs on on-prem GPUs, AWS, and GCP. Final models are stored on Amazon S3.
Technologies: AI Design, Python, Generative Artificial Intelligence (GenAI), AWS Lambda, Kubernetes, AWS Step Functions, Amazon DynamoDB, Java, Apache Airflow, GitLab CI/CD, Artificial Intelligence (AI), Generative Pre-trained Transformers (GPT), TensorFlow, PyTorch, Machine Learning Operations (MLOps), SQL

Software Architect | Back-end Engineer

2021 - 2023
Lumilook
  • Prepared a GenAI tool, which generated safety recommendations for the company based on the statistics of occurring events, their location, and data gathered from safety managers through AI-assisted conversation and past safety reports.
  • Designed and implemented the back end for processing streaming data, analyzing them in tumbling windows to generate real-time alerts for safety managers. Also prepared a device capable of collecting data from CCTV cameras and running AI inference.
  • Prepared an API service to visualize safety incidents across different places in the warehouse at different times.
Technologies: AWS IoT, AWS IoT Greengrass, Amazon Kinesis, Amazon Timestream, Amazon DynamoDB, Generative Artificial Intelligence (GenAI), Machine Learning Operations (MLOps), Python, Java, AWS Step Functions, Artificial Intelligence (AI), Generative Pre-trained Transformers (GPT), TensorFlow, PyTorch, Data Analysis
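
The tumbling-window alerting mentioned above can be illustrated with a minimal sketch. It is not the production implementation: it processes a finished list of events rather than a live stream, and the window length, the threshold, the zone names, and the function name are all assumptions made for illustration.

```python
from collections import Counter

WINDOW_SECONDS = 60   # assumed window length
ALERT_THRESHOLD = 3   # assumed number of events per window that triggers an alert


def tumbling_window_alerts(events, window=WINDOW_SECONDS, threshold=ALERT_THRESHOLD):
    """Count events per fixed, non-overlapping window and zone.

    `events` is an iterable of (timestamp_seconds, zone) pairs.
    Returns the sorted (window_start, zone) pairs whose count reaches the threshold.
    """
    counts = Counter()
    for timestamp, zone in events:
        # Each event belongs to exactly one window: the one starting at the
        # largest multiple of `window` not exceeding the timestamp.
        window_start = (timestamp // window) * window
        counts[(window_start, zone)] += 1
    return sorted(key for key, count in counts.items() if count >= threshold)


# Made-up sample events: three incidents at the dock within the first minute.
events = [(5, "dock"), (20, "dock"), (59, "dock"), (61, "dock"), (70, "aisle-3")]
alerts = tumbling_window_alerts(events)
```

Because windows do not overlap, the event at second 61 starts a new count instead of extending the first window.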

Security Software Engineer

2019 - 2020
ByteDance AI Lab
  • Developed an anti-fraud user behavior anomaly detection algorithm by which we could effectively block IPs used by bots.
  • Prepared a POC of an AI-powered WAF and IDS in the company's internal cloud environment.
  • Explored using eBPF to enable real-time network traffic analysis with machine learning models running in user space.
Technologies: Kubernetes, ClickHouse, Flink, Spark, Java, Scala, Go, Python, Predictive Modeling, AI Design, Amazon Athena, Big Data, TensorFlow, PyTorch, Machine Learning Operations (MLOps), SQL, Monte Carlo Simulations, Forecasting

Senior Software Developer

2017 - 2019
Deepsense a.i.
  • Implemented multiple features in Scala/Java in a Kubernetes environment for the Neptune project, a machine learning experiment management tool.
  • Implemented and maintained multiple microservices (Go, Python, Java) for a product: a one-click deployment script that prepares a cloud-agnostic environment (working on GCP, AWS, and Azure) for data scientists.
  • Assisted with an AI pipeline for generating a list of ingredients from the images of FMCG products.
Technologies: PyTorch, Keras, Google Cloud Platform (GCP), Spark, Scala, Java, Kubernetes, Python, Predictive Modeling, Google BigQuery, Big Data, Artificial Intelligence (AI), TensorFlow, SQL, Data Analysis

Member of the Research Team

2018 - 2018
Interdisciplinary Centre for Mathematical and Computational Modelling
  • Prepared a multimodal deep learning model for estimating the healing progress of the Achilles tendon based on the sequence of US and MRI scans.
  • Prepared two microservices (Java) for the data management of model training and experiment tracking.
  • Co-authored a research paper presented at the MICCAI conference (the premier international conference in information processing, machine learning, and computational modeling in medical image computing and computer-assisted interventions).
Technologies: Data Science, Machine Learning, Deep Learning, Java, Python, Artificial Intelligence (AI), Forecasting

Experience

Context Cartographer

I prepared the architecture and led a team of developers on this project, which enabled auditing the flow of data science projects. Every operation related to a data science project (from data ingestion to model inference run by external users) had to be intercepted and properly handled, so that the system administrator could easily track what happened during a given project. The implementation assumed that a graph database and a full-text search database would be enough for explainability and trackability.
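
As a rough illustration of the interception idea, the sketch below records each operation as a node linked to its inputs and keeps a naive keyword index for search. It is an in-memory toy under stated assumptions, not the project's implementation: the class and method names (AuditTrail, record, lineage, search) and the sample operations are hypothetical, and real graph and full-text search databases would replace the dictionaries.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class AuditTrail:
    """Toy stand-in for a graph store plus a full-text index (hypothetical)."""
    nodes: dict = field(default_factory=dict)                          # op_id -> description
    parents: dict = field(default_factory=lambda: defaultdict(list))   # op_id -> upstream op_ids
    index: dict = field(default_factory=lambda: defaultdict(set))      # word -> op_ids

    def record(self, op_id, description, inputs=()):
        """Intercept one operation: store it, link it to its inputs, index its text."""
        self.nodes[op_id] = description
        self.parents[op_id].extend(inputs)
        for word in description.lower().split():
            self.index[word].add(op_id)

    def lineage(self, op_id):
        """Return every upstream operation reachable from op_id (depth-first)."""
        seen, stack = [], list(self.parents[op_id])
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.append(current)
                stack.extend(self.parents[current])
        return seen

    def search(self, word):
        """Return the sorted ids of operations whose description contains the word."""
        return sorted(self.index.get(word.lower(), set()))


# Made-up example flow: ingestion, training, then inference by an external user.
trail = AuditTrail()
trail.record("ingest", "Ingest raw events")
trail.record("train", "Train model on events", inputs=["ingest"])
trail.record("infer", "Run model inference for external user", inputs=["train"])
```

With this structure, an administrator could ask which steps led to a given inference via `trail.lineage("infer")` or find operations by keyword via `trail.search("model")`.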

Data Architect/Engineer

I designed the architecture, developed machine learning and deep learning models, implemented features, and optimized and maintained two eCommerce projects (a personalization engine and an acquisition engine). The entire architecture is hosted on AWS, utilizing services such as EMR, EC2, SageMaker, Step Functions, S3, and QuickSight. Additionally, I ran ETL pipelines using Python and Spark and wrote the whole application in Python.

Chatbot for Serving Loans for Construction Developers

In this project, I used the OpenAI stack to create an automated voicemail system that gathers the necessary loan information. The system transcribes and adjusts the voice, capturing all required details. When information is missing, the bot prompts the caller with specific questions to fill in the gaps. This streamlined process allows the internal team to analyze the data and prepare a tailored loan offer.
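
The gap-filling step can be sketched as follows. This is only an illustration of the prompting logic: it assumes transcription and detail extraction have already produced a dictionary, it makes no OpenAI calls, and the field names, questions, and sample values are hypothetical.

```python
# Hypothetical required fields and the question the bot would ask for each.
REQUIRED_FIELDS = {
    "loan_amount": "What loan amount are you looking for?",
    "project_location": "Where is the construction project located?",
    "completion_date": "When do you expect the project to be completed?",
}


def missing_fields(extracted):
    """Return the required field names that are absent or empty, in a stable order."""
    return [name for name in REQUIRED_FIELDS if not extracted.get(name)]


def follow_up_questions(extracted):
    """Return the questions the bot would ask to fill the gaps."""
    return [REQUIRED_FIELDS[name] for name in missing_fields(extracted)]


# Made-up extraction result: one field captured, one empty, one never mentioned.
extracted = {"loan_amount": "2,000,000 PLN", "project_location": ""}
questions = follow_up_questions(extracted)
```

Once `follow_up_questions` returns an empty list, the collected details are complete and can be handed to the internal team.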

Education

2018 - 2020

Ph.D. in Economics

Warsaw School of Economics - Warsaw, Poland

2015 - 2018

Master's Degree in Data Science

Warsaw School of Economics - Warsaw, Poland

2014 - 2017

Bachelor's Degree in Computer Science

University of Warsaw - Warsaw, Poland

2011 - 2014

Bachelor's Degree in Quantitative Methods in Economics

Warsaw School of Economics - Warsaw, Poland

Certifications

JUNE 2023 - DECEMBER 2025

Google Cloud Certified Professional Data Engineering

Google

Skills

Libraries/APIs

TensorFlow, Keras, PyTorch, Pandas, Scikit-learn, NumPy

Tools

Amazon Athena, PyCharm, IntelliJ IDEA, Vim Text Editor, Flink, ChatGPT, AWS Step Functions, Apache Airflow, GitLab CI/CD, Amazon Elastic MapReduce (EMR), Amazon SageMaker, Google Cloud Dataproc, Cloud Dataflow, Google AI Platform

Languages

Python, Java, Go, Scala, SQL, R

Frameworks

Spark

Platforms

Google Cloud Platform (GCP), Kubernetes, Linux, Visual Studio Code (VS Code), Amazon Web Services (AWS), AWS Lambda, AWS IoT, AWS IoT Greengrass

Paradigms

Concurrent Programming

Storage

ClickHouse, Amazon DynamoDB, Google Cloud SQL

Other

Artificial Intelligence (AI), Machine Learning Operations (MLOps), Generative Pre-trained Transformers (GPT), Forecasting, Data Engineering, Machine Learning, Data Science, Statistics, Fine-tuning, Natural Language Processing (NLP), Speech Recognition, Convolutional Neural Networks (CNNs), Neural Networks, Predictive Modeling, AI Design, Google BigQuery, Big Data, Data Analysis, Finance, Monte Carlo Simulations, Financial Modeling, Bayesian Inference & Modeling, Bayesian Statistics, Deep Learning, Reinforcement Learning, Deep Reinforcement Learning, Stable Diffusion, LoRa, Speech to Text, Text to Speech (TTS), OpenAI, Generative Artificial Intelligence (GenAI), Amazon Kinesis, Amazon Timestream, Google Cloud ML
