Przemysław Przybyszewski, Developer in Warsaw, Poland

Przemysław Przybyszewski

Bio

Przemysław has a Ph.D. in economics and a Master's degree in data science. He enjoys developing AI projects and has an exceptional understanding of how to use data to generate profitable solutions to some of the industry's toughest problems. He co-authored a research paper presented at the MICCAI conference (the premier international conference in information processing, machine learning, and computational modeling) and developed an anti-fraud user behavior anomaly detection algorithm.

Portfolio

Self employment
AI Design, Deep Reinforcement Learning, Deep Learning...
Cherrypick Games
Data Engineering, Google Cloud Platform (GCP), Deep Learning, Data Science...
Stampli
Amazon OpenSearch, AWS Lambda, Amazon DynamoDB, Artificial Intelligence (AI)...

Experience

  • Artificial Intelligence (AI) - 5 years
  • Machine Learning - 5 years
  • Statistics - 4 years
  • Deep Learning - 3 years
  • Data Engineering - 3 years
  • Natural Language Processing (NLP) - 2 years
  • Speech Recognition - 2 years
  • Speech-to-Text (STT) - 1 year

Preferred Environment

Windows, Vim Text Editor, Visual Studio Code (VS Code), macOS, IntelliJ IDEA, PyCharm, Linux

The most amazing...

...architecture I designed, together with a team of developers that I led, was for a software product that enabled auditing the flow of data science projects.

Work Experience

AI/Data Architect and Engineer | Data Scientist

2023 - PRESENT
Self employment
  • Developed an LLMOps pipeline and prepared multiple agentic workflows (using both models trained in-house and models provided by third parties), speeding up the software development cycle (AI-driven code reviews, initial PRs for tickets, code template generation).
  • Developed AI agentic workflows to help the GTM team with daily routine tasks (RFC evaluation and draft preparation, internal-resource agentic chatbot, opportunity price evaluator).
  • Provided an LLM agentic pipeline for extracting invoice data from an invoice page in any format. Achieved around 92% accuracy on all invoice fields (monthly volume of 1.5 million invoices).
Technologies: AI Design, Deep Reinforcement Learning, Deep Learning, Machine Learning Operations (MLOps), Machine Learning, Data Science, Data Engineering, Software Development, Python, Go, Amazon Web Services (AWS), Google Cloud Platform (GCP)

Software Developer

2017 - PRESENT
Cherrypick Games
  • Developed a deep learning model to analyze the sequence of in-game events to predict the chances of a given user being a potential spender in the game.
  • Prepared the architecture and implementation of the entire data infrastructure, including a data lake fed from different sources through Dataflow into BigQuery, scheduled queries for data management, and a BI board for administration.
  • Designed multiple ad hoc queries to support management's strategic decisions regarding mobile game development.
Technologies: Data Engineering, Google Cloud Platform (GCP), Deep Learning, Data Science, Machine Learning, Python, Predictive Modeling, AI Design, Google BigQuery, Big Data, Data Analysis

AI Engineer

2024 - 2025
Stampli
  • Worked on a real-time pipeline enabling the extraction of information from an invoice (OCRing the incoming files and running agentic AI workflows/LLMs/RAGs containing historical data) to ensure the information extracted from the invoice is correct.
  • Built the initial system design for AI-assisted proposal generation for procurement requests (automatic approver suggestion, product proposal).
  • Contributed to the deployment and enhancement of a Python FastAPI service that can handle the load of real-time processing (various dimensions) of tens of thousands of invoices per day.
Technologies: Amazon OpenSearch, AWS Lambda, Amazon DynamoDB, Artificial Intelligence (AI), Vector Search, Python

Data Architect/Engineer

2023 - 2024
BJ's Wholesale Club - Marketing/Analytics
  • Designed and implemented the initial version of a refactored MLOps pipeline, enabling model deployment, retraining, and inference triggered by code changes or data events.
  • Managed a roadmap for development features in acquisition and personalization engines, overseeing bug fixes, feature ticket preparation, PR reviews, and ensuring compliance with functional and non-functional requirements.
  • Led a team of 4 developers, providing guidance, removing blockers, and reviewing their work to ensure high-quality implementation.
  • Optimized ETL processes by migrating to AWS Glue and EMR serverless for cost efficiency and fine-tuning Spark jobs based on execution plans to enhance performance.
Technologies: SQL, Data Engineering, Amazon Web Services (AWS), Amazon Elastic MapReduce (EMR), Spark, AI Design, Amazon DynamoDB, Python, Amazon SageMaker, Machine Learning Operations (MLOps), Data Analysis, Forecasting

Software Architect | Back-end Engineer

2021 - 2023
Lumilook
  • Prepared a GenAI tool, which generated safety recommendations for the company based on the statistics of occurring events, their location, and data gathered from safety managers through AI-assisted conversation and past safety reports.
  • Designed and implemented the back end for processing streaming data, analyzing it in tumbling windows to generate real-time alerts for safety managers. Also prepared a device capable of collecting data from CCTV cameras and running AI inference.
  • Prepared an API service to visualize safety incidents across different places in the warehouse at different times.
Technologies: AWS IoT, AWS IoT Greengrass, Amazon Kinesis, Amazon Timestream, Amazon DynamoDB, Generative Artificial Intelligence (GenAI), Machine Learning Operations (MLOps), Python, Java, AWS Step Functions, Artificial Intelligence (AI), Generative Pre-trained Transformers (GPT), TensorFlow, PyTorch, Data Analysis
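
The tumbling-window alerting mentioned above can be illustrated with a minimal sketch. It is not the Lumilook implementation: the event shape (timestamp in seconds, zone name), the function name, and the threshold rule are all assumptions made for illustration, and a production system would run this logic in a streaming service rather than in memory.

```python
from collections import defaultdict


def tumbling_window_alerts(events, window_seconds, threshold):
    """Count (timestamp, zone) events in fixed, non-overlapping windows.

    Returns sorted (window_start, zone, count) triples whose count reaches
    the threshold. All names here are hypothetical.
    """
    counts = defaultdict(int)
    for timestamp, zone in events:
        # Each event falls into exactly one window, keyed by its start time.
        window_start = int(timestamp // window_seconds) * window_seconds
        counts[(window_start, zone)] += 1
    return sorted(
        (start, zone, n) for (start, zone), n in counts.items() if n >= threshold
    )


# Hypothetical sample data: three incidents at the dock within the first minute.
events = [(3, "dock"), (12, "dock"), (47, "dock"), (65, "aisle-4"), (70, "dock")]
alerts = tumbling_window_alerts(events, window_seconds=60, threshold=3)
print(alerts)
```

Because the windows do not overlap, the dock event at second 70 is counted only in the second window and does not extend the first alert.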

Security Software Engineer

2019 - 2020
ByteDance AI Lab
  • Developed an anti-fraud user behavior anomaly detection algorithm by which we could effectively block IPs used by bots.
  • Prepared a POC of an AI-powered WAF and IDS in the company's internal cloud environment.
  • Explored the usage of eBPFs in enabling real-time network traffic analysis with the use of machine learning models from the user space.
Technologies: Kubernetes, ClickHouse, Flink, Spark, Java, Scala, Go, Python, Predictive Modeling, AI Design, Amazon Athena, Big Data, TensorFlow, PyTorch, Machine Learning Operations (MLOps), SQL, Monte Carlo Simulations, Forecasting

Senior Software Developer

2017 - 2019
deepsense.ai
  • Implemented multiple features in Scala/Java in a Kubernetes environment for a Neptune project, a machine learning experiment management tool.
  • Implemented and maintained multiple microservices (Go, Python, Java) for a product: a one-click deployment script for preparing a cloud-agnostic environment (worked on GCP, AWS, and Azure) for data scientists.
  • Assisted with an AI pipeline for generating a list of ingredients from images of FMCG products (extracting ROIs from the images with Fast R-CNN and Faster R-CNN, running OCR, and then applying fastText to the returned content to get the ingredients list).
Technologies: PyTorch, Keras, Google Cloud Platform (GCP), Spark, Scala, Java, Kubernetes, Python, Predictive Modeling, Google BigQuery, Big Data, Artificial Intelligence (AI), TensorFlow, SQL, Data Analysis

Member of the Research Team

2018 - 2018
Interdisciplinary Centre for Mathematical and Computational Modelling
  • Prepared a multimodal deep learning model for estimating the healing progress of the Achilles tendon based on a sequence of ultrasound (US) and MRI scans.
  • Prepared two microservices (Java) for the data management of model training and experiment tracking.
  • Co-authored a research paper presented at the MICCAI conference (the premier international conference in information processing, machine learning, and computational modeling in medical image computing and computer-assisted interventions).
Technologies: Data Science, Machine Learning, Deep Learning, Java, Python, Artificial Intelligence (AI), Forecasting

Experience

Context Cartographer

I prepared the architecture and led a team of developers on this project, which enabled auditing the flow of data science projects. Every operation related to a data science project (from data ingestion to model inference run by external users) had to be intercepted and properly handled, so that the system administrator could easily track what was happening during a given project. The implementation assumed that a graph database and a full-text search database would be enough for explainability and trackability.

Data Architect/Engineer

I designed the architecture, developed machine learning and deep learning models, implemented features, and optimized and maintained two projects (personalization engine and acquisition engine) in eCommerce. The entire architecture is hosted on AWS, utilizing services such as EMR, EC2, SageMaker, Step Functions, S3, and Quicksight. Additionally, I ran ETL pipelines using Python and Spark and wrote the whole application in Python.

Chatbot for Serving Loans for Construction Developers

In this project, I used the OpenAI stack to create an automated voicemail system that gathers the necessary loan information. The system transcribes and adjusts the voice, capturing all required details. In cases where information is missing, the bot prompts the caller with specific questions to fill in the gaps. This streamlined process allows the internal team to analyze the data and prepare a tailored loan offer.
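
The gap-filling step described above (asking the caller only for what is still missing) can be sketched as follows. This is an illustrative sketch, not the project's code: the field names, the question wording, and the `follow_up_questions` helper are hypothetical, and the transcription and extraction steps that would produce the `extracted` dictionary are left out.

```python
# Hypothetical required fields and the question asked when each one is missing.
REQUIRED_FIELDS = {
    "loan_amount": "What loan amount are you requesting?",
    "project_location": "Where is the construction project located?",
    "completion_date": "When do you expect the project to be completed?",
}


def follow_up_questions(extracted):
    """Return one question per required field that is absent or empty."""
    return [
        question
        for field, question in REQUIRED_FIELDS.items()
        if not extracted.get(field)
    ]


# Assumed output of an upstream transcription/extraction step.
extracted = {"loan_amount": "2,000,000 PLN", "project_location": ""}
questions = follow_up_questions(extracted)
for question in questions:
    print(question)
```

Keeping the required fields in a single mapping means a new loan detail can be collected by adding one entry, without touching the prompting logic.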

Education

2018 - 2020

Ph.D. in Economics

Warsaw School of Economics - Warsaw, Poland

2015 - 2018

Master's Degree in Data Science

Warsaw School of Economics - Warsaw, Poland

2014 - 2017

Bachelor's Degree in Computer Science

University of Warsaw - Warsaw, Poland

2011 - 2014

Bachelor's Degree in Quantitative Methods in Economics

Warsaw School of Economics - Warsaw, Poland

Certifications

JUNE 2023 - DECEMBER 2025

Google Cloud Certified Professional Data Engineer

Google

Skills

Libraries/APIs

TensorFlow, Keras, PyTorch, Pandas, Scikit-learn, NumPy

Tools

Amazon Athena, PyCharm, IntelliJ IDEA, Vim Text Editor, Flink, ChatGPT, AWS Step Functions, Apache Airflow, GitLab CI/CD, Amazon Elastic MapReduce (EMR), Amazon SageMaker, Google Cloud Dataproc, Cloud Dataflow, Google AI Platform, Amazon OpenSearch

Languages

Python, Java, Go, Scala, SQL, R

Paradigms

Automation, DevOps, Concurrent Programming

Frameworks

Spark

Platforms

Google Cloud Platform (GCP), Kubernetes, Linux, Visual Studio Code (VS Code), Amazon Web Services (AWS), AWS Lambda, AWS IoT, AWS IoT Greengrass

Storage

ClickHouse, Amazon DynamoDB, Google Cloud SQL

Other

Artificial Intelligence (AI), Machine Learning Operations (MLOps), Generative Pre-trained Transformers (GPT), Forecasting, API Integration, Data Analytics, Data Engineering, Machine Learning, Data Science, Statistics, Fine-tuning, Natural Language Processing (NLP), Speech Recognition, Convolutional Neural Networks (CNNs), Neural Networks, Predictive Modeling, AI Design, Google BigQuery, Big Data, Data Analysis, Finance, Monte Carlo Simulations, Financial Modeling, RAG Systems, Large Language Models (LLMs), Large Language Model Operations (LLMOps), Bayesian Inference & Modeling, Bayesian Statistics, Deep Learning, Reinforcement Learning, Deep Reinforcement Learning, Stable Diffusion, LoRA, Speech-to-Text (STT), Text-to-Speech (TTS), OpenAI, Generative Artificial Intelligence (GenAI), Amazon Kinesis, Amazon Timestream, Google Cloud ML, Software Development, Vector Search
