Karol Kulasiński, Developer in Warsaw, Poland

Karol Kulasiński

Verified Expert in Engineering

Bio

Karol is a highly experienced senior data scientist with a strong focus on NLP and broad AI applications. He has a unique academic background in physics and large-scale models, in addition to relevant experience in the customer-facing data science industry. He enjoys working with data and leading and implementing R&D projects. With his PhD in physics and a recent MBA, Karol easily combines technology and business.

Portfolio

Fractile
Azure, Docker, Kubernetes, Generative Artificial Intelligence (GenAI)...
Warsaw Stock Exchange
Python, Databases, APIs, Django, Artificial Intelligence (AI), Machine Learning...
Sweetgreen Inc - Main
Tableau, Python, Snowflake, Data Analytics, Data Visualization, Data Science...

Experience

Availability

Full-time

Preferred Environment

Python, Docker, SQL, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Machine Learning, Data Science, Artificial Intelligence (AI), Azure, Generative Artificial Intelligence (GenAI)

The most amazing...

...project I led was the creation of an autonomous AI-agents-as-a-service platform that allowed the client to use their data securely with LLMs.

Work Experience

Co-founder

2023 - PRESENT
Fractile
  • Designed and co-implemented a platform to create, customize, and apply AI agents. The agents provide an additional layer of security, thanks to the anonymization service that protects 100% of the data coming out of the client's server.
  • Applied AI agents that can learn from previous tasks and be spawned via chat, API, or Jira.
  • Deployed the solution to two medium-sized companies.
Technologies: Azure, Docker, Kubernetes, Generative Artificial Intelligence (GenAI), Data Science

Machine Learning Expert and Engineering Manager

2021 - 2023
Warsaw Stock Exchange
  • Led the development process of a scalable exchange platform for personalized ads on Polish TV.
  • Designed logo placement detection AI in live-streamed TV broadcasts.
  • Led a development team of nine that implemented a supply-side platform end to end, from conceptualization and MVP to a scalable production stage.
  • Created an end-to-end pipeline to train the behavioral models based on the data from TV.
Technologies: Python, Databases, APIs, Django, Artificial Intelligence (AI), Machine Learning, REST, ChatGPT, Data Analysis, FastAPI, XGBoost, Transformer Models, Machine Learning Operations (MLOps), Time Series Analysis, Time Series, Docker

Lead Data Scientist

2021 - 2023
Sweetgreen Inc - Main
  • Designed and led the implementation of the salad recommendation engine at the production level.
  • Created a BI tool with live-updated sales data and ML forecasting for the CxOs.
  • Built a PoC sales forecasting model based on historical sales and weather forecast data.
  • Improved the legacy ML production models for supply chain forecasting, reducing the processing time by an order of magnitude.
Technologies: Tableau, Python, Snowflake, Data Analytics, Data Visualization, Data Science, Microsoft Power BI, AWS CLI, Docker, Datadog, Artificial Intelligence (AI), Pandas, Recommendation Systems, REST, Data Analysis, Machine Learning Operations (MLOps), Amazon SageMaker

Senior Data Scientist

2020 - 2021
Meloncast
  • Created a complete training and deployment pipeline for NLP models (BERT) to classify marketing texts by target audience.
  • Trained ML models to identify, among the pictures the client provided, those most similar in content and color.
  • Designed and deployed a production-level API for containerized Docker services.
Technologies: Docker, APIs, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Image Processing, Machine Learning, Data Science, NoSQL, SQL, BERT, Artificial Intelligence (AI), Pandas, REST, Data Analysis

Lead Data Scientist

2019 - 2021
Physica Solutions
  • Built an NLP ecosystem for using ChatGPT on the company's private data.
  • Created subMIND, a tool for extracting subconscious information from a large body of text that uses state-of-the-art techniques for entity recognition, graph relations, and visualizations.
  • Built Microsoft Power BI reports for a private Polish university, working directly with the business.
  • Designed an architecture for classifying fake news in social media for the most prominent Polish university, including NLP (BERT) classification, data collection, and overall flow.
Technologies: APIs, Flask, Microsoft Power BI, Amazon Web Services (AWS), Jupyter Notebook, Data Science, Machine Learning, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), Docker, Statistics, Artificial Intelligence (AI), Pandas, REST, ChatGPT, Data Analysis

Lead Data Scientist

2019 - 2021
Yieldbird
  • Optimized pricing models for online ad auctions using ML tools.
  • Created an entire ML pipeline, including data ingestion, testing, prototyping, error handling, monitoring, and evaluation.
  • Directed the process of product development from the R&D side, including hypothesis testing and handling client feedback.
Technologies: Python, APIs, Docker, Google Cloud Platform (GCP), Amazon Web Services (AWS), PostgreSQL, Jupyter Notebook, Hypothesis Testing, Ads, Pipelines, Product Roadmaps, Artificial Intelligence (AI), Pandas, REST, Data Analysis

Data Scientist

2018 - 2019
DS Stream
  • Created Tableau reports identifying fraudulent behavior of employees.
  • Built a fully automated quality assurance system for data ingestion.
  • Designed a Twitter fake news detector front end for data visualization.
Technologies: Amazon Web Services (AWS), Data Science, Big Data, Spark, SQL, Tableau, Python, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), Google Cloud Platform (GCP), JavaScript, HTML, Flask, Artificial Intelligence (AI), Pandas, REST, Data Analysis

Postdoctoral Researcher

2016 - 2017
Lawrence Berkeley National Lab
  • Carried out state-of-the-art research using molecular dynamics and Monte Carlo simulations on nanoscopic materials.
  • Published three technical papers in a highly respected scientific journal.
  • Created, simulated, and interpreted numerical simulations with over 10^7 degrees of freedom.
Technologies: Python, Linux, Simulations, Publication, Conference Speaking, Statistics, Pandas, Data Analysis

Doctoral Researcher

2012 - 2015
ETH Zurich
  • Carried out numerical simulations that resulted in models further used by other team members.
  • Published nine technical papers in top-ranked journals as the first author.
  • Contributed to the physical chemistry field by explaining the water adsorption-related phenomena in cellulose.
Technologies: Python, Data Science, Mathematical Analysis, MATLAB, Linux, Data Analysis

Intern

2011 - 2011
Texas A&M University
  • Created a numerical model of the secondary loop of the BWR nuclear reactor under the direction of Professor J. Ragusa.
  • Applied the Monte Carlo method for sensitivity analysis of numerical coefficients in different equations of state.
  • Expanded the lab's Python library for carrying out finite element method simulations.
Technologies: Python, MATLAB, Numerical Methods, Data Science, Mathematical Analysis, Statistics, Linux, Data Analysis

Autonomous AI Agents

https://fractile.io
I designed and co-implemented a platform for creating, customizing, and applying AI agents. The agents provide an additional layer of security, thanks to the anonymization service that protects 100% of the data coming out of the client's server. The agents can learn from previous tasks and can be spawned via chat, API, or Jira.
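
As a rough illustration of the anonymization idea only (not the platform's actual implementation), the sketch below masks sensitive values before a text leaves the server and restores them in the response. Everything in it is an assumption made for the example: the function names `anonymize` and `deanonymize`, the placeholder format, and the choice to treat only e-mail addresses as sensitive.

```python
import re

# Assumption for this sketch: only e-mail addresses count as sensitive data.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def anonymize(text):
    """Replace each distinct e-mail address with a stable placeholder.

    Returns the masked text and a placeholder -> original mapping.
    The mapping is meant to stay on the client's server.
    """
    mapping = {}  # placeholder -> original value
    seen = {}     # original value -> placeholder

    def _mask(match):
        value = match.group(0)
        if value not in seen:
            token = f"<EMAIL_{len(seen) + 1}>"
            seen[value] = token
            mapping[token] = value
        return seen[value]

    return EMAIL_RE.sub(_mask, text), mapping


def deanonymize(text, mapping):
    """Put the original values back into a text that contains placeholders."""
    for token, value in mapping.items():
        text = text.replace(token, value)
    return text
```

In this sketch, only the masked text would be passed to an external LLM, and `deanonymize` would be applied to the model's answer once it is back on the client's side.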

subMIND

I created subMIND, a tool for extracting subconscious information from a large body of text. From relations between characters in a book to suspicious activities in your report files, subMIND uses state-of-the-art techniques for entity recognition, graph relations, and visualizations. This tool is intended for researchers of all kinds.

Hot Topics Classifier

An automated system for scraping, analyzing, and detecting hot topics from online articles.

I predicted the topics using the LDA method and ran the collected texts through a BERT model that determined which target audience each text pertained to. Texts were scraped from LinkedIn and online newspapers.

PriceGenius | Ad Price Optimization

https://yieldbird.com/price-genius
I led a data science team of eight people that created the complete pipeline for a price optimization product for online ads.

The ads are sold in first-price auctions, and we used ML techniques to predict the optimum price, the one that would allow the end customer to maximize their revenue. Thanks to our ML models, the revenue boost was up to 10%.

I created an entire ML pipeline, including data ingestion, testing, prototyping, error handling, monitoring, and evaluation. I also directed the product development process from the R&D side, including hypothesis testing and handling client feedback.

Education

2023 - 2023

Professional Degree in Business Administration (MBA)

Kozminski University - Warsaw, Poland

2016 - 2017

Postdoc in Computational Physics

UC Berkeley - Berkeley, CA

2012 - 2015

PhD in Physics

ETH Zurich - Zurich, Switzerland

2009 - 2011

Master's Degree in Nuclear Engineering

University Paris 11 - Paris, France

2006 - 2010

Bachelor's Degree in Physics

Warsaw University of Technology - Warsaw, Poland

MAY 2019 - PRESENT

Microsoft Data Science Certificate

Microsoft

Libraries/APIs

Pandas, XGBoost

Tools

Jupyter, Microsoft Power BI, ChatGPT, Tableau, MATLAB, AWS CLI, Azure OpenAI Service, Amazon SageMaker

Languages

Python, SQL, HTML, JavaScript, Snowflake, C++

Paradigms

REST, Management, Parallel Computing

Platforms

Jupyter Notebook, Linux, Amazon Web Services (AWS), Docker, Google Cloud Platform (GCP), Azure, Azure Functions, Kubernetes

Frameworks

Flask, Spark, Django

Storage

NoSQL, PostgreSQL, Datadog, Databases

Other

Simulations, Mathematical Analysis, Applied Physics, Natural Language Processing (NLP), Machine Learning, Data Science, Generative Pre-trained Transformers (GPT), Artificial Intelligence (AI), Data Engineering, Data Analysis, Applied Mathematics, Hypothesis Testing, Statistics, Publication, Conference Speaking, Recommendation Systems, APIs, Ads, Pipelines, Product Roadmaps, Image Processing, BERT, Big Data, Web Scraping, Numerical Methods, Data Analytics, Data Visualization, Finance, Strategic Planning, Human Resources (HR), Accounts, Negotiation, Generative Artificial Intelligence (GenAI), Visualization, Large Language Models (LLMs), LangChain, FastAPI, Transformer Models, Machine Learning Operations (MLOps), Time Series Analysis, Time Series
