Pawel Kaplanski, Developer in Sydney, New South Wales, Australia

Pawel Kaplanski

Verified Expert in Engineering

Bio

Pawel is an experienced data scientist and machine learning professional. He has worked for Fortune 100 companies and has an academic background in the field. Before moving to data science, he was a lead architect at the Samsung R&D Center. Pawel holds a Ph.D. in knowledge representation and reasoning, as well as a master's degree and a bachelor's degree in computer science.

Availability

Part-time

Preferred Environment

Python

The most amazing...

...thing I've coded is a Clinical Decision Support System implementing the ESMO guidelines for cancer treatment.

Work Experience

Senior Machine Learning Engineer

2019 - PRESENT
Undisclosed
  • Brought recommender systems, image processing, NLP, and deep learning models to production.
Technologies: PyTorch, Python, TensorFlow, Minimum Viable Product (MVP), Artificial Intelligence (AI), Large Language Models (LLMs), Generative Artificial Intelligence (GenAI)

Data Scientist

2011 - PRESENT
Cognitum
  • Created machine learning models using scikit-learn and TensorFlow for a Fortune 100 customer in the area of trade promotion optimization.
  • Created a cognitive programming language that makes AI programming easier by allowing reasoning to be mixed with machine learning; it is used in a fraud detection system for a public institution.
  • Designed and implemented a controlled natural language for formalizing knowledge about lung cancer, used by oncologists to formalize the ESMO guidelines.
  • Created affective-computing AI models that combine expert knowledge and intuition to calculate the quality score of complex decisions.
  • Created a novel automated user-interface synthesis algorithm that translates a set of requirements into a working application; it is currently used by 30+ clinical centers and the biggest telecom in Australia.
  • Created an NLP classification algorithm for corpora of legal documents based on the NLTK library. It combines several feature-extraction techniques (POS tagging, noun-phrase extraction, collocations, and named entity recognition, NER), followed by TF-IDF weighting, feature reduction, and finally classification with a scalable Passive-Aggressive classifier.
  • Created a critical part of a tax-fraud detection system based on natural language rules that enable decision makers and specialists to manage a tax-fraud knowledge base. The stream-based reasoner detects fraudulent activity in a stream of 5 million invoices per day.
Technologies: Apache Jena, Simple Knowledge Organization System (SKOS), SPARQL, Semantic Web Rule Language (SWRL), RDF, OWL, BPMN, TensorFlow, NumPy, Scikit-learn, Natural Language Toolkit (NLTK), R, Python, Minimum Viable Product (MVP), Artificial Intelligence (AI), Generative Artificial Intelligence (GenAI)

Assistant Professor

2013 - 2017
Gdansk University Of Technology, Department of Applied Informatics in Management
  • Served as a reviewer for "Government Information Quarterly, An International Journal of Information Technology Management, Policies, and Practices" (IF = 2.515, 5-year IF = 3.161).
  • Acted as an academic visitor at the University of Newcastle, Australia.
  • Participated as a member of the EU Marie Curie research project "Smart multipurpose knowledge administration environment for intelligent decision support systems development."
  • Reviewed for and contributed to the "18th International Conference on Knowledge-Based and Intelligent Information & Engineering Systems."
  • Served as a member of the international BRIDGE project: "CDSS for Oncology."
  • Taught the following classes: R Programming, Introduction to Data Science, Business Intelligence and Big Data Processing, and Software Development Process Methodology and Tools.
Technologies: Python, R, Artificial Intelligence (AI), Generative Artificial Intelligence (GenAI)

Lead Architect

2006 - 2011
Samsung
  • Led design and implementation of an industrial software stack for digital television receivers.
  • Led design and implementation of a set-top-box device emulator for efficient application level testing purposes.
  • Designed and implemented an automated smoke-test system with ASP.NET, MSMQ, image recognition, and remote controller emulation.
  • Technically managed a team of 30+ programmers.
  • Conducted training for newcomers about advanced multithreaded design patterns in C++.
Technologies: Embedded Systems, C++, Minimum Viable Product (MVP), Team Leadership

CDSS - Clinical Decision Support System

Clinical registers are needed to perform research studies and thus to increase medical knowledge that finds its way into new and improved guidelines. Adherence to clinical practice guidelines is mandatory to increase the effectiveness of treatments and to eliminate the negative consequences of medical decisions.

We organized the available data into knowledge about the diagnostic process, drawing on many sources such as studies, publications, and recommendations, so that it supports doctors' decisions. We also developed a central registry for collecting patients' clinical data from over 70 oncological institutions in Poland. The system has been in production since 2016.

The results were published in Expert Systems With Applications, which is currently ranked number 1 by h-index among the top artificial intelligence publications listed by Google Scholar.

Trade Promotion Optimization

In most big FMCG enterprises, sales analysts are responsible for providing the promotion plan for the new quarter. Currently, these plans are created manually, mostly with conventional tools like Excel, in an attempt to answer typical TPO (trade promotion optimization) questions like:
- Can we lower overall costs by optimizing product volume sales and the promotion strategy, anticipating a promotion calendar for a given period?
- Can we predict, using key indicators, when and which sales pattern is the most effective and can be used to increase volume sales?
- Can we set up a useful promotion calendar for “slow-moving products”?
- Can we optimize budget KPIs when planning the next sales period?

In our case, mis-forecasting (the average error was around 20%) led to budget reductions across multiple stages of the whole supply chain. To solve the problem, we combined the business knowledge of subject matter experts with the historical sales data that we received, taking into account the anomalies and outliers in that data.

The solution allowed the company to increase the prediction accuracy of its volume planning by up to 10%.
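
The text does not say which metric the "average error" refers to. As a minimal sketch, assuming it is a mean absolute percentage error (MAPE) over sales volumes, such a figure could be computed as follows; the function name and the sample numbers are hypothetical and not taken from the project.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Assumption: periods with zero actual volume are skipped, because the
    percentage error is undefined for them.
    """
    pairs = [(a, f) for a, f in zip(actual, forecast) if a != 0]
    if not pairs:
        raise ValueError("no periods with non-zero actual volume")
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)


# Hypothetical weekly volumes: every forecast is off by a fifth of the actual.
actual_volume = [100, 200, 400]
forecast_volume = [80, 240, 320]
error_percent = mape(actual_volume, forecast_volume)  # roughly 20
```

Comparing this value before and after a model change is one simple way to express an accuracy improvement in volume planning.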

Tax-fraud Detection on VAT

The tax-fraud detection system was based on natural language rules enabling decision makers and specialists to manage a tax fraud knowledge base. Reasoning with AI agents is used to recognize elaborate fraud and non-compliance patterns. A stream-based reasoner allows discovering fraudulent activities in the stream of 5 million invoices per day.
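
The idea of checking rules against a stream can be illustrated with a small sketch. This is not the project's reasoner: the invoice fields, the two rules, and their names are hypothetical, and the real system expresses rules in controlled natural language. The sketch only shows invoices being evaluated lazily, one at a time, so the whole stream never has to be held in memory.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List, Tuple


@dataclass(frozen=True)
class Invoice:
    # Hypothetical fields; amounts are integer cents to avoid float rounding.
    number: str
    seller: str
    buyer: str
    net_cents: int
    vat_cents: int
    vat_rate_percent: int


Rule = Tuple[str, Callable[[Invoice], bool]]

# Hypothetical rules: each pairs a name with a predicate that is True when
# the invoice looks suspicious.
RULES: List[Rule] = [
    # VAT differs from net * rate by more than one cent.
    ("vat_mismatch",
     lambda inv: abs(inv.vat_cents * 100 - inv.net_cents * inv.vat_rate_percent) > 100),
    # Seller and buyer are the same entity.
    ("self_invoicing", lambda inv: inv.seller == inv.buyer),
]


def flag_suspicious(stream: Iterable[Invoice],
                    rules: List[Rule]) -> Iterator[Tuple[str, List[str]]]:
    """Yield (invoice number, names of triggered rules) for flagged invoices."""
    for inv in stream:
        hits = [name for name, is_suspicious in rules if is_suspicious(inv)]
        if hits:
            yield inv.number, hits


invoices = [
    Invoice("A1", "ACME", "Globex", 10000, 2300, 23),    # consistent
    Invoice("A2", "ACME", "Globex", 10000, 500, 23),     # VAT too low
    Invoice("A3", "Initech", "Initech", 20000, 4600, 23),  # seller == buyer
]
flagged = list(flag_suspicious(iter(invoices), RULES))
```

Because `flag_suspicious` is a generator, it can be fed from any iterator, such as a message queue consumer, without changing the rule logic.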

Automated Decision Making System

To sign a contract, especially one worth $10+ million, a CEO has to analyze the business situation and implement a good strategy. CEOs make highly contextual, time-sensitive decisions that have to factor in priorities such as risk aversion or profitability. To amplify the CEO's gut feelings, we developed an automated decision-making system. Its core is a set of affective-computing AI models adapted to combine expert knowledge and intuition in order to calculate the quality score of a deal or opportunity. Solving a complex business problem requires a manager to weigh multiple, conflicting objectives, so we wrapped the AI models in a user-friendly UI with a drag-and-drop editor for tuning the expert knowledge that the models consume. The results are then visualized on a custom dashboard for the CEO.

Abusive-clause Detector

Processing large corpora of legal documents to find potentially abusive clauses is very resource-intensive and usually requires hiring a team of lawyers. Modern NLP methods can reduce this effort. I developed a Python classifier based on the NLTK library that was capable of automating the client's daily work. The pipeline followed a classical approach: feature-extraction techniques such as n-grams, POS tagging, noun-phrase extraction, collocations, and named entity recognition (NER), followed by TF-IDF weighting, feature reduction, and finally classification with a scalable Passive-Aggressive classifier.
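
A heavily simplified sketch of two stages of such a pipeline, n-gram TF-IDF features feeding a Passive-Aggressive (PA-I) linear classifier, is shown below. It uses only the Python standard library so that the update rule is visible; it is not the project's code, it omits POS tagging, noun-phrase extraction, collocations, NER, and feature reduction, and the four training clauses are invented toy data.

```python
import math
from collections import Counter


def ngrams(text, n_max=2):
    """Lowercased word unigrams and bigrams."""
    words = text.lower().split()
    feats = list(words)
    for n in range(2, n_max + 1):
        feats += [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]
    return feats


def fit_idf(docs):
    """Smoothed inverse document frequency for every feature in the corpus."""
    df = Counter(f for doc in docs for f in set(ngrams(doc)))
    n = len(docs)
    return {f: math.log((1 + n) / (1 + c)) + 1.0 for f, c in df.items()}


def tfidf(text, idf):
    """L2-normalised sparse TF-IDF vector; unseen features are dropped."""
    tf = Counter(f for f in ngrams(text) if f in idf)
    vec = {f: c * idf[f] for f, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {f: v / norm for f, v in vec.items()}


def score(w, x):
    return sum(w.get(f, 0.0) * v for f, v in x.items())


def train_pa(samples, epochs=10, C=1.0):
    """PA-I: update only when the hinge loss is positive, step capped by C."""
    w = {}
    for _ in range(epochs):
        for x, y in samples:  # y is +1 (abusive) or -1 (fair)
            loss = max(0.0, 1.0 - y * score(w, x))
            sq_norm = sum(v * v for v in x.values())
            if loss > 0.0 and sq_norm > 0.0:
                tau = min(C, loss / sq_norm)
                for f, v in x.items():
                    w[f] = w.get(f, 0.0) + tau * y * v
    return w


def predict(w, x):
    return 1 if score(w, x) >= 0 else -1


# Invented toy clauses, already stripped of stop words.
train = [
    ("provider may terminate contract without notice", 1),
    ("provider may change price without notice", 1),
    ("customer can withdraw within fourteen days", -1),
    ("customer receives full refund within fourteen days", -1),
]
idf = fit_idf([text for text, _ in train])
weights = train_pa([(tfidf(text, idf), label) for text, label in train])

abusive = predict(weights, tfidf("provider may change contract without notice", idf))
fair = predict(weights, tfidf("customer can withdraw refund within days", idf))
```

The classifier is "passive" when an example is already classified with enough margin and "aggressive" otherwise, which keeps each update cheap and makes the method suitable for large corpora processed one document at a time.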

Cyber Assessment

Developed a tool that encodes the knowledge of cybersecurity experts for a cybersecurity solution provider. The provider offers a strong, policy-driven security architecture that can be enforced through defined security domains and controls, and implemented through set standards, guidelines, and procedures.

The tool lets customers perform a guided cybersecurity health check. Once the health check is completed, a detailed report (diagnosis) is generated that helps the customer understand the current state of the company's cybersecurity maturity and its weak points. An estimate of the potential cost of the identified problems is also provided.

Education

2009 - 2013

Ph.D. in Computer Science

Gdansk University of Technology - Gdańsk, Poland

1999 - 2001

Master of Engineering Degree in Computer Science

Wroclaw University of Technology - Wrocław, Poland

1996 - 1999

Bachelor of Engineering Degree in Computer Science

Wroclaw University of Technology - Wrocław, Poland

FEBRUARY 2018 - PRESENT

Sequence Models

Coursera

FEBRUARY 2018 - PRESENT

Deep Learning Specialization

Coursera

OCTOBER 2017 - PRESENT

Convolutional Neural Networks

Coursera

SEPTEMBER 2017 - PRESENT

Structuring Machine Learning Projects

Coursera

SEPTEMBER 2017 - PRESENT

Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization

Coursera

AUGUST 2017 - PRESENT

Neural Networks and Deep Learning

Coursera

FEBRUARY 2011 - PRESENT

Oracle Certified Professional, Java SE 5 Programmer

Oracle

Libraries/APIs

Natural Language Toolkit (NLTK), OWL API, TensorFlow, Scikit-learn, Keras, NumPy, Pandas, PyTorch, PySpark, SymPy, SciPy

Tools

Protégé, SikuliX, Microsoft Visual Studio, Git, Jira, OpenLink Virtuoso, Apache Solr

Languages

OWL, RDF, SPARQL, R, SQL, C++, Java, C#, Python, Semantic Web Rule Language (SWRL), JavaScript, T-SQL (Transact-SQL), UML, XML

Frameworks

Apache Jena, Ontology Framework, TinkerPop, .NET

Paradigms

Anomaly Detection, BPMN, Scrum

Platforms

Azure, Amazon EC2, Jupyter Notebook, Amazon Web Services (AWS), RStudio, Azure AI Studio

Storage

Cassandra, Titan Graph, Oracle SQL, MySQL

Other

Data Science, WordNet, Genetic Algorithms, Natural Language Processing (NLP), Machine Learning, Generative Pre-trained Transformers (GPT), Minimum Viable Product (MVP), Team Leadership, Artificial Intelligence (AI), Large Language Models (LLMs), Generative Artificial Intelligence (GenAI), Recurrent Neural Networks (RNNs), Deep Learning, Classification Algorithms, Regression Modeling, Clustering Algorithms, Bayesian Inference & Modeling, Logistic Regression, Decision Trees, Random Forests, Markov Model, Ensemble Methods, Evolutionary Algorithms, Sesame, Data Visualization, Scalable Architecture, Time Series Analysis, Principal Component Analysis (PCA), Simple Knowledge Organization System (SKOS), Embedded Systems, Schema.org
