Petar Pavlovic, Developer in Zagreb, Croatia

Petar Pavlovic

Verified Expert in Engineering

Bio

With over eight years of research and development on industrial machine learning problems, Petar has experience implementing state-of-the-art models for image detection, classification, and segmentation. When data was not labeled, he built a labeling tool and an organized labeling pipeline; when data was insufficient, he gathered more and developed a new augmentation algorithm; and when an architecture underperformed, he found academic papers with alternative approaches.

Portfolio

Nokia - Bell Labs
Python, PyTorch, TensorFlow, Kubernetes, Computer Vision...
Visage Technologies
C++17, C, Gerrit, Git, Computer Vision, Machine Learning, Deep Learning...
Gideon Brothers
Convolutional Neural Networks (CNNs), Git, Image Recognition, Neural Networks...

Experience

  • Machine Learning - 8 years
  • Python - 8 years
  • Deep Learning - 8 years
  • Computer Vision - 8 years
  • OpenCV - 7 years
  • TensorFlow - 6 years
  • Image Segmentation - 6 years
  • Object Detection - 4 years

Availability

Part-time

Preferred Environment

Linux, TensorFlow, PyTorch

The most amazing...

...neural networks I've built power real-time document scanning on mobile phones and are used by millions of clients all over the world.

Work Experience

AI Engineer

2024 - 2025
Nokia - Bell Labs
  • Built a custom visual marker detection system and developed a specialized object detector with custom regression. The integrated, production-ready system outperformed all prior approaches, as confirmed by client feedback.
  • Developed a synthetic data generator for visual markers.
  • Developed an image-matching system that identifies objects across different scenes using a custom similarity function, resulting in a highly accurate and efficient solution with strong potential for real-world applications.
Technologies: Python, PyTorch, TensorFlow, Kubernetes, Computer Vision, Computer Vision Algorithms, Natural Language Processing (NLP), MongoDB, MinIO, Docker, Pandas, NumPy

Team Lead

2020 - 2023
Visage Technologies
  • Developed neural network-based light sword and sun glare detectors.
  • Created a small and efficient yet accurate deep-learning light-source detector.
  • Created a new and improved light source classifier.
  • Finished large-scale codebase handover successfully.
  • Managed the annotation process with the supplier and defined a next-generation annotation structure for the light-source detection task.
  • Worked on next-generation advanced driver-assistance systems.
Technologies: C++17, C, Gerrit, Git, Computer Vision, Machine Learning, Deep Learning, Deep Neural Networks (DNNs), Artificial Intelligence (AI), Data Science

Research Engineer

2019 - 2020
Gideon Brothers
  • Developed a depth estimation neural network with stereo video input in TensorFlow. This project included research and development of multiple state-of-the-art architectures, including Monodepth2, Struct2depth, Fast Deep Stereo, RobustMonoDE, and more.
  • Created the annotation web tool in Dash/Flask, used for depth annotations.
  • Implemented the neural network pipeline described in the paper "Fast Deep Stereo with 2D Convolutional Processing of Cost Signatures," using TensorFlow and OpenCV.
Technologies: Convolutional Neural Networks (CNNs), Git, Image Recognition, Neural Networks, Deep Neural Networks (DNNs), Image Segmentation, Artificial Intelligence (AI), OpenCV, Computer Vision, Deep Learning, Machine Learning, TensorFlow, Python, Docker, Dash, PyTorch, Jupyter

Research Engineer - OCR Specialist

2016 - 2019
Microblink
  • Developed an accurate and robust ID-1 card detector neural network that runs in real time on mobile phones, built with TensorFlow and OpenCV.
  • Developed an extremely small and accurate TensorFlow implementation of the neural network for card analysis, used for immediate user feedback.
  • Built an annotation tool for detecting blur in Dash/Flask, participated in the annotation process, and developed a robust neural network classifier in TensorFlow.
  • Explored and developed a face action recognizer using TensorFlow and a Visage Technologies face detector.
  • Researched Croatian ID card verification through hologram detection, using Caffe for training and Python, OpenCV, and GIMP for data augmentation.
Technologies: Convolutional Neural Networks (CNNs), Git, Image Recognition, Neural Networks, Deep Neural Networks (DNNs), Image Segmentation, Flask, Artificial Intelligence (AI), OpenCV, Computer Vision, Deep Learning, Machine Learning, TensorFlow, Python, Object Detection, Docker, Dash, PyTorch, Optical Character Recognition (OCR), Jupyter, Data Science

Junior Software Engineer

2015 - 2015
Creative Fields
  • Created a plugin interface in the cfSuite desktop application in C++.
  • Developed custom plugin creator in C++ used for the desktop cfSuite application.
  • Automated application testing procedures, used for finding bugs after updates.
Technologies: Git, Qt, C++

Experience

Tag Detector

This project focused on enhancing the detection and classification of visual markers (Tags) used across industries like robotics, drones, and logistics. These markers can be difficult to identify reliably in real-world conditions, prompting the need for a more robust solution.

The project began with a detailed analysis of existing systems, evaluating their strengths and weaknesses. Based on this research, I developed a custom solution designed to outperform current approaches. Key components included a synthetic data generator that produced realistic, varied training samples and a custom object detector with tailored regression to improve localization.

I then integrated the full pipeline—data generation, detection, and classification—into a production-ready system. The client later confirmed that the new solution outperformed all previous implementations, delivering higher accuracy and reliability.
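
As a rough illustration of the synthetic-data step, a minimal generator might render a binary marker pattern and drop it onto a random background with random illumination and noise. This NumPy sketch is purely illustrative; the actual generator, marker format, and augmentations are not described here, and the names below (`make_marker`, `synthesize_sample`) are hypothetical.

```python
import numpy as np

def make_marker(bits: np.ndarray, cell: int = 8) -> np.ndarray:
    """Render a binary marker grid as a grayscale image with a black border."""
    grid = np.pad(bits, 1, constant_values=0)        # quiet-zone border
    return np.kron(grid, np.ones((cell, cell))) * 255.0

def synthesize_sample(bits, rng, bg_size=64):
    """Paste a marker onto a random background with random gain and noise;
    return the image plus the ground-truth box for detector training."""
    marker = make_marker(bits)
    h, w = marker.shape
    bg = rng.uniform(0, 255, (bg_size, bg_size))
    y = rng.integers(0, bg_size - h + 1)
    x = rng.integers(0, bg_size - w + 1)
    bg[y:y + h, x:x + w] = marker * rng.uniform(0.6, 1.0)   # illumination
    noisy = np.clip(bg + rng.normal(0, 8, bg.shape), 0, 255)
    return noisy, (int(x), int(y), w, h)

rng = np.random.default_rng(0)
img, box = synthesize_sample(rng.integers(0, 2, (4, 4)), rng)
```

Each sample comes paired with its exact bounding box, which is the main appeal of synthetic data for detector training: labels are free and perfectly accurate.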

Object Matcher

This project involved building an image-matching system capable of identifying a specific object across various images, given a few reference examples. The goal was to develop a lightweight yet effective solution for matching objects in diverse scenes, which required both robust feature extraction and intelligent comparison methods.

The system's core leveraged fast, small neural networks for feature extraction due to their efficiency and solid performance on visual tasks. To match features between reference and candidate images, I implemented a custom feature-matching algorithm, allowing for fine-tuned control over match quality.

A key challenge was isolating the target object from cluttered or complex backgrounds. To solve this, I integrated a model similar to Meta’s Segment Anything model for object segmentation, which significantly improved the quality of the features extracted and, in turn, the overall matching accuracy.

The final system performed well and showed strong potential for future use in object tracking and visual search applications.
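
The custom similarity function itself is not described above, so as a hedged approximation, matching embedding vectors by cosine similarity with a Lowe-style ratio test could look like this (all names hypothetical):

```python
import numpy as np

def cosine_sim(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between two sets of feature vectors."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

def match_features(ref, cand, ratio=0.8):
    """Match each reference feature to its best candidate, keeping only
    matches that clearly beat the second best (ratio test on the
    distance 1 - similarity). Assumes at least two candidates."""
    sim = cosine_sim(ref, cand)
    matches = []
    for i, row in enumerate(sim):
        order = np.argsort(row)[::-1]
        best, second = order[0], order[1]
        d1, d2 = 1 - row[best], 1 - row[second]
        if d1 < ratio * d2:
            matches.append((i, int(best), float(row[best])))
    return matches
```

The ratio test is the "fine-tuned control" knob: lowering `ratio` keeps only unambiguous matches, trading recall for precision.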

Badminton Shuttle Tracker

The project aimed to improve the accuracy of a shuttlecock tracking system used in a popular badminton analytics product. While the product was already award-winning and trusted by professional athletes and academies, the tracking system had plateaued at around 85% accuracy, limiting its potential for delivering high-quality analytics.

My role began with a detailed analysis of the existing models and dataset to understand where improvements were needed. Based on these insights, I developed a plan that included retraining detection and tracking models with targeted adjustments, as well as expanding the dataset to improve model robustness.

To further boost performance, I implemented a post-processing pipeline that corrected and smoothed detections across both spatial dimensions. This helped fill in gaps and reduce errors, particularly during fast-paced action. Working closely with the client’s internal team, we successfully built a new generation of shuttlecock trackers, resulting in a noticeable improvement in accuracy and overall reliability.
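
In minimal form, the gap-filling and smoothing step could be linear interpolation over missing detections followed by a centered moving average; the production pipeline was more sophisticated, so the function below is a hypothetical sketch.

```python
import numpy as np

def postprocess_track(xs, ys, window=5):
    """Fill missing detections (NaNs) by linear interpolation over time,
    then smooth each coordinate with a centered moving average."""
    t = np.arange(len(xs))
    smoothed = []
    for v in (np.asarray(xs, float), np.asarray(ys, float)):
        ok = ~np.isnan(v)
        v = np.interp(t, t[ok], v[ok])               # fill detection gaps
        pad = window // 2
        padded = np.pad(v, pad, mode="edge")         # keep output length
        smoothed.append(np.convolve(padded, np.ones(window) / window, mode="valid"))
    return smoothed[0], smoothed[1]
```

Interpolating before smoothing matters: the moving average alone would propagate NaNs through every window that touches a missed detection.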

AI Expert for Poker Game App

https://www.deepstack.ai/
Developed a real-time poker AI using the Counterfactual Regret Minimization (CFR) algorithm. The project incorporated a machine learning model to improve decision-making beyond traditional CFR techniques and recreated the DeepStack approach.
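
At its core, CFR converts accumulated counterfactual regrets into a mixed strategy via regret matching. The single-decision sketch below illustrates that mechanism only; it is not DeepStack's full algorithm, and the function names are mine.

```python
import numpy as np

def regret_matching(cum_regret: np.ndarray) -> np.ndarray:
    """Play actions in proportion to their positive cumulative regret,
    or uniformly when no action has positive regret."""
    pos = np.maximum(cum_regret, 0.0)
    total = pos.sum()
    if total > 0:
        return pos / total
    return np.full(len(cum_regret), 1.0 / len(cum_regret))

def cfr_step(cum_regret: np.ndarray, action_values: np.ndarray):
    """One regret update at a single decision point: compare each action's
    counterfactual value to the value of the current mixed strategy."""
    strategy = regret_matching(cum_regret)
    node_value = strategy @ action_values
    cum_regret += action_values - node_value   # updated in place
    return strategy, node_value
```

Averaged over many iterations, these per-step strategies converge toward an equilibrium strategy, which is what makes CFR suitable for imperfect-information games like poker.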

Find Waldo Type AI Object Segmentation

Developed a versatile AI pipeline for object segmentation, inspired by the classic "Where's Waldo?" books. This pipeline utilizes deep learning techniques to identify and locate specific objects within an image automatically. Similar to how you search for Waldo in the busy illustrations, this system can be trained to detect various objects, regardless of background clutter or scene complexity.

Computer Vision Expert to Digitize Darts Game

https://www.goodtimestech.com.au/product/ultra-darts
Developed a real-time darts detection system using computer vision. This system analyzes video feeds from multiple cameras to identify dart trajectories and pinpoint their impact locations on the dartboard, enabling automated scoring and potentially offering game analysis features.
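
Once an impact point is localized on the board plane, automated scoring reduces to geometry. The sketch below assumes standard dartboard dimensions (inner bull radius 6.35 mm, outer bull 15.9 mm, triple ring 99-107 mm, double ring 162-170 mm) and is an illustration, not the system's actual scoring code.

```python
import math

# Dartboard sector numbers clockwise, starting from the top (12 o'clock).
SECTORS = [20, 1, 18, 4, 13, 6, 10, 15, 2, 17, 3, 19, 7, 16, 8, 11, 14, 9, 12, 5]

def score(x: float, y: float) -> int:
    """Score an impact point (mm, board-centered coordinates, y up)."""
    r = math.hypot(x, y)
    if r <= 6.35:
        return 50                              # inner bull
    if r <= 15.9:
        return 25                              # outer bull
    if r > 170.0:
        return 0                               # off the board
    # Angle measured clockwise from the top; each sector spans 18 degrees.
    angle = math.degrees(math.atan2(x, y)) % 360.0
    sector = SECTORS[int((angle + 9.0) / 18.0) % 20]
    if 99.0 <= r <= 107.0:
        return sector * 3                      # triple ring
    if 162.0 <= r <= 170.0:
        return sector * 2                      # double ring
    return sector
```

In practice the hard part is the vision side: calibrating each camera to the board plane so that the detected impact pixel maps to these board-centered millimeter coordinates.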

AI Expert for Healthcare Personal Assistant

Developed a novel algorithm for improving blood pressure estimation accuracy using smartphone cameras. This innovative approach leverages the power of computer vision to address limitations in existing mobile blood pressure monitoring solutions.

Full MVP Project

I built a custom object detector for a very specific environment and was in charge of the entire process: data gathering, setting up an annotation pipeline, managing the annotation process, developing the model, and exporting it to the iOS app.

The client didn't have any data whatsoever, so the first step was data scraping and data selection. Since the data was scraped from Google Images, there were many duplicates in the dataset. I created a small annotation tool to remove duplicates, then defined annotation instructions and set up the annotation process, leading a team of three annotators. Model development and export to the iOS platform were the final steps.
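
One common way to flag near-duplicates among scraped images is a perceptual hash; the actual tool's method isn't specified above, so this average-hash sketch is an assumption.

```python
import numpy as np

def average_hash(img: np.ndarray, size: int = 8) -> int:
    """Perceptual average hash: downscale to size x size by block averaging,
    then set one bit per cell that is brighter than the mean."""
    h, w = img.shape
    img = img[:h - h % size, :w - w % size]          # trim to multiples
    small = img.reshape(size, img.shape[0] // size,
                        size, img.shape[1] // size).mean(axis=(1, 3))
    bits = (small > small.mean()).flatten()
    return int("".join("1" if b else "0" for b in bits), 2)

def is_duplicate(a: np.ndarray, b: np.ndarray, max_hamming: int = 5) -> bool:
    """Two images are near-duplicates if their hashes differ in few bits."""
    return bin(average_hash(a) ^ average_hash(b)).count("1") <= max_hamming
```

Because the hash survives small crops, compression artifacts, and brightness shifts, it catches re-uploads of the same photo that byte-level comparison would miss.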

What was particularly interesting was the timeline: the project lasted two and a half months and was successfully delivered and well received at the demo with the end client.

Card Detector

Extracting information from ID-1 cards requires a detector, and this project was the first step in the card information extraction pipeline: a general, neural network-based ID-1 card detector. It started as my master's thesis and grew into full-scale research.

The neural network needed to be fast enough to run in real time on mid-range Android phones, under 1MB in size, and extremely accurate, and it needed to work on all ID-1 cards worldwide. It wasn't clear whether this was even possible.
Given the project's strict constraints, and with no existing recipe to follow, I proceeded gradually. I started by meeting the accuracy goal without worrying about size or inference speed, both to get a feel for the problem and to see what accuracy was achievable.

Many different approaches were explored, from detectors to segmentation. I arrived at multiple solutions that satisfied all but one criterion, and I often needed to question my assumptions, which led to several fresh starts during the project. TensorFlow and PyTorch were used for the detection problem, alongside OpenCV for data augmentation.

In the end, the goal was achieved, and in mid-2019, the detector went into production.

Depth Estimation

Robust depth estimation from cameras is needed to push autonomous forklift robots further. Depth estimation is known to be quite a challenging machine learning problem. The project aimed to find an accurate, robust, and fast neural network-based solution for estimating depth. Traditional depth estimation algorithms work well in certain environments, but industrial settings cover a much wider distribution.

The difficulty of gathering ground truth makes the problem even more complex.
A dedicated annotation tool was developed in Dash/Flask for the supervised approach, and about 40,000 images were annotated.

Several approaches were tried, from supervised to self-supervised methods. The self-supervised approach proved superior because ground truth was so hard to gather, and I managed to get accurate, detailed depth from the self-supervised architecture. That output was later used to train a smaller neural network for the final solution. TensorFlow and PyTorch were used for training, alongside OpenCV for image manipulation and Flask/Dash for the annotation web tool.
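
The self-supervision signal in such stereo setups typically comes from a photometric reconstruction loss: warp one view into the other using the predicted disparity and penalize the difference. The toy grayscale NumPy sketch below shows only that idea; the production models used full CNN architectures and more elaborate losses.

```python
import numpy as np

def warp_right_to_left(right: np.ndarray, disparity: np.ndarray) -> np.ndarray:
    """Reconstruct the left view by sampling the right image at x - d(x),
    with per-row linear interpolation."""
    h, w = right.shape
    xs = np.arange(w)
    out = np.empty_like(right, dtype=float)
    for r in range(h):
        src = np.clip(xs - disparity[r], 0, w - 1)
        out[r] = np.interp(src, xs, right[r])
    return out

def photometric_loss(left, right, disparity):
    """Mean absolute error between the real left image and the left view
    reconstructed from the right image: the self-supervision signal."""
    return float(np.abs(left - warp_right_to_left(right, disparity)).mean())
```

A correct disparity map drives this loss toward zero, so the network can learn depth from raw stereo pairs with no human annotation at all, which is exactly why the approach wins when ground truth is scarce.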

DeepCluster

https://github.com/samo1petar/deepcluster
The client needed a machine learning developer to help finish the MVP. The project involved adapting a DeepCluster codebase to a customer-specific environment and running it on the specified dataset. The project is available at the URL above.

Shapes Detection MVP

A PowerPoint shape detector model, for which I developed segmentation models that can detect various shapes. As a prerequisite for training the model, I developed a model training codebase that supports the TensorFlow, PyTorch, and PyTorch Lightning libraries.

Blur Detector

The problem approached in this project was to determine whether an image was sharp enough to detect letters; in other words, whether or not the image was blurry.

My first approach was to take sharp images, artificially blur some of them in OpenCV, and train the classifier. I used median blur, average blur, Gaussian blur, and motion blur.

Input images were downscaled to 128x128 pixels after blurring. Once downscaled, blurred images were indistinguishable from non-blurred images to the naked eye.
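
The artificial-blur step can be illustrated with plain NumPy convolutions; the actual project used OpenCV's built-in filters, so this is just a self-contained approximation of two of the blur types.

```python
import numpy as np

def _convolve_rows(img: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """1D horizontal convolution with edge padding, applied row by row."""
    pad = len(kernel) // 2
    padded = np.pad(img, ((0, 0), (pad, pad)), mode="edge")
    return np.apply_along_axis(lambda r: np.convolve(r, kernel, "valid"), 1, padded)

def average_blur(img: np.ndarray, k: int = 5) -> np.ndarray:
    """Separable box blur: a horizontal pass, then a vertical pass."""
    kernel = np.ones(k) / k
    return _convolve_rows(_convolve_rows(img, kernel).T, kernel).T

def motion_blur(img: np.ndarray, k: int = 9) -> np.ndarray:
    """Horizontal motion blur: averaging along the motion direction only."""
    return _convolve_rows(img, np.ones(k) / k)
```

Randomizing the kernel sizes and directions per image is what makes such an augmented set look varied, though, as described below, the classifier can still latch onto the synthetic blur's statistical fingerprint.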

The network had over 99.99% accuracy on the test set.

I decided to annotate images and create a realistic test set to be sure. I created an annotation tool in OpenCV and Python and organized the annotation process. About 33,000 images were annotated. I discovered that the network had significantly lower accuracy than in the first test, at 80%. This was unexpected; it meant that the network had found artificial blurring patterns, even when the blur parameters were randomly applied.

Training on real blurred images fixed the problem, and accuracy reached 97%.

Hologram Detector

The hologram detector is a Croatian ID validation project using neural networks. It's a binary classifier that determines whether a Croatian ID card is genuine or fake. The decision is made by detecting the different patterns of the hologram. Since I had a card detector and the Croatian ID hologram is always in the same place, I could focus on one part of the image.

I created a synthetic dataset as described below:

• All seven hologram patterns were photographed in high resolution.
• Holograms were drawn using GIMP.
• Mug shots were used to create a realistic dataset, composited with the GIMP Python shell and the magic wand tool. On the Croatian ID, the hologram overlaps the face image, so faces were used to come as close to the original images as possible.
• Images were assembled by placing the hologram on top of the faces with random backgrounds.
• Noise was applied last.
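
The compositing and noise steps above can be condensed into a small alpha-blending sketch; the alpha maps, noise level, and names here are illustrative assumptions, not the original GIMP pipeline.

```python
import numpy as np

def composite(face: np.ndarray, hologram: np.ndarray,
              alpha: np.ndarray, rng) -> np.ndarray:
    """Blend a hologram pattern over a face crop with per-pixel alpha,
    then add sensor-like Gaussian noise, as in the synthetic pipeline."""
    blended = alpha * hologram + (1.0 - alpha) * face
    noisy = blended + rng.normal(0.0, 4.0, blended.shape)
    return np.clip(noisy, 0.0, 255.0)
```

Varying the alpha map per sample simulates how strongly the hologram catches the light, which is the cue the classifier ultimately has to learn.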

Neural network classifiers were trained on synthetic images with Caffe and OpenCV. The network passed all video tests and proved the project was a success.

Education

2015 - 2017

Master's Degree in Computer Science

Faculty of Electrical Engineering and Computing, University of Zagreb - Zagreb, Croatia

2011 - 2015

Bachelor's Degree in Computer Science

Faculty of Electrical Engineering and Computing, University of Zagreb - Zagreb, Croatia

Skills

Libraries/APIs

OpenCV, PyTorch, TensorFlow, NumPy, TensorFlow Deep Learning Library (TFLearn), Pandas

Tools

Git, Jupyter, Gerrit, Confluence

Languages

Python, C++, C++17, C, SQL

Platforms

Docker, Linux, Amazon Web Services (AWS), Mobile, Kubernetes

Frameworks

Qt, Flask

Storage

MongoDB

Other

Convolutional Neural Networks (CNNs), Artificial Intelligence (AI), Image Segmentation, Deep Neural Networks (DNNs), Computer Vision, Machine Learning, Deep Learning, Object Detection, Optical Character Recognition (OCR), Computer Vision Algorithms, Data Science, Edge Computing, Image Processing, Image Recognition, Neural Networks, Clustering, Annotation Processors, Data Scraping, OpenAI, OpenAI GPT-3 API, OpenAI GPT-4 API, Point Clouds, Video Processing, Robotics, Dash, Minimum Viable Product (MVP), Startups, Medical Applications, Signal Processing, Health, Models, Natural Language Processing (NLP), MinIO
