
Petar Pavlovic
Verified Expert in Engineering
Machine Learning Developer
Zagreb, Croatia
Toptal member since May 18, 2020
With over eight years of research and development on industrial machine learning problems, Petar has experience in implementing state-of-the-art models in image detection, classification, and segmentation. When data was not labeled, he created a labeling tool and an organized labeling pipeline; when data was insufficient, he got more data and developed a new augmentation algorithm; and when the architecture was not performing well enough, he found academic papers with different approaches.
Experience
- Machine Learning - 8 years
- Python - 8 years
- Deep Learning - 8 years
- Computer Vision - 8 years
- OpenCV - 7 years
- TensorFlow - 6 years
- Image Segmentation - 6 years
- Object Detection - 4 years
Preferred Environment
Linux, TensorFlow, PyTorch
The most amazing...
...neural networks I've built are used for real-time document scanning on mobile phones, used by millions of clients all over the world.
Work Experience
AI Engineer
Nokia - Bell Labs
- Built a custom visual marker detection system and developed a specialized object detector with custom regression. The integrated, production-ready system outperformed all prior approaches, as confirmed by client feedback.
- Developed a synthetic data generator for visual markers.
- Developed an image-matching system that identifies objects across different scenes using a custom similarity function, resulting in a highly accurate and efficient solution with strong potential for real-world applications.
Team Lead
Visage Technologies
- Developed light sword and sun glare neural network-based detectors.
- Created a small, efficient, and accurate deep learning light-source detector.
- Created a new and improved light source classifier.
- Successfully completed a large-scale codebase handover.
- Managed the annotation process with the supplier and defined a new-generation annotation structure for the light source mission.
- Worked on next-generation advanced driver-assistance systems.
Research Engineer
Gideon Brothers
- Developed a depth estimation neural network with stereo video input in TensorFlow. This project included research and development of multiple state-of-the-art architectures, including Monodepth2, Struct2depth, Fast Deep Stereo, and RobustMonoDE.
- Created the annotation web tool in Dash/Flask, used for depth annotations.
- Implemented the neural network pipeline described in the paper "Fast Deep Stereo with 2D Convolutional Processing of Cost Signatures" using TensorFlow and OpenCV.
Research Engineer - OCR Specialist
Microblink
- Developed an accurate and robust ID-1 card detector neural network that runs in real time on mobile phones, using TensorFlow and OpenCV.
- Developed an extremely small and accurate TensorFlow implementation of the neural network for card analysis, used for immediate user feedback.
- Built an annotation tool for detecting blur in Dash/Flask, participated in the annotation process, and developed a robust neural network classifier in TensorFlow.
- Explored and developed a face action recognizer using TensorFlow and a Visage Technologies face detector.
- Researched Croatian ID card verification through hologram detection, using Caffe for training and Python, OpenCV, and GIMP for data augmentation.
Junior Software Engineer
Creative Fields
- Created a plugin interface in the cfSuite desktop application in C++.
- Developed a custom plugin creator in C++ for the cfSuite desktop application.
- Automated application testing procedures, used for finding bugs after updates.
Experience
Tag Detector
The project began with a detailed analysis of existing systems, evaluating their strengths and weaknesses. Based on this research, I developed a custom solution designed to outperform current approaches. Key components included a synthetic data generator that produced realistic, varied training samples and a custom object detector with tailored regression to improve localization.
I then integrated the full pipeline—data generation, detection, and classification—into a production-ready system. The client later confirmed that the new solution outperformed all previous implementations, delivering higher accuracy and reliability.
Object Matcher
The system's core used small, fast neural networks for feature extraction because of their efficiency and solid performance on visual tasks. To match features between reference and candidate images, I implemented a custom feature-matching algorithm, which allowed fine-grained control over match quality.
A key challenge was isolating the target object from cluttered or complex backgrounds. To solve this, I integrated a model similar to Meta’s Segment Anything model for object segmentation, which significantly improved the quality of the features extracted and, in turn, the overall matching accuracy.
The final system performed well and showed strong potential for future use in object tracking and visual search applications.
Badminton Shuttle Tracker
My role began with a detailed analysis of the existing models and dataset to understand where improvements were needed. Based on these insights, I developed a plan that included retraining detection and tracking models with targeted adjustments, as well as expanding the dataset to improve model robustness.
To further boost performance, I implemented a post-processing pipeline that corrected and smoothed detections across both spatial dimensions. This helped fill in gaps and reduce errors, particularly during fast-paced action. Working closely with the client's internal team, I helped build a new generation of shuttlecock trackers, with a noticeable improvement in accuracy and overall reliability.
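The gap filling and smoothing described above can be sketched roughly as follows. This is a minimal illustration, not the project's pipeline: it assumes detections arrive as one optional `(x, y)` point per frame, and the function names, the linear interpolation, and the moving-average window are all hypothetical choices.

```python
from typing import List, Optional, Tuple

Point = Optional[Tuple[float, float]]  # None marks a frame with no detection


def fill_gaps(track: List[Point]) -> List[Point]:
    """Linearly interpolate missing detections between two known points."""
    filled = list(track)
    known = [i for i, p in enumerate(track) if p is not None]
    for a, b in zip(known, known[1:]):
        (xa, ya), (xb, yb) = track[a], track[b]
        for i in range(a + 1, b):
            t = (i - a) / (b - a)
            filled[i] = (xa + t * (xb - xa), ya + t * (yb - ya))
    return filled  # leading/trailing gaps stay None: nothing to interpolate from


def smooth(track: List[Point], window: int = 3) -> List[Point]:
    """Centered moving average applied to x and y independently."""
    half = window // 2
    out: List[Point] = []
    for i, p in enumerate(track):
        if p is None:
            out.append(None)
            continue
        nbrs = [q for q in track[max(0, i - half): i + half + 1] if q is not None]
        out.append((sum(q[0] for q in nbrs) / len(nbrs),
                    sum(q[1] for q in nbrs) / len(nbrs)))
    return out


# Hypothetical track with two missed frames in the middle.
detections = [(0.0, 0.0), None, None, (3.0, 6.0), (4.0, 8.0)]
filled = fill_gaps(detections)
smoothed = smooth(filled)
```

A real shuttlecock trajectory is far from linear during fast action, so a production system would likely need a motion model rather than straight-line interpolation.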
AI Expert for Poker Game App
https://www.deepstack.ai/
Find Waldo Type AI Object Segmentation
Computer Vision Expert to Digitize Darts Game
https://www.goodtimestech.com.au/product/ultra-darts
AI Expert for Healthcare Personal Assistant
Full MVP Project
The client didn't have any data whatsoever. The first step was data scraping and data selection. Because the data was scraped from Google Images, the dataset contained many duplicates. I created a small annotation tool to remove them, then defined annotation instructions and set up the annotation process, leading a team of three annotators. Model development and export to the iOS platform were the final steps.
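As an illustration of the duplicate problem, one common first pass is to group files whose bytes are identical before any manual review. The sketch below is an assumption about how such a pre-filter could look, not a description of the annotation tool itself; the function name and the sample file names are hypothetical.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def group_exact_duplicates(files: Dict[str, bytes]) -> List[List[str]]:
    """Group file names whose byte content is identical (same SHA-256 digest)."""
    by_digest = defaultdict(list)
    for name, content in files.items():
        by_digest[hashlib.sha256(content).hexdigest()].append(name)
    # Keep only digests shared by more than one file.
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


# Hypothetical scraped files, represented by their raw bytes.
scraped = {
    "cat_01.jpg": b"\xff\xd8AAA",
    "cat_02.jpg": b"\xff\xd8BBB",
    "cat_07.jpg": b"\xff\xd8AAA",
}
duplicate_groups = group_exact_duplicates(scraped)
```

Hashing only catches byte-identical copies. Resized or recompressed versions of the same picture slip through, which is where a manual review tool remains useful.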
What was particularly interesting was the timeline. The project lasted for two and a half months and was successfully delivered and well received on the demo with the end client.
Card Detector
The neural network needed to be fast enough to run on mid-range Android phones in real-time, under 1MB in size, and extremely accurate, and it needed to work on all ID-1 cards worldwide. It wasn't clear whether it was even possible.
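To give a feel for how tight a 1MB limit is, the arithmetic below converts a size budget into a weight count. It is a back-of-the-envelope sketch under stated assumptions: a decimal megabyte, no file-format overhead, and hypothetical helper names. The actual network's layers and weight precision are not given in the source.

```python
def conv2d_params(in_ch: int, out_ch: int, kernel: int, bias: bool = True) -> int:
    """Weights in a standard 2D convolution with a square kernel."""
    return in_ch * out_ch * kernel * kernel + (out_ch if bias else 0)


def max_params(size_bytes: int, bytes_per_weight: int) -> int:
    """Upper bound on weight count, ignoring any file-format overhead."""
    return size_bytes // bytes_per_weight


ONE_MB = 1_000_000  # assumption: decimal megabyte

budget_fp32 = max_params(ONE_MB, 4)  # 32-bit floats: 250,000 weights
budget_int8 = max_params(ONE_MB, 1)  # 8-bit quantized: 1,000,000 weights

# A single 3x3 convolution from 32 to 64 channels already uses 18,496 weights.
layer = conv2d_params(32, 64, 3)
layers_that_fit = budget_fp32 // layer  # 13 such layers at float32
```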
Given the project's strict constraints and the lack of an established recipe, I proceeded gradually. I started by meeting the accuracy goal without worrying about size or inference speed, both to get a feel for the problem and to see what accuracy was achievable.
I explored many approaches, from detectors to segmentation, and arrived at multiple solutions that satisfied all but one criterion. I often needed to question my assumptions, which led to several fresh starts during the project. TensorFlow and PyTorch were used for this detection problem, alongside OpenCV for data augmentation.
In the end, the goal was achieved, and in mid-2019, the detector went into production.
Depth Estimation
Ground truth for depth is extremely hard to gather, which makes the problem even more complex.
For the supervised approach, I developed a dedicated annotation tool in Dash/Flask, which was used to annotate about 40,000 images.
Several approaches were tried, from supervised to self-supervised methods. Because ground truth was so hard to gather, the self-supervised approach proved superior, and I obtained accurate, finely detailed depth from a self-supervised architecture. That output was later used to train a smaller neural network for the final solution. TensorFlow and PyTorch were used for training, alongside OpenCV for image manipulation and Flask/Dash for the annotation web tool.
DeepCluster
https://github.com/samo1petar/deepcluster
Shapes Detection MVP
Blur Detector
My first approach was to take sharp images, artificially blur some of them in OpenCV, and train the classifier. I used median blur, average blur, Gaussian blur, and motion blur.
Input images were downscaled to 128x128 pixels after blurring. Once downscaled, blurred images were indistinguishable from non-blurred images by eye.
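The blur-then-downscale step can be illustrated with a dependency-free sketch. It is not the original code: the project used OpenCV, where `cv2.blur`, `cv2.GaussianBlur`, `cv2.medianBlur`, and `cv2.filter2D` would do this work. The sketch covers only average and horizontal motion blur, and the helper names, edge handling, and toy 8x8 image are assumptions.

```python
from typing import List

Image = List[List[float]]   # grayscale, values in [0, 1]
Kernel = List[List[float]]  # square, odd-sized


def box_kernel(size: int) -> Kernel:
    """Average blur: every weight equal."""
    w = 1.0 / (size * size)
    return [[w] * size for _ in range(size)]


def motion_kernel(size: int) -> Kernel:
    """Horizontal motion blur: uniform weights on the middle row only."""
    k = [[0.0] * size for _ in range(size)]
    k[size // 2] = [1.0 / size] * size
    return k


def convolve(img: Image, kernel: Kernel) -> Image:
    """Apply a symmetric kernel, replicating edge pixels at the border."""
    h, w = len(img), len(img[0])
    r = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for ky, row in enumerate(kernel):
                for kx, weight in enumerate(row):
                    yy = min(max(y + ky - r, 0), h - 1)
                    xx = min(max(x + kx - r, 0), w - 1)
                    out[y][x] += weight * img[yy][xx]
    return out


def downscale(img: Image, factor: int) -> Image:
    """Shrink by an integer factor using block averaging."""
    h, w = len(img) // factor, len(img[0]) // factor
    return [[sum(img[y * factor + dy][x * factor + dx]
                 for dy in range(factor) for dx in range(factor)) / factor ** 2
             for x in range(w)] for y in range(h)]


def max_abs_diff(a: Image, b: Image) -> float:
    return max(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


# Toy 8x8 image with a sharp vertical edge in the middle.
sharp = [[0.0] * 4 + [1.0] * 4 for _ in range(8)]
blurred = convolve(sharp, motion_kernel(3))

diff_full = max_abs_diff(sharp, blurred)                              # about 1/3
diff_small = max_abs_diff(downscale(sharp, 4), downscale(blurred, 4)) # about 1/12
```

In this toy case, downscaling shrinks the largest pixel difference between the sharp and blurred image by a factor of four, which matches the observation that blur becomes hard to see after resizing.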
The network had over 99.99% accuracy on the test set.
To be sure, I decided to annotate images and create a realistic test set. I created an annotation tool in OpenCV and Python and organized the annotation process. About 33,000 images were annotated. On this set, the network's accuracy was only 80%, significantly lower than in the first test. This was unexpected; it meant that the network had learned the patterns of artificial blurring, even though the blur parameters were randomized.
Training on real blurred images fixed the problem, and accuracy reached 97%.
Hologram Detector
I created a synthetic dataset as described below:
• All seven hologram patterns were photographed in high resolution.
• Holograms were drawn using GIMP.
• Mug shots proved very useful for creating a realistic dataset with the GIMP Python shell and the magic wand tool. On the Croatian ID, the hologram overlaps the face image, so faces were used to come as close to the original images as possible.
• Images were glued together by placing the hologram on top of the faces with random backgrounds.
• The noise was applied next.
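The compositing steps listed above (hologram over face, random background, then noise) can be sketched as plain pixel arithmetic. This is a simplified illustration, not the GIMP-based workflow: it assumes small grayscale images stored as nested lists, a constant blend factor, and Gaussian noise, and all function names and parameter values are hypothetical.

```python
import random
from typing import List

Image = List[List[float]]  # grayscale, values in [0, 1]


def overlay(base: Image, layer: Image, alpha: float) -> Image:
    """Alpha-blend `layer` over `base`: out = alpha * layer + (1 - alpha) * base."""
    return [[alpha * l + (1.0 - alpha) * b for b, l in zip(brow, lrow)]
            for brow, lrow in zip(base, layer)]


def paste(background: Image, patch: Image, top: int, left: int) -> Image:
    """Copy `patch` into a copy of `background` at the given position."""
    out = [row[:] for row in background]
    for y, row in enumerate(patch):
        out[top + y][left:left + len(row)] = row
    return out


def add_noise(img: Image, sigma: float, rng: random.Random) -> Image:
    """Add Gaussian noise and clamp the result back to [0, 1]."""
    return [[min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in row]
            for row in img]


rng = random.Random(0)  # fixed seed so the sketch is reproducible

face = [[0.5] * 4 for _ in range(4)]        # stand-in for a face crop
hologram = [[1.0] * 4 for _ in range(4)]    # stand-in for a hologram pattern
background = [[rng.random() for _ in range(8)] for _ in range(8)]

blended = overlay(face, hologram, alpha=0.3)  # each pixel: 0.3*1.0 + 0.7*0.5
sample = add_noise(paste(background, blended, top=2, left=2), sigma=0.02, rng=rng)
```

Varying the blend factor, paste position, background, and noise level per sample is what would turn a handful of photographed patterns into a large synthetic training set.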
Neural network classifiers were trained on synthetic images with Caffe and OpenCV. The network passed all video tests and proved the project was a success.
Education
Master's Degree in Computer Science
Faculty of Electrical Engineering and Computing, University of Zagreb - Zagreb, Croatia
Bachelor's Degree in Computer Science
Faculty of Electrical Engineering and Computing, University of Zagreb - Zagreb, Croatia
Skills
Libraries/APIs
OpenCV, PyTorch, TensorFlow, NumPy, TensorFlow Deep Learning Library (TFLearn), Pandas
Tools
Git, Jupyter, Gerrit, Confluence
Languages
Python, C++, C++17, C, SQL
Platforms
Docker, Linux, Amazon Web Services (AWS), Mobile, Kubernetes
Frameworks
Qt, Flask
Storage
MongoDB
Other
Convolutional Neural Networks (CNNs), Artificial Intelligence (AI), Image Segmentation, Deep Neural Networks (DNNs), Computer Vision, Machine Learning, Deep Learning, Object Detection, Optical Character Recognition (OCR), Computer Vision Algorithms, Data Science, Edge Computing, Image Processing, Image Recognition, Neural Networks, Clustering, Annotation Processors, Data Scraping, OpenAI, OpenAI GPT-3 API, OpenAI GPT-4 API, Point Clouds, Video Processing, Robotics, Dash, Minimum Viable Product (MVP), Startups, Medical Applications, Signal Processing, Health, Models, Natural Language Processing (NLP), MinIO