Petar Pavlovic

Verified Expert in Engineering

AI Engineer and Developer

Zagreb, Croatia

Toptal member since May 18, 2020

Expertise

Artificial Intelligence, Computer Vision, Machine Learning, Deep Learning, LLM, Data Scraping, RAG, NumPy, Python, OpenCV, TensorFlow, PyTorch, AWS, Neural Network

Bio

Petar is an AI engineer with 10+ years of production ML experience, focusing on LLM applications, RAG systems, and AI agents tailored for enterprise clients. With a background in computer vision and industrial R&D at Nokia Bell Labs, Qualcomm, Microblink, and Gideon Brothers, he excels in building end-to-end infrastructure from prototype to production.

Portfolio

GoalGetter

Claude API, AI Prompts, FastAPI, WebSockets, MongoDB, Redis, Docker, SendGrid...

SpacePixel

LangChain, FAISS, Hugging Face, Gradio, Document Processing, OpenAI API...

CorVita Medical

OpenAI API, OpenAI SDK, OpenAI, LangGraph, FAISS, Beautiful Soup, FastAPI...

Experience

Computer Vision - 10 years
Deep Learning - 8 years
OpenCV - 7 years
Image Segmentation - 6 years
Object Detection - 4 years
Amazon Web Services (AWS) - 4 years
Visual Language Models (VLMs) - 2 years
AI Agents - 2 years

Preferred Environment

LangGraph, Amazon Web Services (AWS), OpenAI SDK, Anthropic, PyTorch

The most amazing...

...AI solutions I've built range from neural networks running on millions of phones to LLM systems that help medical professionals navigate research.

Work Experience

AI Engineer

2025 - 2026

GoalGetter

Architected a real-time AI coaching system with WebSocket-powered chat, enabling seamless conversation flow between users and a custom persona AI coach built on the Claude API.
Implemented a two-phase engagement workflow with gated chat access, transitioning users from unlimited goal-setting sessions to structured accountability meetings on fixed intervals.
Built an automated scheduling and notification system using Celery background tasks, Google Calendar integration, and SendGrid email reminders to keep users engaged with their goals.
Delivered a production-ready platform with JWT/OAuth authentication, Redis-backed rate limiting, structured logging, and containerized deployment via Docker Compose.

Technologies: Claude API, AI Prompts, FastAPI, WebSockets, MongoDB, Redis, Docker, SendGrid, Google Calendar API, OpenAI API, OpenAI SDK, Anthropic, Claude, Generative Artificial Intelligence (GenAI), AI Agents, APIs

LLM Engineer

2025 - 2025

SpacePixel

Built a personal knowledge base RAG system that accurately represents professional experience, projects, and technical skills through natural conversation.
Deployed a public-facing AI assistant on Hugging Face Spaces, providing always-available access for recruiters and potential clients to explore clients' backgrounds.
Implemented grounded response generation, ensuring the chatbot provides accurate, verifiable information about work history rather than hallucinated details.

Technologies: LangChain, FAISS, Hugging Face, Gradio, Document Processing, OpenAI API, Generative Artificial Intelligence (GenAI), AI Agents

AI Engineer

2025 - 2025

CorVita Medical

Developed an automated research ingestion pipeline that continuously scrapes and indexes new cardiology publications from user-defined subfields, keeping the knowledge base current without manual intervention.
Built a semantic search and cross-referencing system across a large medical corpus, enabling clinicians to quickly surface relevant information from textbooks and papers in a single query.
Implemented user data analysis workflows that allow medical professionals to upload proprietary datasets and receive automated, data-driven insights tailored to their research questions.

Technologies: OpenAI API, OpenAI SDK, OpenAI, LangGraph, FAISS, Beautiful Soup, FastAPI, Docker, Amazon Web Services (AWS), API Integration, Artificial Intelligence (AI), Generative Artificial Intelligence (GenAI), AI Agents

Python Computer Vision Machine Learning Developer

2025 - 2025

Global Gains, Inc.

Developed a pipeline for player identification and tracking.
Built a basketball tracker and a basketball rim tracker.
Created a comprehensive document describing the next steps and the work needed to develop a reliable system, both with and without VLMs.

Technologies: Computer Vision, Python, PyTorch, YOLOv5, DeepSORT, OpenCV, NumPy, Machine Learning, YOLOv8, ByteTrack, Large Language Models (LLMs), Visual Language Models (VLMs), Generative Artificial Intelligence (GenAI), APIs

Senior Computer Vision Specialist

2025 - 2025

Exelion IP PTY LTD

Developed a floor plan symbol extraction method based on object detection.
Annotated data for floor plan symbol evaluation and training.
Created a comprehensive analysis and report of further approaches, including smart feature extraction and overfitting.

Technologies: Computer Vision, Machine Learning, Amazon Web Services (AWS), Artificial Intelligence (AI), Retrieval-augmented Generation (RAG), Optical Character Recognition (OCR), Open-source LLMs, Visual Language Models (VLMs), Generative Artificial Intelligence (GenAI)

AI Engineer

2024 - 2025

Nokia - Bell Labs

Built a custom visual marker detection system and developed a specialized object detector with custom regression. The integrated, production-ready system outperformed all prior approaches, as confirmed by client feedback.
Developed a synthetic data generator for visual markers.
Developed an image-matching system that identifies objects across different scenes using a custom similarity function, resulting in a highly accurate and efficient solution with strong potential for real-world applications.

Technologies: Python, PyTorch, TensorFlow, Kubernetes, Computer Vision, Computer Vision Algorithms, Natural Language Processing (NLP), MongoDB, MinIO, Docker, Pandas, NumPy

Team Lead

2020 - 2023

Visage Technologies

Developed light sword and sun glare neural network-based detectors.
Created a small, efficient, yet accurate deep learning light source detector.
Created a new and improved light source classifier.
Successfully completed a large-scale codebase handover.
Managed the annotation process with the supplier and defined a new-generation annotation structure for the light source mission.
Worked on next-generation advanced driver-assistance systems.

Technologies: C++17, C, Gerrit, Git, Computer Vision, Machine Learning, Deep Learning, Deep Neural Networks (DNNs), Artificial Intelligence (AI), Data Science

Research Engineer

2019 - 2020

Gideon Brothers

Developed a depth estimation neural network with stereo video input in TensorFlow. This project included research and development of multiple state-of-the-art architectures, including Monodepth2, Struct2depth, and Fast Deep Stereo, among others.
Created the annotation web tool in Dash/Flask, used for depth annotations.
Implemented the neural network pipeline described in the paper "Fast Deep Stereo with 2D Convolutional Processing of Cost Signatures" using TensorFlow and OpenCV.

Technologies: Convolutional Neural Networks (CNNs), Git, Image Recognition, Neural Networks, Deep Neural Networks (DNNs), Image Segmentation, Artificial Intelligence (AI), OpenCV, Computer Vision, Deep Learning, Machine Learning, TensorFlow, Python, Docker, Dash, PyTorch, Jupyter

Research Engineer - OCR Specialist

2016 - 2019

Microblink

Developed an accurate and robust neural network detector for ID-1 cards that runs in real time on mobile phones, built with TensorFlow and OpenCV.
Developed an extremely small and accurate TensorFlow implementation of the neural network for card analysis, used for immediate user feedback.
Built an annotation tool for detecting blur in Dash/Flask, participated in the annotation process, and developed a robust neural network classifier in TensorFlow.
Explored and developed a face action recognizer using TensorFlow and a Visage Technologies face detector.
Researched Croatian ID card verification through hologram detection, using Caffe for training and Python, OpenCV, and GIMP for data augmentation.

Technologies: Convolutional Neural Networks (CNNs), Git, Image Recognition, Neural Networks, Deep Neural Networks (DNNs), Image Segmentation, Flask, Artificial Intelligence (AI), OpenCV, Computer Vision, Deep Learning, Machine Learning, TensorFlow, Python, Object Detection, Docker, Dash, PyTorch, Optical Character Recognition (OCR), Jupyter, Data Science, PDF

Junior Software Engineer

2015 - 2015

Creative Fields

Created a plugin interface in the cfSuite desktop application in C++.
Developed a custom plugin creator in C++ for the cfSuite desktop application.
Automated application testing procedures to find bugs after updates.

Technologies: Git, Qt, C++

Experience

Tag Detector

This project focused on enhancing the detection and classification of visual markers (Tags) used across industries like robotics, drones, and logistics. These markers can be difficult to identify reliably in real-world conditions, prompting the need for a more robust solution.

The project began with a detailed analysis of existing systems, evaluating their strengths and weaknesses. Based on this research, I developed a custom solution designed to outperform current approaches. Key components included a synthetic data generator that produced realistic, varied training samples and a custom object detector with tailored regression to improve localization.

I then integrated the full pipeline—data generation, detection, and classification—into a production-ready system. The client later confirmed that the new solution outperformed all previous implementations, delivering higher accuracy and reliability.

Object Matcher

This project involved building an image-matching system capable of identifying a specific object across various images, given a few reference examples. The goal was to develop a lightweight yet effective solution for matching objects in diverse scenes, which required both robust feature extraction and intelligent comparison methods.

The system's core leveraged small, fast neural networks for feature extraction because of their efficiency and solid performance on visual tasks. To match features between reference and candidate images, I implemented a custom feature-matching algorithm, allowing for fine-tuned control over match quality.
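As an illustration only, one simple way to match feature vectors between reference and candidate images is mutual nearest-neighbor matching on cosine similarity. This is a sketch under assumptions, not the custom algorithm used in the project: the function name `mutual_matches`, the `min_similarity` threshold, and the choice of cosine similarity are all hypothetical.

```python
import numpy as np


def mutual_matches(ref_features, cand_features, min_similarity=0.8):
    """Match rows of two feature matrices by mutual nearest neighbor.

    ref_features: (R, D) array, cand_features: (C, D) array.
    Returns a list of (ref_index, cand_index, similarity) tuples.
    """
    ref = np.asarray(ref_features, dtype=float)
    cand = np.asarray(cand_features, dtype=float)
    # L2-normalize so the dot product equals cosine similarity.
    ref = ref / np.linalg.norm(ref, axis=1, keepdims=True)
    cand = cand / np.linalg.norm(cand, axis=1, keepdims=True)
    similarity = ref @ cand.T

    best_cand = similarity.argmax(axis=1)  # best candidate for each reference
    best_ref = similarity.argmax(axis=0)   # best reference for each candidate
    return [
        (i, int(j), float(similarity[i, j]))
        for i, j in enumerate(best_cand)
        # Keep a pair only if both sides pick each other and the match is strong.
        if best_ref[j] == i and similarity[i, j] >= min_similarity
    ]
```

Raising `min_similarity` trades recall for precision, which is the kind of control over match quality described above.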

A key challenge was isolating the target object from cluttered or complex backgrounds. To solve this, I integrated a model similar to Meta’s Segment Anything model for object segmentation, which significantly improved the quality of the features extracted and, in turn, the overall matching accuracy.

The final system performed well and showed strong potential for future use in object tracking and visual search applications.

Badminton Shuttle Tracker

The project aimed to improve the accuracy of a shuttlecock tracking system used in a popular badminton analytics product. While the product was already award-winning and trusted by professional athletes and academies, the tracking system had plateaued at around 85% accuracy, limiting its potential for delivering high-quality analytics.

My role began with a detailed analysis of the existing models and dataset to understand where improvements were needed. Based on these insights, I developed a plan that included retraining detection and tracking models with targeted adjustments, as well as expanding the dataset to improve model robustness.

To further boost performance, I implemented a post-processing pipeline that corrected and smoothed detections across both spatial dimensions. This helped fill in gaps and reduce errors, particularly during fast-paced action. Working closely with the client’s internal team, I helped build a new generation of shuttlecock trackers, resulting in a noticeable improvement in accuracy and overall reliability.
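A minimal sketch of this kind of post-processing, assuming detections arrive as one (x, y) position per frame with NaN marking missed frames. The linear interpolation, the moving-average filter, the function name `fill_and_smooth`, and the `window` parameter are illustrative assumptions, not the pipeline built for the client.

```python
import numpy as np


def fill_and_smooth(track, window=3):
    """Fill missed detections and smooth a 2D trajectory.

    track: (N, 2) sequence of x, y positions; NaN entries mark missed frames.
    window: odd moving-average length in frames.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd")
    track = np.asarray(track, dtype=float)
    frames = np.arange(len(track))
    kernel = np.ones(window) / window
    out = np.empty_like(track)
    for dim in range(track.shape[1]):  # handle x and y independently
        values = track[:, dim]
        valid = ~np.isnan(values)  # at least one valid detection is required
        # Fill gaps by linear interpolation between neighboring detections.
        filled = np.interp(frames, frames[valid], values[valid])
        # Edge padding keeps the smoothed output the same length as the input.
        padded = np.pad(filled, window // 2, mode="edge")
        out[:, dim] = np.convolve(padded, kernel, mode="valid")
    return out
```

A short window keeps fast direction changes intact, while a longer one suppresses more jitter at the cost of rounding off sharp turns.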

AI Expert for Poker Game App

Developed a real-time Poker AI using the Counterfactual Regret Minimization (CFR) algorithm. This project incorporated a machine learning model to improve decision-making beyond traditional CFR techniques. Also worked on recreating the DeepStack approach.
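For readers unfamiliar with CFR, its core building block is regret matching: actions are played in proportion to their accumulated positive regret. The sketch below shows that step for a single decision point. It is a textbook illustration, not code from the project, and the function names are hypothetical.

```python
def regret_matching(cumulative_regret):
    """Convert cumulative regrets into a strategy (one probability per action)."""
    positive = [max(r, 0.0) for r in cumulative_regret]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    # No action has positive regret yet: fall back to a uniform strategy.
    return [1.0 / len(cumulative_regret)] * len(cumulative_regret)


def update_regrets(cumulative_regret, action_utilities, strategy):
    """Add each action's regret relative to the strategy's expected utility."""
    expected = sum(p * u for p, u in zip(strategy, action_utilities))
    return [r + (u - expected) for r, u in zip(cumulative_regret, action_utilities)]
```

Full CFR applies these two steps at every information set of the game tree, weighting regrets by the probability of reaching each one.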

Find Waldo Type AI Object Segmentation

Developed a versatile AI pipeline for object segmentation, inspired by the classic "Where's Waldo?" books. This pipeline utilizes deep learning techniques to identify and locate specific objects within an image automatically. Similar to how you search for Waldo in the busy illustrations, this system can be trained to detect various objects, regardless of background clutter or scene complexity.

Computer Vision Expert to Digitize Darts Game

Developed a real-time darts detection system using computer vision. This system analyzes video feeds from multiple cameras to identify dart trajectories and pinpoint their impact locations on the dartboard, enabling automated scoring and potentially offering game analysis features.
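Once an impact location is known in board coordinates, automated scoring reduces to a polar lookup. The sketch below assumes the impact point has already been mapped from the camera views into millimeters relative to the bullseye (x to the right, y up) and uses standard dartboard dimensions. The function name `score_dart` and the coordinate convention are assumptions for illustration, not the system's actual interface.

```python
import math

# Sector values clockwise, starting from the top of the board.
SECTORS = [20, 1, 18, 4, 13, 6, 10, 15, 2, 17, 3, 19, 7, 16, 8, 11, 14, 9, 12, 5]


def score_dart(x_mm, y_mm):
    """Score an impact point given in millimeters from the board center."""
    r = math.hypot(x_mm, y_mm)
    if r <= 6.35:
        return 50  # inner bull
    if r <= 15.9:
        return 25  # outer bull
    if r > 170.0:
        return 0   # outside the double ring
    # Angle measured clockwise from the top; each sector spans 18 degrees
    # and the 20 sector is centered on the vertical axis.
    angle = math.degrees(math.atan2(x_mm, y_mm)) % 360.0
    base = SECTORS[int(((angle + 9.0) % 360.0) // 18.0)]
    if 99.0 <= r <= 107.0:
        return 3 * base  # treble ring
    if r >= 162.0:
        return 2 * base  # double ring
    return base
```

In practice, the hard part is estimating the impact point accurately from the video feeds; the lookup itself is deterministic.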

AI Expert for Healthcare Personal Assistant

Developed a novel algorithm for improving blood pressure estimation accuracy using smartphone cameras. This innovative approach leverages the power of computer vision to address limitations in existing mobile blood pressure monitoring solutions.

Full MVP Project

This project delivered a custom object detector for a very specific environment. I was in charge of the entire process around the object detector model, including data gathering, setting up an annotation pipeline, managing the annotation process, developing the model, and exporting the model to the iOS app.

The client didn't have any data whatsoever. The first step was data scraping and data selection. Because the data was scraped from Google Images, the dataset contained many duplicates. I created a small annotation tool to remove duplicates, then I defined annotation instructions and set up the annotation process. I led a team of three annotators. The model development and export to the iOS platform were the final steps.

What was particularly interesting was the timeline. The project lasted two and a half months, was delivered successfully, and was well received at the demo with the end client.

Card Detector

Extracting information from ID-1 cards starts with finding the card, so this project was intended as the first step of the card information extraction pipeline: a general ID-1 card detector based on a neural network. It started as my master's thesis and grew into full-scale research.

The neural network needed to be fast enough to run in real time on mid-range Android phones, under 1MB in size, and extremely accurate, and it needed to work on all ID-1 cards worldwide. It wasn't clear whether that was even possible.

Given the project's strict constraints and the lack of any established recipe, I proceeded gradually. I started by meeting the accuracy goal without worrying about size or inference speed, both to get a feeling for the problem and to see what accuracy was achievable.

Many different approaches were explored, from detectors to segmentation. I got multiple solutions that satisfied all but one criterion. I often needed to question my assumptions. This led to several fresh starts during the project. TensorFlow and PyTorch were used for this detection problem alongside OpenCV for data augmentation.

In the end, the goal was achieved, and in mid-2019, the detector went into production.

Depth Estimation

Robust depth estimation from cameras is needed to push autonomous forklift robots further, but depth estimation is known to be a challenging machine learning problem. The project aimed to find an accurate, robust, and fast neural network-based solution for estimating depth. Traditional depth estimation algorithms work well in certain environments, but industrial settings cover a much wider distribution of scenes.

Ground truth is extremely hard to gather, which makes the problem even more complex. A dedicated annotation tool was developed in Dash/Flask for the supervised approach and used to annotate about 40,000 images.

Several approaches were tried, from supervised to self-supervised methods. The self-supervised approach proved superior because ground truth was so hard to gather. I obtained accurate, finely detailed depth from a self-supervised architecture. Later, that output was used to train a smaller neural network for the final solution. TensorFlow and PyTorch were used for training, alongside OpenCV for image manipulation and Flask/Dash for the annotation web tool.

DeepCluster

https://github.com/samo1petar/deepcluster

The client needed a machine learning developer to help finish the MVP. The project involved adapting the DeepCluster codebase to a customer-specific environment and running it on the specified dataset. The project is available at the URL above.

Shapes Detection MVP

A shape detector for PowerPoint, for which I developed segmentation models that detect various shapes. As a prerequisite for training the models, I developed a training codebase that supports TensorFlow, PyTorch, and PyTorch Lightning.

Blur Detector

This project tackled detecting whether an image is sharp enough to detect letters; in other words, whether or not an image is blurry.

My first approach was to take sharp images, artificially blur some of them in OpenCV, and train the classifier. I used median blur, average blur, Gaussian blur, and motion blur.

Input images were downscaled to 128x128 pixels after blurring. Once downscaled, blurred images were indistinguishable from non-blurred images to the eye.

The network had over 99.99% accuracy on the test set.

To be sure, I decided to annotate images and create a realistic test set. I created an annotation tool in OpenCV and Python and organized the annotation process. About 33,000 images were annotated. I discovered that the network's accuracy on this set was significantly lower than on the first test, at 80%. This was unexpected; it meant that the network had learned artificial blurring patterns, even though the blur parameters were applied randomly.

Training on real blurred images fixed the problem, and accuracy reached 97%.
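The first approach described above, labeling samples by artificially blurring sharp images, can be sketched as follows. This version uses plain NumPy kernels as a stand-in for the OpenCV blur calls and covers only average and motion blur on grayscale images. The function names, the kernel sizes, and the `blur_probability` parameter are assumptions made for illustration.

```python
import numpy as np


def box_kernel(size):
    """Average (box) blur kernel."""
    return np.full((size, size), 1.0 / (size * size))


def motion_kernel(size):
    """Horizontal motion blur kernel: a normalized line through the middle row."""
    kernel = np.zeros((size, size))
    kernel[size // 2, :] = 1.0 / size
    return kernel


def filter2d(image, kernel):
    """Same-size filtering of a grayscale image with edge padding.

    Implemented as correlation, which equals convolution for these
    symmetric kernels. Assumes an odd kernel size.
    """
    image = np.asarray(image, dtype=float)
    size = kernel.shape[0]
    padded = np.pad(image, size // 2, mode="edge")
    height, width = image.shape
    out = np.zeros((height, width))
    for dy in range(size):
        for dx in range(size):
            out += kernel[dy, dx] * padded[dy:dy + height, dx:dx + width]
    return out


def make_sample(sharp_image, rng, blur_probability=0.5):
    """Return (image, label); label 1 means the image was artificially blurred."""
    if rng.random() >= blur_probability:
        return np.asarray(sharp_image, dtype=float), 0
    size = int(rng.choice([3, 5, 7]))  # random blur strength
    kernel = box_kernel(size) if rng.random() < 0.5 else motion_kernel(size)
    return filter2d(sharp_image, kernel), 1
```

As the results above show, a classifier trained on samples like these can score almost perfectly on synthetic test data and still miss real blur, so a hand-annotated test set is the more reliable check.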

Hologram Detector

The hologram detector is a Croatian ID validation project using neural networks. It's a binary classifier that says whether the Croatian ID card is valid or fake. The decision is made by detecting different patterns of the hologram. Since I had a card detector and the Croatian ID hologram was always in the same place, I focused on one part of the image.

I created a synthetic dataset as described below:

• All seven hologram patterns were photographed in high resolution.
• Holograms were drawn using GIMP.
• Mug shots proved very useful for creating a realistic dataset with the GIMP Python shell and the magic wand tool. On the Croatian ID, the hologram overlaps with the face image, so faces were used to come as close to the original images as possible.
• Images were glued together by placing the hologram on top of the faces with random backgrounds.
• Noise was applied next.

Neural network classifiers were trained on synthetic images with Caffe and OpenCV. The network passed all video tests and proved the project was a success.
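The compositing and noise steps listed above can be approximated with alpha blending followed by additive Gaussian noise. The sketch below is a NumPy illustration under assumptions, not the original GIMP-based workflow: it omits the random backgrounds, and the function name `overlay_hologram`, the `alpha` mask, and the `noise_std` parameter are hypothetical.

```python
import numpy as np


def overlay_hologram(face, hologram, alpha, rng, noise_std=5.0):
    """Blend a hologram pattern over a face image and add Gaussian noise.

    face, hologram: float arrays of shape (H, W, 3) with values in [0, 255].
    alpha: array of shape (H, W, 1) in [0, 1]; 0 keeps the face, 1 the hologram.
    """
    blended = (1.0 - alpha) * face + alpha * hologram
    noisy = blended + rng.normal(0.0, noise_std, size=blended.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)
```

Sampling a different face, hologram pattern, and noise level for each call yields a varied synthetic training set from a small number of photographed holograms.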

Education

2015 - 2017

Master's Degree in Computer Science

Faculty of Electrical Engineering and Computing, University of Zagreb - Zagreb, Croatia

2011 - 2015

Bachelor's Degree in Computer Science

Faculty of Electrical Engineering and Computing, University of Zagreb - Zagreb, Croatia

Skills

Libraries/APIs

OpenCV, PyTorch, TensorFlow, NumPy, TensorFlow Deep Learning Library (TFLearn), Pandas, OpenAI API, Beautiful Soup, Claude API, Google Calendar API, Gradio

Tools

Git, Jupyter, Visual Language Models (VLMs), Claude, Gerrit, Confluence, AI Prompts, SendGrid

Languages

Python, C++, C++17, C, SQL

Platforms

Amazon Web Services (AWS), Docker, Linux, Mobile, Kubernetes

Frameworks

Qt, Flask, LangGraph

Storage

MongoDB, Redis

Other

Convolutional Neural Networks (CNNs), Artificial Intelligence (AI), Image Segmentation, Deep Neural Networks (DNNs), Computer Vision, Machine Learning, Deep Learning, Object Detection, Optical Character Recognition (OCR), Computer Vision Algorithms, Minimum Viable Product (MVP), Data Scraping, OpenAI, OpenAI GPT-3 API, OpenAI GPT-4 API, Data Science, Edge Computing, Video Processing, Retrieval-augmented Generation (RAG), Large Language Models (LLMs), API Integration, Generative Artificial Intelligence (GenAI), AI Agents, APIs, Image Processing, Image Recognition, Neural Networks, Clustering, Annotation Processors, Point Clouds, Vector Databases, PDF, Image Generation, Robotics, Dash, Startups, Medical Applications, Signal Processing, Health, Models, Natural Language Processing (NLP), MinIO, Open-source LLMs, YOLOv5, DeepSORT, YOLOv8, ByteTrack, OpenAI SDK, FAISS, FastAPI, WebSockets, Anthropic, LangChain, Hugging Face, Document Processing
