Stefan Mićić

Verified Expert in Engineering

Python Data Engineer and Developer

Novi Sad, Vojvodina, Serbia

Toptal member since July 20, 2022

Expertise

Machine Learning, Artificial Intelligence, Neural Network, Deep Learning, Data Engineering, Computer Vision, NLP, Code Review, LLM, Chatbot Development, OpenAI, Python

Bio

Stefan is an experienced machine learning and machine learning operations (MLOps) engineer with hands-on experience in big data systems. He has nearly a decade of experience, complemented by a master's degree in artificial intelligence. Stefan has worked on problems such as object detection, classification, sentiment analysis, named-entity recognition (NER), and recommendation systems. He looks forward to being involved in end-to-end machine learning projects.

Portfolio

Stealth Startup

Python, Amazon Web Services (AWS), FastAPI, Pydantic, Amazon Bedrock...

Notabene

Python, AI Agents, FastAPI, OpenAI API, Anthropic, Pydantic, Amazon Bedrock...

Lattify LLC

Data Engineering, Python, Large Language Models (LLMs), LangChain, PostgreSQL...

Experience

Python 3 - 10 years
Machine Learning - 10 years
Natural Language Processing (NLP) - 5 years
Computer Vision - 5 years
Deep Learning - 5 years
Data Science - 5 years
Keras - 5 years
Spark - 4 years

Preferred Environment

PyCharm, Python 3, Python, GitHub, Amazon S3 (AWS S3), JSON, Distributed Systems

The most amazing...

...end-to-end machine learning solution I've created cut the cost of the machine learning pipelines several times over while delivering state-of-the-art results.

Work Experience

ML Engineer

2025 - 2025

Stealth Startup

Designed an AI-driven adaptive learning system that automatically adjusts pace and content delivery based on student comprehension levels.
Implemented multimodal content delivery (images, videos, and games) with voice interaction using ElevenLabs for audio-based learning.
Built a production-ready MVP from scratch in 12 weeks on an AWS infrastructure (EC2, ECS, SageMaker, and Lambda).

Technologies: Python, Amazon Web Services (AWS), FastAPI, Pydantic, Amazon Bedrock, AI Automation

AI Lead

2025 - 2025

Notabene

Architected and deployed a complex multi-agent AI infrastructure on AWS, combining GNNs, LLMs, and traditional ML models on EKS.
Built data pipelines using SageMaker to transform raw data into graph structures for GNN training and extract insights from news articles.
Delivered the alpha version from scratch in 10 weeks, establishing the foundation for blockchain compliance analysis.

Technologies: Python, AI Agents, FastAPI, OpenAI API, Anthropic, Pydantic, Amazon Bedrock, AI Architecture, AI Automation

Senior Data/AI Engineer

2025 - 2025

Lattify LLC

Refactored the whole solution and migrated from one tool to another, making the application more robust.
Led the development of the whole back end and AI agentic system.
Designed the infrastructure of the AI services using AWS.

Technologies: Data Engineering, Python, Large Language Models (LLMs), LangChain, PostgreSQL, MongoDB, Data Pipelines, Artificial Intelligence (AI), Agentic AI, FastAPI, Anthropic, Pydantic, Amazon Bedrock, AI Automation

Data Engineer

2024 - 2025

Pfizer - GC - PGS

Halved ETL pipeline costs by implementing proper transaction handling in Snowflake-to-PostgreSQL data flows, directly improving operational efficiency.
Built and optimized numerous ETL pipelines extracting data from Snowflake, applying transformations, and ingesting into PostgreSQL—handling the full data lifecycle.
Performed data analysis and gathered business requirements for ETL pipelines, bridging technical implementation with stakeholder needs.

Technologies: Data Engineering, ETL, Python, Amazon Web Services (AWS), Snowflake, Amazon Bedrock, AI Automation

Machine Learning Tech Lead

2024 - 2025

Provectus

Led a team of eight ML engineers to build an enterprise document summarization platform, achieving 50% faster processing while maintaining accuracy.
Summarized 200+ page insurance reports using LLMs and NLP techniques with high precision at $1 per document.
Established engineering excellence through thorough code reviews, CI/CD pipelines, and mentorship on RAG and vector database implementation.

Technologies: Python, AI Agents, Large Language Models (LLMs), Amazon Web Services (AWS), FastAPI, OpenAI API, Pydantic, Amazon Bedrock

AI Lead

2024 - 2025

Prozone

Designed, developed, and deployed solutions from scratch.
Worked closely with non-technical stakeholders to gather requirements, assess feasibility, and handle planning and estimation.
Achieved over 95% accuracy on AI Law Assistant via advanced RAG.

Technologies: Agentic AI, OpenAI, AWS Lambda, Amazon Web Services (AWS), Gemini, REST APIs, Docker, Python, FastAPI, OpenAI API, Pydantic, Amazon Bedrock

Senior MLOps Engineer

2023 - 2024

PlusPower

Developed large ML pipelines using SageMaker, including preprocessing, training, evaluation, and deployment.
Developed a pipeline that generated Airflow pipelines from configs and automated the deployment of DAGs.
Increased test coverage from 15% to 80% and added integration tests so that SageMaker pipelines could be tested locally.

Technologies: Python 3, Amazon SageMaker, Amazon Web Services (AWS), Docker, Bitbucket, DocumentDB, Grafana, Datadog, Terragrunt, Apache Airflow, Pytest, Terraform, Infrastructure, Back-end Development, FastAPI, Pydantic, Amazon Bedrock

Machine Learning Engineer

2023 - 2024

RhythmScience Inc.

De-identified the database and various types of files (HL7, XML, and PDF) in accordance with HIPAA standards, and Dockerized and automated the whole pipeline.
Developed ML algorithms to generate text and classify PDF reports.
Designed, implemented, and deployed the solution using Docker and AWS.

Technologies: Machine Learning, Python, Keras, PyTorch, Deep Learning, Scikit-learn, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), Data Integration, Intelligent Content Processing (ICP), Back-end Development, FastAPI, Pydantic

AI Lead (via Toptal)

2023 - 2024

Cumulus Technologies LLC

Created the whole CI/CD pipeline on AWS. Everything from data ingestion, processing, and model training to model deployment was automated.
Designed and led the implementation of the whole ML pipeline using various AWS services such as Lambda, Polly, and SageMaker.
Utilized AWS for development to meet high security requirements (AWS Cloud9, AWS CodeCommit, and AWS CodePipeline).
Used prompt engineering to force the model to find answers from different sources.

Technologies: Artificial Intelligence (AI), Machine Learning, Python, Amazon Web Services (AWS), Amazon SageMaker, Machine Learning Operations (MLOps), Hyperledger Fabric, Google Cloud Platform (GCP), SQL, PostgreSQL, Database Migration, Large Language Models (LLMs), Models, Unit Testing, English, Generative Artificial Intelligence (GenAI), Language Models, Stock Trading, Algorithmic Trading, Finance, Financial Software, Trading Systems, OpenAI, Prompt Engineering, Retrieval-augmented Generation (RAG), OpenAI GPT-3 API, OpenAI GPT-4 API, APIs, Speech Recognition, System Architecture, Infrastructure, Google Cloud, ChatGPT, Back-end Development, ChatGPT API, AI Chatbots, Chatbots, OpenAI Assistants API, OpenAI API, Pydantic, Amazon Bedrock

MLOps Engineer

2023 - 2023

NewsCorp

Performed the deployment of different LLM and Stable Diffusion models.
Worked on latency and cost optimizations of LLMs, reducing latency fivefold through different deployment techniques.
Took responsibility for the complete deployment process of the whole ML part and documentation maintenance.
Used prompt engineering to have LLMs perform NER tasks.
Used a RAG approach in a couple of chatbots.

Technologies: Amazon EC2, GitHub, Docker, Deep Learning, Models, Unit Testing, English, Query Optimization, Language Models, Retrieval-augmented Generation (RAG), APIs, Infrastructure, Back-end Development, ChatGPT API, Pydantic, Image Generation, Text to Image, Text to Image AI

MLOps Engineer

2022 - 2023

PepsiCo Global - DPS

Implemented an end-to-end machine learning pipeline using PySpark.
Set up CI/CD workflows with unit and integration tests using GitHub Actions.
Developed Spark and scikit-learn/Pandas ETL jobs to process large data volumes (150 TB).

Technologies: Machine Learning Operations (MLOps), APIs, Machine Learning, Python, Databricks, Big Data, Spark, Scikit-learn, Pandas, CI/CD Pipelines, REST APIs, ETL, Models, Unit Testing, Data Processing, English, Query Optimization, MLflow, Data Analytics, Infrastructure, Back-end Development, Pydantic

Tech Lead Data Engineer

2022 - 2023

Motius

Led a small team in implementing an ELT pipeline that pulled data from a GraphQL database and loaded it into Azure SQL. Everything was Dockerized and pushed to Azure Container Registry.
Implemented KPI calculations in PySpark jobs that communicated with Snowflake. Defined table schemas for Snowflake and created migration scripts.
Followed the Scrum methodology, including daily scrums, retrospectives, and planning, and used Jira.
Led a small team in implementing ETL Spark jobs with Apache Airflow as the orchestrator, AWS as the infrastructure, and Snowflake as the data warehouse.

Technologies: Spark, Apache Spark, PySpark, Snowflake, Python, Python 3, Amazon Web Services (AWS), Databases, Distributed Systems, Azure SQL, Azure, AWS Glue, Apache Airflow, Software Architecture, Data Pipelines, Data Analysis, CI/CD Pipelines, Database Migration, Data Engineering, ETL, Unit Testing, Data Processing, English, Query Optimization, Data Analytics, Data Integration, ELT, DataOps, Back-end Development, ChatGPT API

MLOps Engineer

2021 - 2022

Lifebit

Carried out deep learning model optimizations using quantization, ONNX Runtime, and pruning, among others.
Monitored model performance, including memory, latency, and CPU usage.
Used Valohai to automate the CI/CD process and GitHub Actions to automate some parts of the MLOps lifecycle.
Created automated experiment tracking using Amazon CloudWatch, Valohai, Python, GitHub Actions, and Kubernetes.

Technologies: Amazon EC2, Valohai, Keras, TensorFlow, Python 3, Lens Studio, Kubernetes, Codeship, GitHub, Open Neural Network Exchange (ONNX), Visual Studio Code (VS Code), Optimization, Neural Networks, NumPy, Monitoring, Amazon S3 (AWS S3), Cloud, Scikit-learn, Amazon Web Services (AWS), AI Design, Deep Neural Networks (DNNs), Software Engineering, Pytest, JSON, Source Code Review, Code Review, Task Analysis, Databases, Data Science, CI/CD Pipelines, DevOps, REST APIs, Models, Unit Testing, English, Language Models, APIs, Amazon SageMaker, Terraform, Celery, Infrastructure, Ray.io

Machine Learning Engineer

2020 - 2021

HTEC Group

Worked on machine learning compiler optimizations applied to an already-trained network, without retraining, using Open Neural Network Exchange (ONNX), and implemented custom operators using PyTorch and C++.
Worked on an Android machine learning solution and mentored a less experienced developer to train and prepare an object detector and classifier to run smoothly on an Android device.
Enhanced a project that aimed to upscale images toward 4K resolution at the highest possible quality.
Was involved in the SDP of a ship routing problem. Implemented a ship-guidance algorithm from scratch, using fuel consumption and ETA in the calculations.
Worked on the open-source ONNX Runtime to add support for the MIGraphX library.

Technologies: Python 3, Python, Docker, Computer Vision, PyTorch, Artificial Intelligence (AI), Machine Learning, Team Leadership, Machine Learning Operations (MLOps), GitHub, Convolutional Neural Networks (CNNs), Open Neural Network Exchange (ONNX), Visual Studio Code (VS Code), Neural Networks, NumPy, Cloud, Pandas, Scikit-learn, Computer Vision Algorithms, AI Design, Deep Neural Networks (DNNs), Software Engineering, Pytest, JSON, Technical Hiring, Source Code Review, Code Review, Task Analysis, Interviewing, Databases, Data Science, REST APIs, Models, Unit Testing, English, Language Models, Research, APIs

Machine Learning Engineer

2019 - 2020

SmartCat

Contributed to complete MLOps lifecycles using MLflow for model versioning, LakeFS for data versioning, AWS S3 for data storage, and TensorFlow Serving in Docker.
Functioned as a data engineer using Apache Spark for ETL jobs with Prefect and Apache Airflow for scheduling.
Trained several different architectures for object detection and classification.

Technologies: Python 3, Scala, Python, Docker, SQL, Computer Vision, MongoDB, Artificial Intelligence (AI), Machine Learning, Data Engineering, Machine Learning Operations (MLOps), GitHub, Recurrent Neural Networks (RNNs), Convolutional Neural Networks (CNNs), ETL, Visual Studio Code (VS Code), Neural Networks, NumPy, Amazon S3 (AWS S3), Big Data, Image Processing, Cloud, Pandas, Scikit-learn, Object Detection, Computer Vision Algorithms, Object Tracking, Apache Spark, Amazon Web Services (AWS), AI Design, Deep Neural Networks (DNNs), Software Engineering, Pytest, ETL Tools, JSON, Jupyter Notebook, Source Code Review, Code Review, Task Analysis, PySpark, Databases, Data Science, Distributed Systems, Data Pipelines, REST APIs, Models, Unit Testing, Data Processing, English, MLflow, APIs, Amazon SageMaker, Prefect

MLOps Engineer

2018 - 2019

Financial Times

Deployed numerous LLM and Stable Diffusion models through SageMaker and CloudFormation with production-grade infrastructure.
Reduced model latency by 5x through deployment optimizations and inference techniques.
Owned the complete ML deployment process, including autoscaling configuration, cost optimization, and documentation maintenance.

Technologies: Amazon SageMaker, Python, Grafana

Machine Learning Engineer

2016 - 2019

Freelance

Scraped product information from various websites, then analyzed and prepared the scraped data for web shops using natural language processing—long short-term memory (LSTM), Word2Vec, and transformers—and added NER since the data was in Serbian.
Used Amazon SageMaker to automate the machine learning pipeline—data preprocessing, model training, and deployment. Executed automated retraining and deployment of the model, completing the machine learning process before the client updated new data.
Worked on big data projects using Apache Spark, Kafka, Hadoop, and MongoDB.
Worked as a data engineer using Spark to create optimized ETL pipelines. Translated the client's needs into SQL.

Technologies: Python 3, Spark, Amazon SageMaker, Python, Docker, Computer Vision, MongoDB, Artificial Intelligence (AI), Machine Learning, Data Engineering, Kubernetes, Machine Learning Operations (MLOps), GitHub, Amazon EC2, Recurrent Neural Networks (RNNs), Convolutional Neural Networks (CNNs), Open Neural Network Exchange (ONNX), Recommendation Systems, Natural Language Understanding (NLU), Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), Visual Studio Code (VS Code), Time Series, Data Modeling, Data Mining, Neural Networks, NumPy, Amazon S3 (AWS S3), Big Data, Apache Kafka, Hugging Face, Transformers, Cloud, Pandas, Scikit-learn, Object Detection, Computer Vision Algorithms, Apache Spark, Amazon Web Services (AWS), AI Design, Web Development, Deep Neural Networks (DNNs), Software Engineering, Pytest, JSON, Jupyter Notebook, Source Code Review, Code Review, Task Analysis, PySpark, Databases, Data Science, Distributed Systems, Project Management, CI/CD Pipelines, Google Cloud Platform (GCP), DevOps, REST APIs, Models, Unit Testing, English, MLflow, APIs, ChatGPT, AI Agents, LangChain, Agentic AI, Cursor AI

Experience

Automated End-to-end (E2E) Computer Vision Solution

Created a system that performed several things in real time, including:
• Detecting objects in the room
• Classifying person poses
• Automated re-training (active learning)
• Model and data versioning
• Dockerized pipeline
Using those models and predictions, we created a post-processing pipeline for creating reports or key performance indicators (KPIs) for clients.

Android COVID-19 Test Classification

The goal was to create a COVID-19 test classification model. We had a small dataset and had to build the best model in the shortest possible time (two weeks).
I led a team of two people on this project. We chose MobileNet for its small size, and the model performed well on all business-relevant metrics. We used many optimization techniques to deploy the model to Android, such as quantization, pruning, and knowledge distillation.

MLOps Engineer

Participated in a project where my job was to optimize the whole machine learning system using quantization, pruning, ONNX, and more. I achieved the same accuracy with five times lower latency, half the model size, and a quarter of the cost. I also changed the type of underlying EC2 instances to get more out of our system.

Image Super Resolution

The goal was to improve the model for upscaling and super-resolution by researching and developing approaches from SOTA research papers. The work involved many custom loss functions, layers, metrics, and even custom backpropagation.

ETL Jobs

• Created batch ETL jobs for calculating KPIs.
• Optimized the solution to reduce cost and calculation time.
• Scheduled jobs via Airflow and Prefect.
The tech stack was: Spark, Scala, AWS S3, Kafka, Apache Airflow, and Prefect.

NLP Articles Processing

The goal of this project was to develop two stages of article processing:
1. Find all relevant tags (events, locations, names, etc.) in the article.
2. Find pairs of tags that are somehow related.

BERT-based models from Hugging Face Transformers were mainly used to tackle this problem. Overall metrics were above 95%.

Data Ingestion

Led a team whose goal was to move data from the GraphQL database into Azure SQL. Everything was Dockerized and pushed to EKS on every push to the main branch on GitLab. Concurrent threads were used to optimize the solution.

Tech Leadership for the DE project

I was responsible for all decisions, from the architecture down to the fine details of the implementation. We used AWS for infrastructure (CloudWatch, Glue, and S3) and Airflow to orchestrate Spark jobs. The result of every Spark job was saved to Snowflake.

Financial AI Assistant

I was involved in this startup from day 0.

TASKS
• Architected and deployed a production AWS infrastructure for aiime.com, an AI personal assistant available on the App Store.
• Built an end-to-end MLOps pipeline supporting continuous model iteration, automated testing, and zero-downtime deployments following industry best practices.
• Engineered a cost-optimized RAG-based chatbot with real-time data retrieval capabilities, balancing performance and cloud spend through serverless architecture and intelligent caching strategies.
• Implemented vector database integration and LLM orchestration for contextual, accurate responses.
• Led technical mentorship and documentation efforts, establishing system design standards and knowledge-sharing practices.
• Created comprehensive deployment runbooks and architecture diagrams, enabling team autonomy.

KEY ACHIEVEMENTS
Production deployment on the Apple App Store, cost-efficient and scalable infrastructure, and comprehensive MLOps automation.

Law AI Assistant

I was involved in this startup from day 0. I architected and deployed a production AWS infrastructure for Law AI Assistant. It provides answers and all relevant resources (laws, court cases, opinions, etc.) using RAG (with a Pinecone vector DB), LLMs, and a multi-stage pipeline. It achieved over 90% overall accuracy of the answers. Lawyers were able to see the sources used for each answer, in addition to the answer itself.

KEY ACHIEVEMENTS
Implemented a production-ready system from scratch in 4 months.

Blockchain Compliance AI Assistant

I was involved in this startup from day 0. I developed a complex infrastructure on AWS containing multiple AI agents and data pipelines. The AI agents consisted of GNNs, LLMs, NLP models, and other traditional ML models, all deployed to EKS. The data pipelines were built using SageMaker pipelines and were used to transform raw data into a graph for GNN training, extract data from news and articles, and more.

KEY ACHIEVEMENTS
Alpha version built from scratch in 10 weeks.

Adaptive AI Tutor

I was involved in this startup from day 0. I architected and deployed a production AWS infrastructure for Adaptive AI Tutor, which can tailor lessons to students' needs. For example, the tutor automatically adapts the pace according to how well the student understands the content and presents an image, video, or game according to the student's preference. An audio mode lets students talk to the application and learn through voice as well.

KEY ACHIEVEMENTS
Implemented an MVP from scratch in 12 weeks.

Document Summarizer

I led a team of eight (including mid and senior ML engineers) to build an enterprise document summarization platform using LLMs, RAG architecture, and NLP on AWS infrastructure.

TASKS
• Achieved 50% faster document processing while maintaining accuracy through optimized retrieval pipelines and prompt engineering.
• Established engineering excellence practices: conducted thorough merge request reviews that improved code quality and reduced deployment failures.
• Collaborated with management on project scoping and estimation, consistently delivering on time and within budget.
• Drove technical decisions on architecture design, model selection, and infrastructure optimization.
• Mentored engineers on ML best practices, vector database implementation, and scalable system design.
• Built CI/CD pipelines for reliable model deployment and monitoring.

KEY ACHIEVEMENTS
Summarized an insurance report with over 200 pages using LLMs and NLP techniques with high precision at a cost of $1 per document.

Education

2020 - 2021

Master's Degree in Artificial Intelligence

University of Novi Sad - Novi Sad, Serbia

Certifications

JULY 2022 - JULY 2025

AWS Certified Machine Learning - Specialty

Amazon Web Services

Skills

Libraries/APIs

PyTorch, Keras, NumPy, Scikit-learn, REST APIs, OpenAI API, Hugging Face Transformers, TensorFlow, Pandas, PySpark, OpenAI Assistants API, Pydantic, vLLM, Terragrunt

Tools

PyCharm, Amazon SageMaker, GitHub, Apache Airflow, Pytest, ChatGPT, Open Neural Network Exchange (ONNX), Codeship, Prefect, AWS Glue, Bitbucket, Grafana, Terraform, Celery, Intelligent Content Processing (ICP), AI Prompts

Languages

Python 3, Python, SQL, Scala, Java, Snowflake, GraphQL, C++

Frameworks

Spark, Apache Spark, Streamlit

Paradigms

ETL, Unit Testing, DevOps

Platforms

Amazon Web Services (AWS), Jupyter Notebook, Visual Studio Code (VS Code), Docker, Kubernetes, Amazon EC2, Valohai, Apache Kafka, Azure, Databricks, Google Cloud Platform (GCP), Hyperledger Fabric, Kubeflow, AWS Lambda, Blockchain

Storage

Amazon S3 (AWS S3), JSON, Databases, PostgreSQL, NoSQL, MongoDB, Data Pipelines, Database Migration, Data Integration, Azure SQL, Datadog, Google Cloud

Industry Expertise

Trading Systems, Project Management

Other

Deep Learning, Machine Learning, Data Science, Artificial Intelligence (AI), Data Engineering, Computer Vision, Natural Language Processing (NLP), Natural Language Understanding (NLU), Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Machine Learning Operations (MLOps), Neural Networks, AI Design, Deep Neural Networks (DNNs), Software Engineering, Technical Hiring, Source Code Review, Code Review, Task Analysis, Interviewing, APIs, Generative Pre-trained Transformers (GPT), Large Language Models (LLMs), Models, Data Processing, English, Generative Artificial Intelligence (GenAI), Language Models, MLflow, OpenAI, ChatGPT API, AI Chatbots, Chatbots, Anthropic, AI Automation, Recommendation Systems, Lens Studio, Optimization, Team Leadership, Time Series, Data Modeling, Data Mining, Monitoring, Big Data, Image Processing, Transformers, Cloud, Object Detection, Computer Vision Algorithms, Object Tracking, Web Development, Speech Recognition, Voice Recognition, Cloud Services, ETL Tools, Distributed Systems, Data Analysis, CI/CD Pipelines, Query Optimization, Research, Stock Trading, Algorithmic Trading, Finance, Financial Software, Prompt Engineering, Retrieval-augmented Generation (RAG), OpenAI GPT-3 API, OpenAI GPT-4 API, Data Analytics, ELT, System Architecture, Infrastructure, DataOps, FastAPI, Large Language Model Operations (LLMOps), Back-end Development, AI Agents, LangChain, Agentic AI, Cursor AI, Amazon Bedrock, Image Generation, Text to Image, Text to Image AI, AI Architecture, Graphics Processing Unit (GPU), Hugging Face, BERT, Back-end, Software Architecture, DocumentDB, Ray.io, AI Translation, Gemini, Cryptocurrency, ElevenLabs Solutions, Leadership, NVIDIA TensorRT
