Muhammad Talha Zubair, Developer in Islamabad, Islamabad Capital Territory, Pakistan

Muhammad Talha Zubair

Verified Expert in Engineering

Machine Learning Developer

Islamabad, Islamabad Capital Territory, Pakistan
Toptal Member Since
June 6, 2023

Muhammad is a machine learning developer with four years of experience in data science, machine learning, and computer vision, and he has contributed to several high-profile projects. His tech stack covers data analysis, machine learning and deep learning technologies, DevOps, and cloud technologies. He also develops Python APIs using Flask and NGINX.






Preferred Environment

Slack, Jira, Windows, Linux

The most amazing...

...project I've developed is an advanced evaluation framework for assessing the quality of Stable Diffusion-generated images.

Work Experience

Machine Learning Engineer

2022 - PRESENT
  • Developed a computer vision gaze estimation algorithm to estimate the attention of attendees in classrooms, seminars, and similar settings.
  • Worked on crowd analytics for images and videos from the Arab world, training and testing image recognition models for person detection on crowd human datasets.
  • Developed an evaluation script for person re-identification using OpenCV, PyTorch, and Python.
  • Dockerized training and testing pipelines and deployed them on cloud infrastructure and NVIDIA DGX.
  • Conducted data analysis using PostgreSQL and Pandas for different applications.
  • Used the MLOps tool WandB for experiment tracking and for automating the training and testing pipeline.
Technologies: Amazon EC2, Amazon S3 (AWS S3), Deep Neural Networks, Variational Autoencoders, Speech to Text, Speech Recognition, Computer Vision, NVIDIA CUDA, Generative Adversarial Networks (GANs), Data Analysis, OpenCV, Docker, Kubernetes, Artificial Intelligence (AI), Machine Learning Operations (MLOps), Google Cloud Platform (GCP), AWS CLI, Amazon SageMaker, Fine-tuning, Hugging Face, Deep Learning, Graphics Processing Unit (GPU), Videos, FastAPI, Containerization

Associate Data Scientist / ML Engineer

2021 - PRESENT
  • Developed and trained computer vision algorithms—YOLOv5, DeepLab, and StarDist—using PyTorch and TensorFlow for various applications, including road segmentation, bacteria detection, and multispecies analysis.
  • Automated training and testing processes by creating Bash scripts, Docker pipelines, and MLOps pipelines using tools like WandB.
  • Utilized Amazon EC2 instances, S3 buckets, and Azure DevOps for cloud computing, data storage, version control, reporting, and requirements handling.
  • Implemented computer vision algorithms for driver behavior analysis, including lane change detection, hard braking detection, and road surface analysis.
  • Converted models to different formats using Core ML for seamless integration with iOS devices.
  • Built custom data generators and used techniques like k-means clustering, PCA, and ResNet-50 for dataset processing and noise removal.
  • Ensured code quality and standardization by introducing GitHub pre-commit hooks and GitHub Actions.
  • Conducted data collection and analysis using PostgreSQL and Pandas for insights and decision-making.
Technologies: Computer Vision, Deep Learning, Machine Learning, Slack, TensorFlow, PyTorch, MATLAB, Scikit-learn, Jira, NumPy, OpenCV, Docker, Pandas, Azure DevOps, SQL, PostgreSQL, AWS CLI, GitHub, GitHub Actions, Data Analysis, Image Processing, Convolutional Neural Networks, Python, Language Models, Azure Machine Learning, Image Analysis, Artificial Intelligence (AI), Graphics Processing Unit (GPU), Videos, Medical Diagnostics, FastAPI, Containerization

Machine Learning Engineer

2020 - 2021
  • Worked on both service-based and product-based streams, delivering facial recognition and vehicle detection and tracking solutions, respectively.
  • Developed facial recognition solutions using algorithms like Haar cascades, MTCNN, and FaceNet in TensorFlow and deployed them on web platforms.
  • Designed an object detection and tracking solution for thermal imagery, specifically handling occlusion, to be deployed on NVIDIA Jetson TK1. Translated different methods into Cython and Numba to enhance real-time performance.
  • Built and tested a feature-based tracking method using a normalized cross-correlation (NCC) template matching algorithm, SIFT feature selection, and the Kalman Filter.
  • Trained a PyTorch YOLOv3-tiny object detection model on a custom dataset and integrated the algorithms with the front end using Flask and NGINX.
  • Managed a team of four, bridging the gap between hardware and software teams, and guided the annotation team for accurate object annotation in various videos.
Technologies: Computer Vision, Linux, Deep Learning, Docker, AWS CLI, MySQL, NGINX, Flask, MATLAB, OpenCV, NumPy, Facial Recognition, OCR, Computer Vision Algorithms, Artificial Intelligence (AI), Videos

AI/ML Engineer

2019 - 2020
  • Worked on Madhunt, an augmented reality game inspired by Pokemon GO, incorporating real-time object detection using YOLO and TensorFlow word2vec for finding related elements.
  • Designed, in Python, a deep reinforcement learning-based recommendation algorithm tailored to custom users playing the game.
  • Developed reward functions within the reinforcement learning framework and seamlessly integrated the algorithms into the existing game structure.
  • Handled queries from Firebase and AWS in Python using the Firebase SDK and the Boto library.
  • Implemented a YOLO image recognition algorithm for object detection within the game, analyzing images captured during gameplay.
  • Conducted research and development on state-of-the-art recommendation systems based on reinforcement learning techniques. Explored existing recommendation systems based on machine learning algorithms, particularly collaborative filtering systems.
Technologies: Deep Learning, Firebase, Atlassian SDK, You Only Look Once (YOLO), BERT, Natural Language Processing (NLP), Data Analysis, Computer Vision, Mobile Vision, Artificial Intelligence (AI)

Pathogen Detection in Microscopic Imagery
As an image recognition, computer vision, and deep learning researcher, I significantly contributed to infrastructure development, notably developing algorithms to detect harmful cells in microscopic chicken imagery. I retrained the StarDist segmentation model with Python and TensorFlow for multispecies detection and built a custom Keras data generator for efficient dataset loading.
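The idea behind a custom data generator can be sketched in plain Python. This is a minimal illustration only, not the project's Keras code: the name `batch_generator`, its parameters, and the use of plain lists as samples are all assumptions made for the example.

```python
import random


def batch_generator(samples, batch_size, shuffle=True, seed=None):
    """Yield the dataset in batches for one pass (hypothetical helper).

    `samples` stands in for (image_path, label) pairs. A real generator
    would load and preprocess each image here instead of passing the
    entries through unchanged.
    """
    order = list(range(len(samples)))
    if shuffle:
        # A seeded Random instance keeps the shuffle reproducible.
        random.Random(seed).shuffle(order)
    for start in range(0, len(order), batch_size):
        # The final batch may be smaller than batch_size.
        yield [samples[i] for i in order[start:start + batch_size]]
```

Because batches are produced lazily, only one batch needs to be in memory at a time, which is the usual motivation for this pattern when a dataset does not fit in RAM.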

I used OpenCV, k-means, PCA, and ResNet-50 for noise removal and PostgreSQL and Pandas for data analysis. I also implemented an MLOps pipeline with WandB to facilitate seamless dataset and model tracking, with GitHub pre-commit hooks ensuring code formatting.

Prioritizing scalability, I deployed code using Docker, used Git and Azure DevOps for version control and project management, and optimized Linux pipelines with Bash scripts. I followed best practices in code optimization and implemented CI/CD and cron jobs. Finally, I managed Docker containers with Kubernetes and tested solutions via Flask APIs.

Dash Cam Analytics Platform

I successfully trained the YOLOv5 computer vision algorithm using PyTorch. This training utilized custom datasets across various GPU environments, including Amazon EC2 instances. Subsequently, I converted the Torch YOLOv5 model to the Core ML format, enabling efficient inference on iOS platforms. DeepLab was instrumental in segmenting different road parts, with image up-sampling techniques deployed via cv2 superRes.

I developed Bash scripts and Docker pipelines, which automated significant parts of the training and testing processes. I collected data using PostgreSQL, with comprehensive data analysis conducted through Pandas. I worked extensively with Amazon EC2 and S3, ensuring efficient deployment and storage, and implemented unit tests to ensure seamless integration with GitHub Actions. Code optimization was a priority, and I maintained version control using Git, ensuring high quality and collaboration standards. Finally, I rigorously tested the Flask API for back-end solutions to ensure reliable performance and functionality.

Vehicle Detection and Tracking over Thermal Imagery

I spearheaded the design and development of an object detection and tracking solution for thermal imagery, emphasizing occlusion handling and deployment on the NVIDIA Jetson TK1. The project involved training PyTorch's YOLOv3-tiny model on a custom dataset and creating a Python-based feature-tracking pipeline using normalized cross-correlation (NCC), SIFT feature selection, and the Kalman Filter for optimal estimation. Analyzing our custom detector and tracker results offered valuable system performance insights.
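Normalized cross-correlation template matching, one of the tracking components named above, can be sketched in a few lines of pure Python. This is a minimal illustration under assumptions, not the deployed pipeline: the function names `ncc` and `match_template` are hypothetical, and images are represented as small nested lists rather than OpenCV arrays.

```python
import math


def ncc(patch, template):
    """Normalized cross-correlation of two equally sized 2D lists.

    Returns a value in [-1, 1], or 0.0 if either input is constant.
    """
    p = [v for row in patch for v in row]
    t = [v for row in template for v in row]
    mean_p = sum(p) / len(p)
    mean_t = sum(t) / len(t)
    num = sum((a - mean_p) * (b - mean_t) for a, b in zip(p, t))
    den = math.sqrt(
        sum((a - mean_p) ** 2 for a in p) * sum((b - mean_t) ** 2 for b in t)
    )
    return num / den if den > 0 else 0.0


def match_template(image, template):
    """Slide the template over the image; return (best_score, (row, col))."""
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    best_score, best_pos = -2.0, (0, 0)
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = ncc(patch, template)
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_score, best_pos
```

Subtracting the means and dividing by the norms makes the score insensitive to uniform brightness and contrast changes. The exhaustive search shown here is slow in pure Python, which is why a real-time system would typically rely on an optimized library routine or compiled code instead.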

As a team leader, I promoted collaboration between hardware and software teams, managed a four-person team, and ensured accurate object annotation guidance for various videos.

The Python solution was deployed on the NVIDIA Jetson TK1. We used OpenCV for pipeline development and NumPy for efficient array operations, and we translated performance-critical methods into Cython and Numba to reach real-time speeds. Azure DevOps was our choice for code versioning, reporting, and requirements management, and we employed Linux frameworks and Bash scripts for efficient pipelines.

Recommendation System Based on Reinforcement Learning
I developed a deep reinforcement learning-based recommendation algorithm for custom users on the augmented reality game Mad Hunt using Python. I designed and deployed reward functions in reinforcement learning using TensorFlow within the existing game structure, and handled queries from Firebase and AWS platforms in Python using Firebase SDK and the Boto library.

I implemented the YOLO image recognition algorithm to facilitate real-time object detection during gameplay while researching and developing cutting-edge recommendation systems based on reinforcement learning. Exploring existing recommendation systems led to a focus on machine learning algorithms, such as collaborative filtering systems.

I also designed and implemented facial recognition solutions using algorithms like Haar cascades, MTCNN, and FaceNet in TensorFlow, deploying these on web platforms. I used Flask and NGINX for efficient web hosting and server-side functionality and Azure DevOps for code versioning, reporting, and requirements management.

To ensure the highest quality standards, I prioritized code optimization and version control using Git. Finally, I handled CRUD operations on a MySQL database to manage daily user interactions with the application.

Gaze Estimation in Real Time Images

In this project, I researched existing deep learning solutions for gaze estimation and built on them. This involved training a computer vision gaze detection solution in PyTorch on a custom dataset. To keep track of our experiments, I used the MLOps tool WandB, which allowed us to manage and monitor our machine learning operations effectively. All model training and testing were conducted on Google Cloud Platform (GCP) instances, which provided scalable compute resources.

Custom Optical Character Recognition Systems

In this project, I developed and implemented natural language processing (NLP) algorithms and models to enhance optical character recognition (OCR) systems, resulting in an impressive 20% improvement in text extraction accuracy. This success was achieved through a fruitful collaboration with a cross-functional team of software engineers and data scientists. Together, we optimized OCR workflows, achieving a 30% reduction in processing time.

I conducted a comprehensive analysis of text data and bolstered OCR accuracy by implementing data preprocessing techniques like tokenization, stemming, and lemmatization. Further enriching our OCR systems, I integrated advanced NLP techniques, such as named entity recognition (NER) and sentiment analysis, allowing us to extract more valuable information from documents.

To continually improve OCR performance, particularly on challenging document types, I trained and evaluated deep learning models, including convolutional neural networks (CNNs) and recurrent neural networks (RNNs).

Evaluation Metrics for Stable Diffusion Generated Images

I developed an advanced evaluation framework for assessing the quality of Stable Diffusion-generated images. This involved leveraging Detectron2 and a supervised graph approach, specifically incorporating a relative size graph to determine the realism of objects within the images.
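One way to picture a relative-size check is sketched below. Everything in it is an assumption made for illustration: the `EXPECTED_RATIO` table, its values, the scoring rule, and the function names are hypothetical and do not describe the actual framework. The detections are plain (label, box) tuples of the kind an object detector could supply.

```python
from itertools import combinations

# Hypothetical prior: plausible range of area(a) / area(b) for a label pair.
EXPECTED_RATIO = {
    ("person", "car"): (0.05, 1.0),
    ("dog", "person"): (0.05, 1.0),
}


def box_area(box):
    """Area of an (x1, y1, x2, y2) box; degenerate boxes count as 0."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def relative_size_score(detections):
    """Fraction of known label pairs whose area ratio is plausible.

    detections: list of (label, (x1, y1, x2, y2)).
    Returns None when no detected pair appears in EXPECTED_RATIO.
    """
    checked = plausible = 0
    for (la, ba), (lb, bb) in combinations(detections, 2):
        if (la, lb) in EXPECTED_RATIO:
            low, high = EXPECTED_RATIO[(la, lb)]
            num, den = box_area(ba), box_area(bb)
        elif (lb, la) in EXPECTED_RATIO:
            low, high = EXPECTED_RATIO[(lb, la)]
            num, den = box_area(bb), box_area(ba)
        else:
            continue
        if den == 0:
            continue  # cannot form a ratio against an empty box
        checked += 1
        plausible += low <= num / den <= high
    return plausible / checked if checked else None
```

This sketch compares raw bounding-box areas and ignores depth and perspective, so it only flags gross size mismatches between object pairs.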

Developing a Speaker Diarization Algorithm Using LSTM


• Researching and implementing an LSTM-based speaker diarization algorithm for the call center analytics application.
• Preprocessing the audio data, including feature extraction and normalization.
• Testing and tuning the algorithm to optimize its performance and accuracy.
• Creating a user-friendly interface for the output of the algorithm.

I successfully implemented an LSTM-based speaker diarization algorithm that achieved 85% accuracy on a large customer call center recordings dataset. This algorithm was integrated into the call center analytics application, enabling our clients to understand their customers' needs better and improve their overall performance.
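The normalization step in audio preprocessing is often done per feature dimension. The sketch below shows mean and variance normalization over one recording in pure Python. It is an illustration under assumptions, not the project's code: the name `normalize_features` is hypothetical, and the frames are assumed to be equal-length feature vectors already extracted from the audio.

```python
import math


def normalize_features(frames):
    """Scale each feature dimension to zero mean and unit variance.

    frames: list of equal-length feature vectors, one per audio frame.
    Dimensions with zero variance are mapped to 0.0.
    """
    n, dim = len(frames), len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / n)
        for d in range(dim)
    ]
    return [
        [(f[d] - means[d]) / stds[d] if stds[d] > 0 else 0.0 for d in range(dim)]
        for f in frames
    ]
```

Normalizing per recording reduces differences in channel and loudness between calls before the features reach a sequence model such as an LSTM.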


SQL, Python 3, Bash, Python


TensorFlow, PyTorch, OpenCV, Pandas, Keras, Scikit-learn, NumPy, LSTM, Flask-RESTful


You Only Look Once (YOLO), GitHub, Atlassian SDK, Azure Machine Learning, Slack, Jira, MATLAB, AWS CLI, NGINX, Amazon SageMaker


Docker, Linux, Amazon EC2, Google Cloud Platform (GCP), Firebase, Kubernetes, NVIDIA CUDA


Machine Learning, Computer Vision, Facial Recognition, Fine-tuning, Deep Learning, GitHub Actions, OCR, BERT, Natural Language Processing (NLP), Mobile Vision, Data Analytics, Image Recognition, NVIDIA Jetson TK1, Open Neural Network Exchange (ONNX), Reinforcement Learning, Deep Reinforcement Learning, Research, Machine Learning Operations (MLOps), Image Processing, Computer Vision Algorithms, Convolutional Neural Networks, Artificial Intelligence (AI), Language Models, Hugging Face, Videos, Graphics Processing Unit (GPU), Image Analysis, Frameworks, Object Detection, Medical Diagnostics, FastAPI, Containerization, Motion Tracking, Large Language Models (LLMs), Software Development, Networks, Data Analysis, Machine Vision, Generative Adversarial Networks (GANs), Stable Diffusion, Instance Segmentation, Deep Neural Networks, Variational Autoencoders, Speech to Text, Speech Recognition, Speaker Identification (SI), Speaker Diarization, Audio Streaming, Architecture, Security




Azure DevOps, DevOps, Test-driven Development (TDD), Unit Testing


PostgreSQL, MySQL, Amazon S3 (AWS S3)

2015 - 2019

Bachelor's Degree in Computer Science

National University of Science and Technology - Islamabad, Pakistan


Neural Network and Deep Learning



Introduction to TensorFlow for Artificial Intelligence, Machine Learning, and Deep Learning

DeepLearning.AI | via Coursera


SQL for Data Science



Machine Learning

Stanford University | via Coursera
