Muhammad Khubaib Raza, Developer in Lahore, Punjab, Pakistan

Muhammad Khubaib Raza

Verified Expert in Engineering

Machine Learning Engineer and Developer

Location
Lahore, Punjab, Pakistan
Toptal Member Since
October 13, 2022

Khubaib is a full-stack machine learning engineer who specializes in natural language processing (NLP). He's a problem solver with over six years of experience developing end-to-end machine learning systems using NLP and MLOps and scaling them to run in production environments. He has worked on a wide range of projects involving chatbot development, artificial intelligence, and NLP. Khubaib holds a master's degree in data science covering NLP, machine learning, and deep learning.

Availability

Part-time

Preferred Environment

Python 3, Transformers, PyTorch, Pandas, PySpark, Scikit-learn, Generative Pre-trained Transformers (GPT), AutoML, ChatGPT, OpenAI

The most amazing thing I've achieved is participating in NextGrid's GPT-3 Hackathon and finishing in 3rd place.

Work Experience

Head of Engineering

2021 - PRESENT
Metric
  • Shaped the product vision from a non-technical position, including alignment for product-market fit.
  • Managed a team of back-end, front-end, and Android developers using the Kanban methodology to deliver the work.
  • Contributed actively to the product's technical development.
  • Reached 40,000+ device installs. The application is used in 30 different business sectors by customers who track business insights with it.
  • Conducted website crawling and data scraping to assist in company research and marketing analytics.
Technologies: Python 3, Python API, System Architecture, MongoDB, Firebase, Mixpanel, BigQuery, Google Cloud Platform (GCP), OCR

Machine Learning Engineer

2019 - PRESENT
Self-employed
  • Designed, prototyped, developed, and deployed (with Docker) systems based on machine learning models, especially in the NLP field.
  • Conducted information extraction from raw data, such as PDF documents.
  • Created crawlers for downloading data from various sources.
  • Worked on named entity recognition for medical data, including manual annotation of the data.
Technologies: Python 3, PyTorch, Scikit-learn, Seaborn, Pandas, Transformers, Flask, FastAPI, Scripting, CI/CD Pipelines, Amazon SageMaker, Amazon EC2, Amazon Elastic Container Registry (ECR), Amazon Elastic Container Service (Amazon ECS), AWS Lambda, Amazon Cognito

Senior NLP Engineer

2020 - 2021
matix
  • Worked on different NLP projects, with a focus on classification and information extraction from documents.
  • Implemented back-end services that use Asterisk and Twilio.
  • Used the Vosk API and Google Speech-to-Text (STT) to stream speech data and convert it into text.
Technologies: Chatbots, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Amazon Web Services (AWS), Google Cloud, Twilio, Twilio API, Sockets, Custom Solutions, API Development, Google Speech-to-Text API, Transformer Models

Projects

Automatic Description Generation From Images

Working on this project was a great learning experience that ranged from room item detection, similar to Airbnb amenity detection, to table-to-text generation.

The apartment photo database includes each apartment's price, floor, latitude, longitude, nearby places, and whether it has parking. First, we eliminated duplicate photos of the same apartment so that items would not be detected repeatedly. Second, we classified the photos into room types. Third, we counted the bedrooms, bathrooms, and other rooms and detected the amenities. Fourth, we created a dataset for description generation. Finally, we generated the descriptions with a transformer model.
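
The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes exact-duplicate removal by content hash (near-duplicates would need something like perceptual hashing), takes room labels as given from a classifier that is not shown, and flattens the record into "key: value" pairs as one common input format for table-to-text models. The function names `dedupe_photos`, `count_rooms`, and `linearize`, and all sample data, are hypothetical.

```python
import hashlib
from collections import Counter


def dedupe_photos(photos: dict[str, bytes]) -> list[str]:
    """Keep one photo name per distinct byte content (exact duplicates only)."""
    seen: set[str] = set()
    kept: list[str] = []
    for name, data in photos.items():
        digest = hashlib.sha256(data).hexdigest()
        if digest not in seen:
            seen.add(digest)
            kept.append(name)
    return kept


def count_rooms(room_labels: list[str]) -> dict[str, int]:
    """Count photos per room type, given labels from a classifier (not shown)."""
    return dict(Counter(room_labels))


def linearize(record: dict[str, object]) -> str:
    """Flatten a record into 'key: value' pairs for a text-generation model."""
    return " | ".join(f"{key}: {value}" for key, value in record.items())


# Hypothetical sample data: two identical photos and one distinct photo.
photos = {"a.jpg": b"kitchen-bytes", "b.jpg": b"kitchen-bytes", "c.jpg": b"bedroom-bytes"}
kept = dedupe_photos(photos)

# Assumed classifier output for the kept photos.
labels = ["kitchen", "bedroom"]

# Combine listing metadata with room counts, then build the model input string.
record = {"price": 1200, "floor": 3, "parking": True, **count_rooms(labels)}
model_input = linearize(record)
```

The resulting `model_input` string is what would be passed to the transformer model in the final step.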

ChatGPT Powered Voice-based Customer Support

A great learning experience that involved integrating the ChatGPT API with Twilio to develop an automated voice-based customer support system.

Customers call a business's Twilio phone number and ask a question, which ChatGPT processes to provide a personalized response based on the customer's input. The ChatGPT model has been trained to understand and respond to a wide range of customer queries.
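
A call flow like this could be wired up along the following lines. This is only a sketch under stated assumptions: it assumes the caller's speech has already been transcribed to text (for example, by Twilio's speech `<Gather>`, which posts the transcript to a webhook), and it replaces the real ChatGPT API call with a stand-in function so the example runs offline. The names `build_twiml_reply` and `fake_answer` are hypothetical. The reply is returned as TwiML, where a `<Say>` element inside `<Response>` tells Twilio to read the text aloud.

```python
from typing import Callable
from xml.sax.saxutils import escape


def build_twiml_reply(speech_text: str, answer_fn: Callable[[str], str]) -> str:
    """Return a TwiML document that speaks the model's answer to the caller."""
    answer = answer_fn(speech_text)
    # Escape the answer so characters such as '&' do not break the XML.
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        f"<Response><Say>{escape(answer)}</Say></Response>"
    )


def fake_answer(question: str) -> str:
    # Stand-in for the real ChatGPT API call (hypothetical, illustration only).
    return "We are open 9 AM to 5 PM & closed on Sundays."


twiml = build_twiml_reply("When are you open?", fake_answer)
```

Passing the model call in as `answer_fn` keeps the webhook logic testable without network access; in a deployed system, the function would call the ChatGPT API instead.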

Information Extraction From Invoices

Developed an optical character recognition (OCR) system using the PaddleOCR framework to recognize and extract text from invoices accurately. A SageMaker pipeline was used to deploy the system, which enabled continuous model training and deployment with built-in model monitoring and version control. The system automated invoice processing for a small business, significantly reducing manual data entry time.

Education

2019 - 2021

Master's Degree in Data Science

Information Technology University - Lahore, Pakistan

2015 - 2019

Bachelor's Degree in Computer Science

Government College University - Lahore, Pakistan

Certifications

JULY 2021 - PRESENT

Machine Learning in Production

Coursera

AUGUST 2020 - PRESENT

AWS Machine Learning Engineer Scholarship Program

Udacity

MAY 2020 - PRESENT

Getting Started with AWS Machine Learning

Coursera

OCTOBER 2019 - PRESENT

Deep Learning Nanodegree

Udacity

AUGUST 2018 - PRESENT

Introduction to Data Science in Python

Coursera

Libraries/APIs

Node.js, Google Speech-to-Text API, Python API, Pandas, Matplotlib, Twilio API, Sockets, API Development, PyTorch, Scikit-learn, PySpark, Rasa NLU

Tools

Amazon SageMaker, Amazon Elastic Container Registry (ECR), Amazon Elastic Container Service (Amazon ECS), BigQuery, Seaborn, Amazon Cognito, Amazon Textract, AutoML, ChatGPT

Languages

Python 3, Python

Frameworks

Flask

Storage

Databases, Amazon S3 (AWS S3), Google Cloud, MongoDB, Amazon DynamoDB

Paradigms

Data Science, ETL, Object-oriented Programming (OOP)

Platforms

Azure, Amazon Web Services (AWS), AWS Lambda, Amazon EC2, Twilio, Firebase, Mixpanel, Google Cloud Platform (GCP), Docker

Other

Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), Web Development, Machine Learning Operations (MLOps), Web App Deployment, OpenAI, OpenAI GPT-4 API, OpenAI GPT-3 API, Machine Learning, Deep Learning, Big Data, Information Retrieval, Model Development, Chatbots, Data Structures, Algorithms, Data Warehousing, Custom Solutions, Android App Design, Model Deployment, Generative Pre-trained Transformer 3 (GPT-3), Computer Vision, Classification, Amazon API Gateway, System Architecture, Transformers, FastAPI, Scripting, CI/CD Pipelines, Artificial Intelligence (AI), OCR, Transformer Models
