Ujjwal Agrawal, Developer in Varanasi, Uttar Pradesh, India

Ujjwal Agrawal

Verified Expert in Engineering

Machine Learning Developer

Location
Varanasi, Uttar Pradesh, India
Toptal Member Since
January 14, 2022

Ujjwal is a seasoned lead machine learning architect with 4+ years of experience, marked by reshaping credit underwriting for prominent Indian financial institutions. He builds end-to-end ML/AI solutions, and his accomplishments include pioneering NLP pipelines, developing credit risk models, and driving innovation through an AutoML SaaS product. His technical expertise includes Python, ML, data science, and full-stack development.

Portfolio

Monsoon CreditTech Pvt
Agile Sprints, Team Management, Technical Hiring, Research, GPT...
SYNDIKAT7 GmbH
Artificial Intelligence (AI), OCR, Azure Machine Learning...
Monsoon CreditTech Pvt
Python 3, Data Science, Data Scraping, Agile Sprints, Machine Learning, Pandas...

Experience

Availability

Part-time

Preferred Environment

Python 3, OpenAI GPT-3 API, Data Science, Generative Pre-trained Transformers (GPT), Product Management, Agile Sprints, Azure, Amazon S3 (AWS S3), Amazon EC2, Azure Cognitive Services

The most amazing...

...code architecture change I've made brought the latency of the feature creation script down from 20 seconds to two seconds.

Work Experience

Lead ML Engineer

2021 - PRESENT
Monsoon CreditTech Pvt
  • Worked on creating an end-to-end NLP pipeline to consume the SMS data of a person for credit underwriting. This pipeline includes custom text classification and named-entity recognition (NER) models. Improved the existing model performance significantly.
  • Worked on creating an end-to-end NLP-based pipeline to consume users' bank-statement data. Upgraded the existing text classification model by improving the model architecture and algorithm used. Improved the existing model performance by 5 AUC points.
  • Handled two active client projects simultaneously and managed a team of six people. Conducted several one-on-one sessions for team building and brown bag sessions to enhance the team's skills.
  • Worked on building an end-to-end ML SaaS product: oversaw the software architecture, created several code styling rules, and refactored the code. Managed three developers and contributed heavily to the architecture design.
  • Architected critical services, such as notifications and resource allocation for ML modeling pipelines, for a SaaS ML product.
  • Worked extensively on the code review process to ensure the quality and correctness of the code. Also had several performance review sessions to give timely feedback to team members.
  • Led technical hiring and conducted more than 100 interviews. Hired a team of 10+ data scientists and ML engineers. Created several screening tests for quality hiring. Also worked heavily on team building and onboarding exercises.
Technologies: Agile Sprints, Team Management, Technical Hiring, Research, GPT, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Machine Learning, Python 3, Data Science, Product Management, SaaS, Credit Risk, Credit Underwriting, Cloud Computing, Machine Learning Operations (MLOps), Docker, Kubernetes, NGINX, Django, Python, Data Analytics, Data Visualization, Artificial Intelligence (AI), Visual Studio Code (VS Code), Linux, Windows, Jupyter Notebook, Exploratory Data Analysis, Pandas, NumPy, uWSGI, Matplotlib, Seaborn, Scikit-learn, Azure, Google Cloud Platform (GCP), Storytelling, IT Project Management, Fintech, Regression, Data Analysis, Software Engineering, REST APIs, Amazon Web Services (AWS), Regex, Data Handling, Data Wrangling, Team Building, Software Design Patterns, Software Architecture, Git, GitHub, JSON, Source Code Review, Code Review, Interviewing, Task Analysis, APIs, Architecture, Web Development, Leadership, XGBoost, Azure Machine Learning, Project Management, Statistics, Decision Trees, Back-end Development, Algorithms, Data Matching, CSV File Processing, GitLab

AI Expert

2023 - 2023
SYNDIKAT7 GmbH
  • Worked on several OCR techniques to extract data from PDFs. Created a well-managed architecture using PyPDF2 to extract data from complex layouts like degree certificates in five different formats.
  • Leveraged the OpenAI API and, with prompt engineering, was able to extract crucial information like university, degree, subject-wise marks, grade, etc., along with the candidate's personal information.
  • Took charge of the entire REST API-based back-end architecture, leveraging the power of FastAPI for async calls.
  • Incorporated the Celery and RabbitMQ server for the background process scheduling to handle the CPU load intelligently.
  • Integrated the Cosmos DB and Azure Blob storage to store the processed and unprocessed data. This helped reduce the OpenAI API cost per operation.
  • Deployed the entire architecture on an SSL-enabled NGINX server powering the Uvicorn server for REST API calls.
  • Managed the entire MLOps and DevOps during the development and deployment cycle single-handedly, resulting in fast delivery of features and a fail-proof system.
Technologies: Artificial Intelligence (AI), OCR, Azure Machine Learning, Artificial Intelligence as a Service (AIaaS), Large Language Models (LLMs), Prompt Engineering, FastAPI, Celery, RabbitMQ, Azure Cosmos DB, Azure Blob Storage API, NGINX, SSL Certificates, Algorithms, Data Matching, GitLab

Machine Learning Engineer

2019 - 2021
Monsoon CreditTech Pvt
  • Worked on risk scorecards for a fintech client using credit underwriting models, reducing the delinquency rate by 30%, increasing the approval rate by 20%, and increasing the client's profit margin by 20%.
  • Worked on a regression problem to predict customer income and achieved a MAPE of 16% in the INR 20,000-50,000 bucket.
  • Deployed a complex model reducing the compute time from 15 seconds to two seconds by making architectural changes in the code.
  • Extensively worked with data from several credit bureaus in India and gained a solid understanding of users' financial data. Created risk scorecards and collection scorecards based on such data sources.
  • Cut down the deployment creation time after validation on machine learning models from an average of 20 days to an average of 8-9 days by automating several steps.
Technologies: Python 3, Data Science, Data Scraping, Agile Sprints, Machine Learning, Pandas, NumPy, Scikit-learn, Azure, Google Cloud Platform (GCP), Docker, Kubernetes, Django, Exploratory Data Analysis, Storytelling, uWSGI, Team Management, IT Project Management, Fintech, Data Analysis, Python, Machine Learning Operations (MLOps), Software Engineering, REST APIs, Data Analytics, Data Visualization, Artificial Intelligence (AI), Visual Studio Code (VS Code), Linux, Windows, Jupyter Notebook, Matplotlib, Seaborn, Regression, Credit Risk, GPT, Natural Language Processing (NLP), Amazon Web Services (AWS), Regex, Data Handling, Data Wrangling, Team Building, Software Design Patterns, Software Architecture, Credit Underwriting, Cloud Computing, Git, GitHub, JSON, Source Code Review, Code Review, Task Analysis, APIs, Architecture, Web Development, XGBoost, Project Management, Statistics, Decision Trees, Back-end Development, Algorithms, Data Matching, CSV File Processing

Associate Data Scientist

2019 - 2019
Celebal Technologies
  • Built a resume parser to extract specific information from a resume, such as a college or a university name, years of experience, courses, skills, previous jobs, etc.
  • Built a speech transcription system to create transcripts of meeting conversations. Used Azure Cognitive Services APIs to build a robust ML pipeline that takes a recording, converts the speech to text, and produces a transcript.
  • Built a sequence-aware content recommendation system to evolve customers' journey from a researcher phase to a buyer phase and make the user journey more engaging. Consumed events data generated by the user and could predict within 0.1 seconds.
  • Built robust NGINX server pipelines for a machine learning model to support up to 50 API requests per second. Deployed the model on eight-core VMs with 16 GB RAM and a latency of 0.1 seconds.
Technologies: Azure, Amazon S3 (AWS S3), Amazon Web Services (AWS), Apache Kafka, MongoDB, PyMongo, Python 3, Flask, Azure Cognitive Services, Amazon EC2, uWSGI, NGINX, Regex, GPT, Natural Language Processing (NLP), Machine Learning, Data Science, Data Scraping, Data Handling, Data Wrangling, SQL, NoSQL, Python, Data Analytics, Data Visualization, Artificial Intelligence (AI), Visual Studio Code (VS Code), Linux, Windows, Jupyter Notebook, Android, Java, Exploratory Data Analysis, Pandas, NumPy, Agile Sprints, Matplotlib, Seaborn, Scikit-learn, Storytelling, Data Analysis, Software Engineering, REST APIs, Team Building, Software Design Patterns, Software Architecture, Cloud Computing, Git, GitHub, JSON, Task Analysis, APIs, Speech Recognition, Back-end Development, Algorithms, Data Matching, CSV File Processing

Resume and Degree Information Extractor Using OCR and LLM

The project aimed to develop a versatile system capable of handling resumes and degree certificates in any format, extracting essential information for a company receiving thousands of resumes. A proof of concept (POC) was created to demonstrate the automation capabilities in resume shortlisting using machine learning.

The key components I worked on included:

• FastAPI: Employed for handling asynchronous API calls efficiently.

• PyPDF2: Used to extract text from PDF documents.

• Azure Cognitive Service (Form Recognizer): Implemented for extracting information from complex degree certificate layouts.

• OpenAI GPT-3.5 API: Utilized for extracting and classifying crucial information into tags.

• Azure Blob and Cosmos DB: Used for storing raw and processed data.

• Celery and RabbitMQ server: Employed for scheduling background processes.

• SSL-activated NGINX server: Implemented to ensure a secure server architecture.

• Git and Azure DevOps: Utilized for hosting the code repository and managing infrastructure.

• Poetry: Employed for requirements and dependency management.

• Vault Services: Used to store credentials securely.
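
The source does not say how storing processed data lowered the OpenAI API cost. One plausible mechanism, shown here purely as an assumption, is to key each document by a hash of its content and skip the paid extraction call when a result is already stored. In this minimal sketch a plain dictionary stands in for the database, and `fake_extract`, `ExtractionCache`, and `document_key` are hypothetical names, not part of the project.

```python
import hashlib


def document_key(raw_bytes: bytes) -> str:
    """Derive a stable key from the document content (assumed scheme)."""
    return hashlib.sha256(raw_bytes).hexdigest()


class ExtractionCache:
    """Caches extraction results so identical documents are processed once."""

    def __init__(self, extract_fn):
        self._extract_fn = extract_fn  # stand-in for the paid extraction call
        self._store = {}               # stand-in for a persistent document store
        self.api_calls = 0             # counts how often extract_fn really ran

    def get(self, raw_bytes: bytes) -> dict:
        key = document_key(raw_bytes)
        if key not in self._store:
            # Cache miss: pay for one extraction and remember the result.
            self.api_calls += 1
            self._store[key] = self._extract_fn(raw_bytes)
        return self._store[key]


def fake_extract(raw_bytes: bytes) -> dict:
    """Hypothetical placeholder for an LLM-based extraction step."""
    return {"size_in_bytes": len(raw_bytes)}


cache = ExtractionCache(fake_extract)
cache.get(b"degree certificate A")
cache.get(b"degree certificate A")  # served from the store, no second call
cache.get(b"degree certificate B")
```

With this design, re-uploading the same file costs nothing extra, at the price of treating any byte-level change as a new document.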

Income Estimation | Regression Problem

The objective was to predict the income of users given their credit bureau data.

I achieved a MAPE of 16% in the INR 20,000-50,000 income segment. The overall MAPE was 21% on the out-of-time dataset.

This model is currently deployed and being used for policy building and FOIR calculation.
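
For readers unfamiliar with the metric, MAPE (mean absolute percentage error) averages the absolute error of each prediction relative to the true value. The sketch below shows the standard definition and how it can be restricted to an income bucket; the function names and the sample incomes are illustrative assumptions, not data from the project.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, expressed in percent."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def mape_in_bucket(actual, predicted, low, high):
    """MAPE computed only on rows whose true value lies in [low, high]."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if low <= a <= high]
    if not pairs:
        raise ValueError("no observations fall inside the bucket")
    bucket_actual, bucket_predicted = zip(*pairs)
    return mape(bucket_actual, bucket_predicted)


# Made-up monthly incomes in INR, for illustration only.
true_income = [10000, 25000, 40000, 80000]
estimated_income = [12000, 30000, 36000, 60000]
bucket_error = mape_in_bucket(true_income, estimated_income, 20000, 50000)
```

Because the error is divided by the true value, MAPE tends to be higher for low incomes, which is one reason to report it per segment as well as overall.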

Risk ScoreCard | Classification Problem

A major problem in fintech lending is non-performing assets (NPAs). The goal of this project was to predict the probability of default in the next 9-12 months.

I built numerous scorecards based on different data sources like bureau, banking, and SMS data.

Achievements:
• Achieved an AUC of 80 (0.60 Gini) on the out-of-time dataset.
• Brought down the existing delinquency rate by 30% while keeping the approval rate intact. Also reduced delinquency by 50% by decreasing the current approval rate by 10%.
• Deployed seven such models built on several different data sources.
• Met API latency requirements by modifying the models with hardly any impact on their performance.
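
The AUC and Gini figures above are linked by the standard identity Gini = 2 × AUC − 1, so an AUC of 0.80 corresponds to a Gini of 0.60. The sketch below shows that conversion together with a simple pairwise AUC computation; the labels and scores are made-up illustration values, and the function names are hypothetical.

```python
def auc_pairwise(labels, scores):
    """AUC as the share of (defaulter, non-defaulter) pairs ranked correctly.

    Ties in score count as half a correct pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


def gini_from_auc(auc):
    """Convert AUC (on a 0-1 scale) to the Gini coefficient."""
    return 2.0 * auc - 1.0


# Illustrative data: 1 = default, 0 = no default.
example_labels = [0, 0, 1, 1]
example_scores = [0.1, 0.4, 0.35, 0.8]
example_auc = auc_pairwise(example_labels, example_scores)
example_gini = gini_from_auc(example_auc)
```

The pairwise loop is quadratic in the number of observations, so it is meant for explanation; a rank-based implementation is the usual choice on real portfolios.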

Collections ScoreCard | Classification Problem

The overall objective was to predict the probability of an EMI bounce in the next three months. With robust EDA and a flag creation process, the model achieved 92 AUC on the out-of-time dataset. This model is currently deployed and used every month.

Content Recommendation Engine

The challenge was to build a content recommendation engine for India's biggest auto portal company. The current productionized model achieved a CTR of 4.5% with an average response time of 0.1 seconds.
Built an end-to-end Kafka and MongoDB-based pipeline to continuously process the event data.

Languages

Python 3, Python, SQL, Java, Regex

Libraries/APIs

Pandas, Matplotlib, NumPy, Scikit-learn, REST APIs, XGBoost, PyMongo, Azure Cognitive Services, Azure Blob Storage API, Azure Computer Vision API

Tools

Seaborn, Git, GitHub, uWSGI, NGINX, Azure Machine Learning, GitLab, Celery, RabbitMQ

Paradigms

Data Science

Platforms

Jupyter Notebook, Visual Studio Code (VS Code), Linux, Windows, Azure, Google Cloud Platform (GCP), Docker, Software Design Patterns, Android, Apache Kafka, Kubernetes, Amazon Web Services (AWS), Amazon EC2

Storage

JSON, MongoDB, Amazon S3 (AWS S3), NoSQL, Azure Cosmos DB

Industry Expertise

Project Management

Other

Exploratory Data Analysis, Agile Sprints, Storytelling, Fintech, Data Analysis, Software Engineering, Data Handling, Technical Hiring, Credit Underwriting, Data Analytics, Data Visualization, Artificial Intelligence (AI), Source Code Review, Code Review, Interviewing, Task Analysis, APIs, Decision Trees, Back-end Development, Algorithms, Data Matching, CSV File Processing, Data Scraping, Machine Learning, Team Management, IT Project Management, Regression, Credit Risk, Natural Language Processing (NLP), Machine Learning Operations (MLOps), Data Wrangling, Team Building, Research, Software Architecture, Product Management, SaaS, Cloud Computing, Architecture, Web Development, Leadership, GPT, Generative Pre-trained Transformers (GPT), Statistics, Speech Recognition, OpenAI GPT-3 API, OCR, Artificial Intelligence as a Service (AIaaS), Large Language Models (LLMs), Prompt Engineering, FastAPI, SSL Certificates, Azure Form Recognizer

Frameworks

Flask, Django

Education

2015 - 2019

Bachelor's Degree in Information Technology

Arya College of Engineering & IT - Jaipur, India

Certifications

DECEMBER 2018 - PRESENT

Data Analytics Using Python

Edugrad

SEPTEMBER 2018 - PRESENT

Git and GitHub

Udemy
