Verified Expert in Engineering
Machine Learning Developer
Ujjwal is a seasoned lead machine learning architect with 4+ years of experience, marked by reshaping credit underwriting for prominent Indian financial institutions. His adeptness at constructing end-to-end ML/AI solutions has had a transformative impact. Ujjwal's accomplishments include pioneering NLP pipelines, building credit risk models, and driving innovation through an AutoML SaaS product. His technical expertise includes Python, ML, data science, and full-stack development.
Python 3, OpenAI GPT-3 API, Data Science, Generative Pre-trained Transformers (GPT), Product Management, Agile Sprints, Azure, Amazon S3 (AWS S3), Amazon EC2, Azure Cognitive Services
The most amazing...
...code architecture change I've made brought the latency of the feature creation script down from 20 seconds to two seconds.
Lead ML Engineer
Monsoon CreditTech Pvt
- Worked on creating an end-to-end NLP pipeline to consume the SMS data of a person for credit underwriting. This pipeline includes custom text classification and named-entity recognition (NER) models. Improved the existing model performance significantly.
- Worked on creating an end-to-end NLP-based pipeline to consume users' bank-statement data. Upgraded the existing text classification model by improving the model architecture and algorithm used. Improved the existing model performance by 5 AUC points.
- Handled two active client projects simultaneously and managed a team of six people. Conducted several one-on-one sessions for team building and brown bag sessions to enhance the team's skills.
- Worked on building an end-to-end ML SaaS product, oversaw the software architecture, created several code styling rules, and refactored the code. Managed three developers. Contributed heavily to the architecture design.
- Architected critical services like notifications and resource allocation for ML modeling pipelines for a SaaS ML product.
- Worked extensively on the code review process to ensure the quality and correctness of the code. Also had several performance review sessions to give timely feedback to team members.
- Led technical hiring and conducted more than 100 interviews. Hired a team of 10+ data scientists and ML engineers. Created several screening tests for quality hiring. Also worked heavily on team building and onboarding exercises.
- Contributed to several OCR techniques to extract data from PDFs. Created a well-managed architecture using PyPDF2 to extract data from complex layouts like degree certificates in five different formats.
- Leveraged the OpenAI API and, with prompt engineering, was able to extract crucial information like university, degree, subject-wise marks, grade, etc., along with the candidate's personal information.
- Took charge of the entire REST API-based back-end architecture, leveraging the power of FastAPI for async calls.
- Incorporated the Celery and RabbitMQ server for the background process scheduling to handle the CPU load intelligently.
- Integrated the Cosmos DB and Azure Blob storage to store the processed and unprocessed data. This helped reduce the OpenAI API cost per operation.
- Deployed the entire architecture on an SSL-enabled NGINX server powering the Uvicorn server for REST API calls.
- Managed the entire MLOps and DevOps during the development and deployment cycle single-handedly, resulting in fast delivery of features and a fail-proof system.
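The profile states that storing processed data helped reduce the OpenAI API cost per operation but does not say how. One plausible mechanism, sketched here purely as an assumption, is to key each stored result by a hash of the document so that a repeated upload never triggers a second paid call. The names `ExtractionCache` and `fake_extract` are hypothetical, and an in-memory dict stands in for a persistent store such as Cosmos DB.

```python
import hashlib


class ExtractionCache:
    """Caches extraction results by content hash (illustrative sketch only)."""

    def __init__(self, extract_fn):
        self._extract_fn = extract_fn  # the expensive call, e.g. an LLM request
        self._store = {}               # stand-in for a persistent document store
        self.paid_calls = 0            # counts how often the expensive call ran

    def get(self, document_bytes):
        # Identical documents produce identical SHA-256 keys.
        key = hashlib.sha256(document_bytes).hexdigest()
        if key not in self._store:
            self.paid_calls += 1
            self._store[key] = self._extract_fn(document_bytes)
        return self._store[key]


def fake_extract(document_bytes):
    # Hypothetical placeholder for the real extraction step.
    return {"size_in_bytes": len(document_bytes)}


cache = ExtractionCache(fake_extract)
first = cache.get(b"degree certificate contents")
second = cache.get(b"degree certificate contents")  # served from the cache
```

In this sketch the second lookup returns the stored result, so `cache.paid_calls` stays at 1.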
Machine Learning Engineer
Monsoon CreditTech Pvt
- Worked on risk scorecards for a fintech client using credit underwriting models, reducing the delinquency rate by 30%, increasing the approval rate by 20%, and increasing the client's profit margin by 20%.
- Worked on a regression problem to predict customer income and achieved a MAPE of 16% in the INR 20,000-50,000 bucket.
- Deployed a complex model reducing the compute time from 15 seconds to two seconds by making architectural changes in the code.
- Extensively worked with data from several credit bureaus in India and gained a solid understanding of users' financial data. Created risk scorecards and collection scorecards based on such data sources.
- Cut the time needed to create a deployment after machine learning model validation from an average of 20 days to an average of 8-9 days by automating several steps.
Associate Data Scientist
- Built a resume parser to extract specific information from a resume, such as a college or a university name, years of experience, courses, skills, previous jobs, etc.
- Built a speech transcription system to create transcripts of conversations in meetings. Used Azure Cognitive Services APIs and built a robust ML pipeline that takes a recording, converts the speech to text, and produces a transcript.
- Built a sequence-aware content recommendation system to evolve customers' journey from a researcher phase to a buyer phase and make the user journey more engaging. Consumed events data generated by the user and could predict within 0.1 seconds.
- Built robust NGINX server pipelines for a machine learning model to support up to 50 API requests per second. Deployed the model on eight-core VMs with 16 GB RAM and a latency of 0.1 seconds.
Resume and Degree Information Extractor Using OCR and LLM
The key components I worked on included:
• FastAPI: Employed for handling asynchronous API calls efficiently.
• PyPDF2: Used to extract text and other information from PDF documents.
• Azure Cognitive Service (Form Recognizer): Implemented for extracting information from complex degree certificate layouts.
• OpenAI GPT-3.5 API: Utilized for extracting and classifying crucial information.
• Azure Blob and Cosmos DB: Used for storing raw and processed data.
• Celery and RabbitMQ server: Employed for scheduling background processes.
• SSL-activated NGINX server: Implemented to ensure a secure server architecture.
• Git and Azure DevOps: Utilized for hosting the code repository and
• Poetry: Employed for requirements and dependency management.
• Vault Services: Used to store credentials securely.
Income Estimation | Regression Problem
I achieved a MAPE of 16% in the INR 20,000-50,000 income segment. The overall MAPE was 21% on the out-of-time dataset.
This model is currently deployed and being used for policy building and FOIR calculation.
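For readers unfamiliar with the metric, MAPE (mean absolute percentage error) averages the absolute error as a fraction of the actual value. The sketch below shows the standard formula and how it can be restricted to an income bucket. The function names and the three sample incomes are made up for illustration and are not taken from the project.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def mape_in_bucket(actual, predicted, low, high):
    """MAPE over only the rows whose actual value lies in [low, high]."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if low <= a <= high]
    return mape([a for a, _ in pairs], [p for _, p in pairs])


# Illustrative monthly incomes in INR (invented values).
actual_income = [25_000, 40_000, 80_000]
predicted_income = [20_000, 44_000, 60_000]

overall = mape(actual_income, predicted_income)
bucket = mape_in_bucket(actual_income, predicted_income, 20_000, 50_000)
```

On this toy data the bucket MAPE is about 15% and the overall MAPE is about 18.3%, which shows how a single segment can score better than the full population.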
Risk ScoreCard | Classification Problem
I built numerous scorecards based on different data sources like bureau, banking, and SMS data.
• Achieved an AUC of 0.80 (a Gini of 0.60) on the out-of-time dataset.
• Brought down the existing delinquency rate by 30%, keeping the approval rate intact. I also reduced the delinquency by 50% by decreasing the current approval rate by 10%.
• Deployed seven such models built on several different data sources.
• Met API latency requirements by modifying the models with hardly any impact on their performance.
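The AUC and Gini figures quoted for the scorecards are two views of the same quantity: Gini = 2 × AUC − 1. The sketch below computes AUC as the probability that a randomly chosen positive case scores above a randomly chosen negative one, then converts it. The labels and scores are invented for illustration, and the pairwise loop is a simple O(n²) version rather than what a production library would use.

```python
def auc(labels, scores):
    """AUC via pairwise comparison; ties between scores count as half a win."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def gini_from_auc(auc_value):
    """Gini coefficient implied by an AUC value."""
    return 2.0 * auc_value - 1.0


# Invented example: 1 marks a delinquent account, scores are predicted risk.
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.1]

toy_auc = auc(labels, scores)
toy_gini = gini_from_auc(toy_auc)
```

Applying `gini_from_auc` to an AUC of 0.80 gives 0.60, matching the pair of figures reported above.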
Collections ScoreCard | Classification Problem
Content Recommendation Engine
Built an end-to-end Kafka and MongoDB-based pipeline to continuously process the event data.
Python 3, Python, SQL, Java, Regex
Pandas, Matplotlib, NumPy, Scikit-learn, REST APIs, XGBoost, PyMongo, Azure Cognitive Services, Azure Blob Storage API, Azure Computer Vision API
Seaborn, Git, GitHub, uWSGI, NGINX, Azure Machine Learning, GitLab, Celery, RabbitMQ
Jupyter Notebook, Visual Studio Code (VS Code), Linux, Windows, Azure, Google Cloud Platform (GCP), Docker, Software Design Patterns, Android, Apache Kafka, Kubernetes, Amazon Web Services (AWS), Amazon EC2
JSON, MongoDB, Amazon S3 (AWS S3), NoSQL, Azure Cosmos DB
Exploratory Data Analysis, Agile Sprints, Storytelling, Fintech, Data Analysis, Software Engineering, Data Handling, Technical Hiring, Credit Underwriting, Data Analytics, Data Visualization, Artificial Intelligence (AI), Source Code Review, Code Review, Interviewing, Task Analysis, APIs, Decision Trees, Back-end Development, Algorithms, Data Matching, CSV File Processing, Data Scraping, Machine Learning, Team Management, IT Project Management, Regression, Credit Risk, Natural Language Processing (NLP), Machine Learning Operations (MLOps), Data Wrangling, Team Building, Research, Software Architecture, Product Management, SaaS, Cloud Computing, Architecture, Web Development, Leadership, GPT, Generative Pre-trained Transformers (GPT), Statistics, Speech Recognition, OpenAI GPT-3 API, OCR, Artificial Intelligence as a Service (AIaaS), Large Language Models (LLMs), Prompt Engineering, FastAPI, SSL Certificates, Azure Form Recognizer
Bachelor's Degree in Information Technology
Arya College of Engineering & IT - Jaipur, India
Data Analytics Using Python
Git and GitHub