Surbhi Gupta, Developer in Jalpaiguri, West Bengal, India

Surbhi Gupta

Verified Expert in Engineering

Data Scientist and Machine Learning Developer

Jalpaiguri, West Bengal, India
Toptal Member Since
November 2, 2021

Surbhi, previously a CTO at a GenAI startup and an assistant professor at Manipal University Jaipur (MUJ), is a generative AI, ML, and NLP expert with 5+ years of experience. She has designed and developed ML-based end-to-end solutions for startups at Toptal and Fortune 500 clients at Utopia. Her expertise includes ML, deep learning, NLP, computer vision, LLMs, GPT, AI, MLOps, and AWS. Surbhi has solved problems in the EAM, marketing, finance, chatbot, and crypto industries and has published research in robotics and optimization.






Preferred Environment

Python, TensorFlow, Scikit-learn, OpenCV, Hugging Face, OpenAI, PyTorch, Amazon Web Services (AWS), Generative Pre-trained Transformers (GPT)

The most amazing...

...generative AI solution I've developed interacts with users to identify their brand's purpose and generates BVP and marketing content with text and images.

Work Experience

LLM Expert

2023 - 2024
  • Developed an innovative application to automatically curate and update newsletters. Leveraged LLMs to rewrite news articles, ensuring that the content resonated with the unique characteristics and interests of the target audience.
  • Used AI to generate dynamic weather reports for cities nationwide. Taking daily weather forecasts as input, the application wrote interesting weather reports tailored to each city's specific climate conditions.
  • Orchestrated the development of a robust back-end infrastructure hosted on AWS, utilizing a diverse array of services including Step Functions, EventBridge Scheduler, Bedrock, DynamoDB, Amplify, and Lambda functions.
  • Implemented automated workflow processes using AWS Step Functions, enabling efficient content aggregation, transformation, and distribution.
  • Integrated Google Programmable Search Engine API, News APIs, and Google Maps API into the application to augment its functionality and provide users with comprehensive and up-to-date information.
Technologies: Artificial Intelligence (AI), Large Language Models (LLMs), Amazon Web Services (AWS), ChatGPT API, OpenAI, Amazon Bedrock, Python, Data Scraping, OpenAI GPT-4 API, Open-source LLMs

Co-founder and CTO

2022 - 2023
Freelance Client
  • Secured significant investment capital for the company in a SAFE round through effective investor engagement and a clear explanation of the technology.
  • Developed the first version of the product, meeting all key functionality requirements.
  • Led a technical due diligence process assessed by a renowned AI company and investors.
  • Spearheaded the establishment of a talented team through effective interviewing and assessment methods.
  • Utilized OpenAI models, effective prompt-engineering strategies, and few-shot learning to pioneer AI-led conversations with human users. Extracted valuable insights from these interactions and generated impactful marketing propositions.
  • Designed a feedback mechanism to collect training data from field experts for fine-tuning the models.
Technologies: Amazon Web Services (AWS), AWS Amplify, Artificial Intelligence (AI), OpenAI GPT-3 API, User Feedback, Few-shot Learning, Vue, TypeScript, Large Language Models (LLMs)

GPT-3 Expert

2022 - 2023
Alec Beglarian
  • Developed an MVP for email generation using OpenAI GPT-3 APIs.
  • Developed the MVP on the AWS cloud platform, which is integrated with a database, storage, Lambda functions, Amazon SES, etc.
  • Fine-tuned the OpenAI model for data correction to be used for email generation.
Technologies: Artificial Intelligence (AI), Deep Learning, OpenAI, ChatGPT, JavaScript, React, Generative Pre-trained Transformer 3 (GPT-3), Large Language Models (LLMs), Fine-tuning

ML Developer

2022 - 2022
SimpliCapital LLC
  • Deployed machine learning models on the AWS cloud platform, using services such as Lambda functions, Amazon SageMaker, Amazon SNS, and Amazon S3.
  • Improved machine learning model performance for prediction of finance data.
  • Created Amazon SageMaker training and inference pipelines for ML models.
Technologies: Python, Jupyter, Amazon Web Services (AWS), Machine Learning Operations (MLOps), Machine Learning, Data Science, Node.js, Amazon SageMaker

AI Specialist | NLP Python Developer

2022 - 2022
Toptal Client
  • Improved the NLP solution to identify business prospects in financial data.
  • Used BERT-based POS tagging to extract important features from large documents.
  • Made the solution interpretable by identifying words used to mark sentences relevant to business prospects.
  • Used a Hugging Face transformer model for semantic similarity analysis.
Technologies: Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), GPU Computing, Python, Semantics, Transformers, PyTorch, Large Language Models (LLMs), Retrieval-augmented Generation (RAG), Open-source LLMs

AI Specialist | Python and ML Developer

2022 - 2022
  • Used OpenAI for solving Q&A and document query problems for a chatbot.
  • Provided an open-source alternative to the OpenAI document query solution with better accuracy.
  • Identified groups of chat clusters using agglomerative clustering based on semantic similarity.
Technologies: Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), OpenAI, Transformers, Clustering, Scikit-learn, Beautiful Soup, Web Scraping, Deep Learning, GPU Computing, Generative Pre-trained Transformer 3 (GPT-3), Data Analysis, Sequence Models, PyTorch, Large Language Models (LLMs), Retrieval-augmented Generation (RAG), Embeddings from Language Models (ELMo), Open-source LLMs

AI Specialist | Python and ML Developer

2021 - 2022
  • Performed stance detection and topic modeling on social media data, using unsupervised and semi-supervised methods.
  • Fine-tuned a pre-trained seq2seq transformer model for custom summarization tasks.
  • Evaluated NLP tasks with metrics such as BERTScore and ROUGE, achieving a BERTScore of 0.89 on the summarization task.
Technologies: Machine Learning, Python, Scikit-learn, PyTorch, Transformers, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Transfer Learning, Sequence Models, Data Science, Graphics Processing Unit (GPU), BERT, Named-entity Recognition (NER), Entity Extraction, Matplotlib, LSTM, Classification, TensorFlow, Pandas, Artificial Intelligence (AI), Deep Learning, Time Series Analysis, Time Series, Data Analysis, GPU Computing, Clustering, Cloud, Large Language Models (LLMs), Fine-tuning, Embeddings from Language Models (ELMo)
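
The ROUGE metric mentioned above can be illustrated with a minimal sketch of ROUGE-1 (unigram overlap) F1. This assumes simple whitespace tokenization and lowercasing; the function name `rouge1_f1` is hypothetical, and this is not the evaluation code used in the project.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Clipped overlap: each word counts at most as often as it appears in both texts.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


score = rouge1_f1("the cat sat on the mat", "the cat lay on the mat")
```

In this toy pair, five of the six words overlap in each direction, so precision, recall, and F1 are all 5/6.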

Senior Data Science Engineer

2017 - 2021
  • Developed an end-to-end machine learning solution for information extraction from scanned documents and diagrams, which helped secure a deal with a Fortune 100 company. Deployed the project to the client as a cloud application.
  • Built a solution to identify equipment classes from descriptions and tags, which was used to deliver services to several clients.
  • Created a solution that identifies valid values from product descriptions in material master data; it was used to deliver services to various clients.
  • Developed a solution to identify different shapes, tables, and text in diagram images. This required the application of several computer vision, image processing, deep learning, and machine learning techniques.
Technologies: Machine Learning, Deep Learning, Computer Vision, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), Python, TensorFlow, Git, Streamlit, OCR, Scikit-learn, Image Processing, OpenCV, Convolutional Neural Networks (CNN), NumPy, Amazon S3 (AWS S3), AWS Lambda, Tesseract, You Only Look Once (YOLO), Object Detection, Transfer Learning, Text Detection, Keras, Amazon Web Services (AWS), PyTorch, Cloud Deployment, Machine Learning Operations (MLOps), Named-entity Recognition (NER), Entity Extraction, SQL, Matplotlib, Algorithms, LSTM, Classification, Image Recognition, Pandas, Artificial Intelligence (AI), Data Analysis, Team Leadership, GPU Computing, Clustering, Cloud, Machine Vision, Code Review, Source Code Review, Technical Hiring, Interviewing, Data Scraping, PDF Scraping, Embeddings from Language Models (ELMo)

Assistant Professor

2017 - 2017
Manipal University Jaipur
  • Lectured on subjects such as robotics and mechatronics system design, including topics on machine learning, artificial intelligence, and sensors.
  • Conducted laboratory experiments to give students hands-on experience with MATLAB, control systems, and sensors.
  • Assigned term papers, ran online quizzes, and evaluated the performance of the students.
Technologies: Robotics, Machine Learning, Mechatronics, University Teaching, Matplotlib, Algorithms, Scikit-learn, Code Review, Interviewing

Senior Research Fellow

2012 - 2016
CSIR-Central Scientific Instruments Organisation
  • Optimized the design of a minimally invasive surgical robotic arm and formulated the kinematic control for trajectory tracking by its end-effector. Published two papers on this work.
  • Improved the design of a passive bipedal robot and applied underactuation in simulation so that it walks stably on slopes from zero to 30 degrees. Published two papers on this work.
  • Taught subjects like industrial control and robotics to diploma-level students.
Technologies: Robotics, Python, Control Systems, Underactuation, Research, Matplotlib, Algorithms, Classification, Scikit-learn, OpenCV, Artificial Intelligence (AI), Computer Vision, Time Series Analysis, Time Series

A Brief Review of Dynamics and Control of Underactuated Biped Robots

This paper summarizes various designs, models, and control strategies used to enable stable walking and running for underactuated biped robots. As the first author of this publication, I worked on the literature survey and the writing.

Split Compound Words
I developed and published a GitHub repository that splits compound words in a line of text into a collection of English dictionary words. For non-English text, an option to translate to English first is also available.
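
One common way to split a run-together string into dictionary words is dynamic programming over prefixes. The sketch below is a minimal illustration under that assumption, not the repository's code; the tiny `DICTIONARY` set and the `split_compound` name are hypothetical, and a real run would load a full English word list.

```python
from typing import List, Optional, Set

# Hypothetical toy dictionary used only for illustration.
DICTIONARY: Set[str] = {"sun", "flower", "rain", "coat", "book", "store", "keeper"}


def split_compound(text: str, words: Set[str] = DICTIONARY) -> Optional[List[str]]:
    """Split `text` into the fewest dictionary words, or return None if impossible."""
    text = text.lower()
    n = len(text)
    # best[i] holds the shortest word list that exactly covers text[:i].
    best: List[Optional[List[str]]] = [None] * (n + 1)
    best[0] = []
    for end in range(1, n + 1):
        for start in range(end):
            prev = best[start]
            piece = text[start:end]
            if prev is not None and piece in words:
                candidate = prev + [piece]
                current = best[end]
                if current is None or len(candidate) < len(current):
                    best[end] = candidate
    return best[n]


parts = split_compound("bookstorekeeper")
```

Preferring the fewest words is one possible tie-breaking choice; a frequency-weighted score would be another.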

Fake Vs. Real News Classification
I created and published a Kaggle notebook that classifies a given news paragraph as fake or real with 95% accuracy. A Kaggle dataset of fake and real news was used for training and testing. I used multinomial Naive Bayes for classification and TF-IDF for feature vectorization.
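
The combination of TF-IDF features and multinomial Naive Bayes can be sketched with the standard library alone. This is a minimal illustration, not the notebook's code: the four training sentences are made up, the function names are hypothetical, and the smoothing constant `alpha` and the smoothed IDF formula are assumptions.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

Model = Tuple[Dict[str, float], Dict[str, float], Dict[str, Dict[str, float]]]


def tokenize(text: str) -> List[str]:
    return text.lower().split()


def fit_idf(docs: List[List[str]]) -> Dict[str, float]:
    """Smoothed inverse document frequency for every term seen in training."""
    n = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def tfidf(doc: List[str], idf: Dict[str, float]) -> Dict[str, float]:
    """Term count times IDF; terms unseen in training are ignored."""
    counts = Counter(t for t in doc if t in idf)
    return {t: c * idf[t] for t, c in counts.items()}


def train(texts: List[str], labels: List[str], alpha: float = 1.0) -> Model:
    docs = [tokenize(t) for t in texts]
    idf = fit_idf(docs)
    # weights[label][term] accumulates the TF-IDF mass of each term per class.
    weights: Dict[str, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    for doc, label in zip(docs, labels):
        for term, w in tfidf(doc, idf).items():
            weights[label][term] += w
    log_prior = {c: math.log(k / len(labels)) for c, k in Counter(labels).items()}
    log_like: Dict[str, Dict[str, float]] = {}
    for c in log_prior:
        total = sum(weights[c].values()) + alpha * len(idf)
        log_like[c] = {t: math.log((weights[c].get(t, 0.0) + alpha) / total) for t in idf}
    return idf, log_prior, log_like


def predict(text: str, model: Model) -> str:
    idf, log_prior, log_like = model
    feats = tfidf(tokenize(text), idf)
    scores = {c: log_prior[c] + sum(w * log_like[c][t] for t, w in feats.items())
              for c in log_prior}
    return max(scores, key=lambda c: scores[c])


# Made-up toy examples, purely for illustration.
TEXTS = ["shocking miracle cure doctors hate", "aliens secretly control the government",
         "city council approves new budget", "central bank raises interest rates"]
LABELS = ["fake", "fake", "real", "real"]
MODEL = train(TEXTS, LABELS)
label = predict("miracle cure shocking aliens", MODEL)
```

With a real dataset, a library pipeline would normally replace these hand-written helpers; the sketch only shows how the two pieces fit together.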

Optimization Using Meta-heuristics
I developed and published a GitHub repository for optimization using popular metaheuristics like tabu search and artificial bee colony optimization (ABC). Tabu search was written from scratch in MATLAB, while ABC was written in Python and is a modified version of the code published by the original author of the algorithm. The modifications were made to enable constrained integer optimization using ABC.
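
As a rough illustration of tabu search on a bounded integer problem, here is a minimal Python sketch. It is not the repository's MATLAB implementation; the function name, the unit-step neighborhood, the box bounds, and the tabu list length are all assumptions chosen to keep the example short.

```python
from collections import deque
from typing import Callable, Deque, Tuple

Point = Tuple[int, int]


def tabu_search(cost: Callable[[Point], float], start: Point, lower: int, upper: int,
                iterations: int = 50, tabu_size: int = 5) -> Point:
    """Minimize `cost` over integer points in the box [lower, upper] x [lower, upper]."""
    current = best = start
    # Recently visited points are forbidden for `tabu_size` moves.
    tabu: Deque[Point] = deque([start], maxlen=tabu_size)
    for _ in range(iterations):
        x, y = current
        neighbors = [(x + dx, y + dy)
                     for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                     if lower <= x + dx <= upper and lower <= y + dy <= upper]
        candidates = [p for p in neighbors if p not in tabu]
        if not candidates:
            break
        # Take the best non-tabu neighbor even if it is worse than the current point.
        current = min(candidates, key=cost)
        tabu.append(current)
        if cost(current) < cost(best):
            best = current
    return best


def bowl(p: Point) -> float:
    return (p[0] - 3) ** 2 + (p[1] + 2) ** 2


result = tabu_search(bowl, start=(0, 0), lower=-10, upper=10)
```

Accepting worse moves while forbidding recent points is what lets the search leave local minima; the best point seen so far is tracked separately.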

Design Optimization of Minimally Invasive Surgical Robot
We optimized the design of a three-degrees-of-freedom serial robotic arm to operate as a minimally invasive robotic surgery (MIRS) arm and attain multiple adjacent possible orifice locations, through which a planar workspace of prespecified geometry can be traced. To achieve this goal, we developed an algorithm to relate the design of such a MIRS arm to the possible orifice positions. The optimization problem was solved using several metaheuristics, such as simulated annealing, tabu search, artificial bee colony optimization, and genetic algorithms, and their performance was compared.

LighVe: Music Synced Lights

LighVe is a set composed of light strips and a mobile app. LighVe can pick up any music that is being played in the room and synchronize the light visualizations with its beats. I worked on the hardware, circuit design, electronics, and app design while another collaborator developed the backend.

Kinematic Control of An Articulated Minimally Invasive Surgical Robotic Arm
We created geometric transformations based on the constraints acting on the end link. Coupled with kinematic relations obtained using conventional techniques, these were used to drive a simulated 6-DOF general articulated robotic arm for minimally invasive operations. The simulated arm verified the method by tracing predefined planar and 3D trajectories.

Clustering Utilities
This repository contains implementations of new clustering methods and utilities based on recent research papers. For example, incremental agglomerative clustering, given old clusters, maps new data to the old clusters and creates new clusters for the unmapped records.
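
The map-or-create behavior described above can be sketched as follows. This is a simplified stand-in that uses centroid distance and a fixed threshold, not the repository's algorithm; the function names and the `threshold` parameter are hypothetical.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def centroid(points: List[Vector]) -> List[float]:
    return [sum(coords) / len(points) for coords in zip(*points)]


def assign_incrementally(clusters: Dict[int, List[Vector]], new_points: List[Vector],
                         threshold: float) -> Dict[int, List[Vector]]:
    """Map each new point to the nearest cluster within `threshold`, else start a new one."""
    updated = {cid: list(pts) for cid, pts in clusters.items()}
    next_id = max(updated, default=-1) + 1
    for point in new_points:
        nearest = min(updated, key=lambda cid: math.dist(point, centroid(updated[cid])),
                      default=None)
        if nearest is not None and math.dist(point, centroid(updated[nearest])) <= threshold:
            updated[nearest].append(point)
        else:
            # Unmapped record: open a new cluster that later points may also join.
            updated[next_id] = [point]
            next_id += 1
    return updated


old = {0: [(0, 0), (0, 2)], 1: [(10, 10)]}
new = [(1, 1), (10, 11), (50, 50), (51, 50)]
result = assign_incrementally(old, new, threshold=3.0)
```

Here the first two new points join the existing clusters, while the two distant points form a new cluster with ID 2.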

Resume Classification
Unsupervised classification of candidate resumes into testing, development, and management categories using Python.
• It uses latent Dirichlet allocation for topic modeling and a count vectorizer for vectorization.
• It also visualizes groups using Word Cloud.

Algorithm Design
This program allows users to practice problems on algorithm design with solutions and explanations. It includes topics like recursion, sorting, trees, graph search, dynamic programming, and problems asked in screening processes.

Motion Planning
This project explores various motion planning algorithms useful in robot motion, including rapidly exploring random trees (RRT), Dijkstra's algorithm for the shortest path, probabilistic roadmaps (PRM), and more.
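
Of the algorithms listed, Dijkstra's shortest-path search is the most compact to show. The sketch below is a generic textbook version on a small weighted graph, not code from the project; the graph, node names, and function name are made up for illustration.

```python
import heapq
import math
from typing import Dict, List, Tuple

Graph = Dict[str, List[Tuple[str, float]]]


def dijkstra(graph: Graph, source: str, target: str) -> Tuple[float, List[str]]:
    """Return (cost, path) of the cheapest route, or (inf, []) if none exists."""
    dist: Dict[str, float] = {source: 0.0}
    prev: Dict[str, str] = {}
    heap: List[Tuple[float, str]] = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            break
        if d > dist.get(node, math.inf):
            continue  # Stale queue entry; a shorter route was already found.
        for neighbor, weight in graph.get(node, []):
            new_dist = d + weight
            if new_dist < dist.get(neighbor, math.inf):
                dist[neighbor] = new_dist
                prev[neighbor] = node
                heapq.heappush(heap, (new_dist, neighbor))
    if target not in dist:
        return math.inf, []
    # Walk the predecessor links back from the target to rebuild the path.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return dist[target], path[::-1]


roadmap: Graph = {
    "A": [("B", 1), ("C", 4)],
    "B": [("C", 2), ("D", 5)],
    "C": [("D", 1)],
    "D": [],
}
cost, path = dijkstra(roadmap, "A", "D")
```

In a roadmap-based planner such as PRM, a search like this would typically run over the sampled roadmap once its nodes and edges have been built.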

Visiting Card Creator
A ChatGPT assistant that provides visiting card designs. The assistant suggests the design based on the user's request with details like purpose, profession, etc. It also creates the images with the suggested design.

Character Imitation by AI: Professor Dumbledore
This ChatGPT assistant talks to the user like Professor Dumbledore from the movie and book series Harry Potter. It imitates the wise, philosophical manner in which the character used to talk in the series.

2010 - 2012

Master's Degree in Mechatronics

Indian Institute of Engineering Science and Technology - Kolkata, India

2005 - 2009

Bachelor's Degree in Electronics and Communication Engineering

Bundelkhand University - Jhansi, India

MAY 2022 - MAY 2025

AWS Certified Cloud Practitioner

Amazon Web Services


Introduction to Containers



Introduction to AWS Elastic Beanstalk



Sequence Models



Introduction to Tensorflow for Artificial Intelligence, Machine Learning, and Deep Learning

DeepLearning.AI | via Coursera


Git Complete: The Definitive, Step-by-step Guide to Git



Neural Networks and Deep Learning

DeepLearning.AI | via Coursera


6.00.1x: Introduction to Computer Science and Programming Using Python



Control of Mobile Robots



Machine Learning



Pandas, PyTorch, LSTM, TensorFlow, Scikit-learn, OpenCV, Keras, Matplotlib, AWS Amplify, Vue, NumPy, React, Beautiful Soup, Node.js


Git, ChatGPT, Named-entity Recognition (NER), MATLAB, You Only Look Once (YOLO), Amazon SageMaker, Confluence, OpenAI Gym, Jupyter




Python, SQL, C++, JavaScript, TypeScript


Data Science


Amazon S3 (AWS S3), Cloud Deployment, Google Cloud


AWS Lambda, Amazon Web Services (AWS), Docker, AWS Elastic Beanstalk


Robotics, Artificial Intelligence (AI), Machine Learning, Deep Learning, Computer Vision, Natural Language Processing (NLP), OCR, Neural Networks, Tesseract, Research, Entity Extraction, Classification, Image Recognition, Data Analysis, OpenAI, Machine Vision, Large Language Models (LLMs), Generative Pre-trained Transformers (GPT), Chatbots, Fine-tuning, Retrieval-augmented Generation (RAG), OpenAI GPT-4 API, Embeddings from Language Models (ELMo), University Teaching, Control Systems, Underactuation, Convolutional Neural Networks (CNN), Object Detection, Transfer Learning, Text Detection, Machine Learning Operations (MLOps), Graphics Processing Unit (GPU), Transformers, BERT, Sequence Models, Algorithms, Time Series Analysis, Time Series, Generative Pre-trained Transformer 3 (GPT-3), GPU Computing, Team Leadership, Cloud, Code Review, Source Code Review, Technical Hiring, Interviewing, Minimum Viable Product (MVP), Open-source LLMs, Image Processing, Mechatronics, Optimization, Metaheuristics, Robot Operating System (ROS), Gated Recurrent Unit (GRU), Containers, Technical Writing, Publication, Simulations, Mathematics, Clustering, Web Scraping, Semantics, Generative Adversarial Networks (GANs), Hugging Face, Unsupervised Learning, Topic Modeling, Data Visualization, DALL-E, OpenAI GPT-3 API, User Feedback, Few-shot Learning, Motion Planning, ChatGPT API, Amazon Bedrock, Data Scraping, PDF Scraping, Recursion Testing
