
Rajeev Gupta
Verified Expert in Engineering
Artificial Intelligence (AI) Developer
Delhi, India
Toptal member since July 22, 2019
Rajeev is passionate about data and machine learning and has more than five years of experience in data science projects across numerous industries and applications. He's currently focused on cutting-edge technologies such as TensorFlow, Keras, deep learning, and most of the Python data science stack. Rajeev has used these skills to solve many real business problems in NLP, image processing, and time series domains.
Experience
- Artificial Intelligence (AI) - 5 years
- Machine Learning - 5 years
- Data Science - 5 years
- Image Processing - 4 years
- Generative Pre-trained Transformers (GPT) - 4 years
- Deep Learning - 4 years
- Natural Language Processing (NLP) - 4 years
- Keras - 3 years
Preferred Environment
Google Cloud, Jupyter Notebook, Spyder, Git
The most amazing...
...project I've implemented was an NLP attention-boosted sequential inference model to automate a business process.
Work Experience
AI Engineer
InteGrow AI
- Analyzed more than 10 existing n8n workflows across departments and identified critical performance bottlenecks that caused a 30–40% delay in execution times during peak usage hours.
- Designed a detailed migration blueprint for transitioning from n8n to a high-performance, Python-based agentic framework, reducing future workflow latency by an estimated 60%.
- Recommended the adoption of Google Agent Development Kit after benchmarking four alternative frameworks, citing its 3x faster response time, better resource parallelization, and native LLM integration.
- Developed custom Python wrappers and orchestration logic to replicate n8n automations within the new framework, achieving 100% functional parity in migrated flows.
- Delivered post-migration performance reports demonstrating a 2.5x improvement in automation execution speed, with error rates dropping from 8.4% to under 1.5%.
Data Engineer and Analyst
Big Happy LLC.
- Collaborated with a leading client operating a large-scale mobile advertising platform in the US and Canadian markets.
- Enhanced campaign performance and optimized bid budgets to reduce advertising costs by leveraging two years of historical ad data alongside real-time signals.
- Developed and deployed bid optimization models to generate automated insights and actionable recommendations, leading to a 20% reduction in campaign budgets.
- Designed and delivered an AI-powered business intelligence dashboard, enabling the marketing team to track and optimize campaign performance metrics such as CTR, CPM, CPC, and ROI.
- Improved operational efficiency, resulting in a 25% increase in productivity for daily campaign management tasks.
- Utilized the Python data science stack (Pandas, NumPy, scikit-learn, Matplotlib, Plotly) for data analysis, modeling, and visualization.
AI Engineer
EduTech Startup
- Played a key role in developing an AI-powered personalized learning platform aimed at transforming student engagement and academic performance through real-time customization and intelligent automation.
- Leveraged LangChain, large language models (LLMs), and Pinecone vector database to dynamically generate lesson content tailored to each student’s learning profile, including preferences, subject strengths, and knowledge gaps.
- Developed a real-time student emotion recognition module using webcam data and computer vision techniques to adjust content delivery and pace based on emotional feedback (e.g., confusion, boredom, engagement).
- Implemented an intelligent assistant using RAG pipelines and custom-trained LLMs, enabling students to ask questions and receive step-by-step solutions contextualized to their curriculum and grade level.
- Built dashboards and analytics pipelines using the Python data science stack to monitor student progress, personalize learning paths, and provide educators and curriculum designers with insights.
Data Developer
Availyst LLC
- Worked with a US-based food aggregator startup on data engineering and scraping, using the Python data science stack, Jupyter Notebook, and AWS services.
- Handled the user-facing recommendation engine, which recommended food and restaurants.
- Developed the scraping application using Python and deployed it using AWS services.
Data Scientist – Fintech Project
Forbes Media - Q.ai
- Managed the business intelligence team, acting as a senior data scientist for the client.
- Worked as a quant researcher, using advanced forms of quantitative techniques and artificial intelligence to generate investment recommendations across multiple asset classes, including stocks, ETFs, options, and cryptocurrencies.
- Created a dashboard for the growth, marketing, and leadership teams using Dash, Plotly, and Tableau.
Senior Data Scientist | Data Analyst
BCG
- Served as a data scientist and senior analyst, collaborating closely with the client and their team.
- Worked on demand space segmentation for a large US fashion retailer.
- Mapped 6 million customer records to demand space segments.
Data Scientist
A Telecommunications and Media Company in the US
- Worked with a telecommunications and media company in the US on identifying fake news.
- Developed two models to identify sarcasm and quantification fallacies in articles.
Independent Consultant – Data Scientist
IBM
- Worked with IBM US to optimize the facility leases needed to run its US operations.
- Developed a Python model to improve facility utilization and reduce facility operating and lease costs, subject to a number of business constraints.
Independent Consultant | Data Scientist
Independent Consultant
- Associated with JSS Information Technology Business Incubator as a data science mentor.
- Helped small companies and startups take advantage of their data.
- Created predictive models using machine learning and deep learning.
- Worked with natural language processing with neural networks.
- Developed classification and regression algorithms.
- Implemented time-series forecasting for various industry domains.
Independent Consultant – Data Scientist
AbbVie, Inc.
- Worked closely with C-level executives and the product management team to analyze survey data and produce reports.
- Helped the product and executive teams make more informed decisions, increasing market share by identifying new opportunities and target segments and by devising new ways of resolving constraints.
Independent Consultant – Data Scientist
Newristics
- Developed a Python app that uses natural language processing with deep neural network sequence-to-sequence learning to automate a business process.
- Reduced the cost of business operations.
Data Scientist
Sopra Steria Singapore
- Worked with the Land Transport Authority, Singapore to implement the vision to convert the city into a digital and intelligent one to improve the efficiency of services for the citizens, using machine learning, predictive modeling, and data mining.
Data Scientist
Steria India
- Built a recommendation system for an eCommerce site; it recommended the best possible items to buy based on customer history and collaborative filtering.
- Helped with customer churn prediction by developing a classification algorithm for a retail bank to identify customers whose balances were likely to drop by at least 50% in the next quarter relative to the current quarter.
- Created a classification algorithm for a retail bank to improve sales from existing customers by cross-selling one of its products, the personal loan (customer cross-sales).
Technical Program Manager
Steria India — Barclays Bank
- Set up business benefits of around £43 million over five years in customer retention, cost savings, and new business opportunities at an estimated cost of around £12 million.
- Acted as a vital member of the steering committee that identified user needs and developed customized solutions for around 250,000 Barclaycard acquiring merchants.
- Led a project team of 147 members including solution architects, designers, developers, and testers spread across multi-geographical locations through the entire project development life cycle.
- Consistently stayed within around 5% of monthly resource and budget forecasts.
- Recognized as a problem solver within a team of 22 project managers in a portfolio with annual spend of over £70 million.
Experience
IBM
I developed the Python integer programming algorithm to solve this problem. Considering the business constraints made this problem interesting and unique. I parameterized the optimization period (the period to look into the future) in the algorithm to provide multiple solutions. The client especially appreciated this feature.
Technologies: Python, Plotly, Linear Programming, PuLP
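The idea of a parameterized optimization period can be illustrated with a minimal sketch. This is not the client model: the facility data, the function name `cheapest_lease_plan`, and the single capacity constraint are hypothetical, and brute-force enumeration of the binary keep/drop decisions stands in for an integer programming solver such as PuLP.

```python
from itertools import product

# Hypothetical toy data: (facility name, seat capacity, monthly lease cost)
FACILITIES = [("A", 100, 50), ("B", 60, 25), ("C", 40, 20)]


def cheapest_lease_plan(facilities, monthly_demand, horizon):
    """Choose which facilities to keep (one binary decision each) so that
    total capacity covers peak demand in the first `horizon` months
    at the lowest total lease cost over that period."""
    peak = max(monthly_demand[:horizon])
    best = None
    # Enumerate every keep/drop combination; fine for a toy-sized problem.
    for keep in product((0, 1), repeat=len(facilities)):
        capacity = sum(k * f[1] for k, f in zip(keep, facilities))
        if capacity < peak:
            continue  # violates the capacity constraint
        cost = horizon * sum(k * f[2] for k, f in zip(keep, facilities))
        if best is None or cost < best[0]:
            best = (cost, [f[0] for k, f in zip(keep, facilities) if k])
    return best


demand = [90, 95, 120, 150]  # hypothetical seats needed per month
short_plan = cheapest_lease_plan(FACILITIES, demand, horizon=2)
long_plan = cheapest_lease_plan(FACILITIES, demand, horizon=4)
```

Changing `horizon` changes the peak demand that must be covered, so the same data can yield a different set of leases for a short look-ahead than for a long one.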
Newristics
I automated the message scorer process where a team compares the new message against the old one and analyzes it to rate how closely it depicts the heuristic.
The text data is then preprocessed with text cleaning and text normalization, and unigrams and bigrams are generated from the normalized data. I built two main models to solve this problem: XGBoost and deep neural network seq-to-seq learning.
For XGBoost, I created around 900 features (divided into three sections).
• NLP basic features: word and character counts and ratios for the message, TF-IDF of unigrams/bigrams, gensim TF-IDF similarity, and so on
• Word embedding: similarity of TF-IDF-weighted average embedding vectors from self-trained and pre-trained Word2vec/GloVe models, etc.
• Graph: degree of nodes, the intersection of neighbors, k-core/k-clique, degree of separation, etc.
I used the deep learning seq-to-seq model to enhance the sequence inference neural network architecture.
Technologies: Python, LSTM, gensim, GloVe, SpaCy, NLTK, Scikit-learn, TensorFlow, Keras, Jupyter Notebook, Git, Google Cloud Platform
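A few of the basic pairwise features mentioned above (n-gram generation and word/character ratios) might look like the following minimal sketch. The function names and the regular-expression normalization are assumptions for illustration, and the Jaccard overlap is a simple stand-in for the TF-IDF similarity features, not the features actually used in the project.

```python
import re


def normalize(text):
    """Lowercase and keep alphanumeric tokens (a simple stand-in for cleaning)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(tokens, n):
    """Return the n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def basic_pair_features(new_msg, old_msg):
    """Compute a handful of count, ratio, and overlap features for a message pair."""
    new_tok, old_tok = normalize(new_msg), normalize(old_msg)
    feats = {
        "word_ratio": len(new_tok) / max(len(old_tok), 1),
        "char_ratio": len(new_msg) / max(len(old_msg), 1),
    }
    for n, name in ((1, "unigram"), (2, "bigram")):
        a, b = set(ngrams(new_tok, n)), set(ngrams(old_tok, n))
        union = a | b
        # Jaccard overlap: shared n-grams divided by all distinct n-grams.
        feats[f"{name}_jaccard"] = len(a & b) / len(union) if union else 0.0
    return feats


features = basic_pair_features("Save time today", "Save time and money")
```

Each message pair becomes one row of such features, which a gradient-boosted model like XGBoost can then consume alongside the embedding and graph features.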
AbbVie, Inc.
We interviewed 119 physicians about HCV regimen attributes that impact the market driver, 55 physicians about patient treatment, and 60 physicians about sales rep interactions and their impressions of the message and interaction.
I worked closely with C-level executives and the product management team to analyze survey data and produce reports. This helped the product and executive teams make more informed decisions, increasing market share by identifying new opportunities and target segments and by devising new ways of resolving constraints.
Technologies: Python, R, Plotly, Matplotlib, Regression, Cluster, Association Rule
Classify H&E Stained Histological Breast Cancer Images
Technologies: Python 3, Keras, NumPy, Pandas, SciPy, Scikit-learn
Demand Forecast at an SKU-level for a Brewery Company
To plan its production and distribution and to help wholesalers with their planning, the brewery needs an accurate estimate of demand at the SKU level (34 SKUs) for each wholesaler (60 wholesalers).
Data: Four years of data covering 60 agencies and 34 SKUs were used for prediction.
• Price sales promotion (dollar/hectoliter): The price, sales, and promotion in dollar value per hectoliter at an agency-SKU-month level
• Historical volume (hectoliters): Sales data at an agency-SKU-month level
• Weather (degrees Celsius): The average maximum temperature at an agency-month level
• Industry soda sales (hectoliters): Industry-level soda sales
• Event calendar: Event details (sports, carnivals, and so on)
• Industry volume (hectoliters): Industry actual beer volume
• Demographics: Demographic details (yearly income in dollars)
I used deep neural network sequence-to-sequence learning for demand prediction.
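Sequence-to-sequence forecasting typically starts by slicing each monthly series into input and target windows. The sketch below shows that step only, under assumed window lengths; the function name `make_windows` and the sample volumes are hypothetical and are not taken from the project.

```python
def make_windows(series, n_in, n_out):
    """Slice one agency-SKU monthly series into (input, target) pairs.

    Each pair holds `n_in` consecutive months as model input and the
    following `n_out` months as the sequence to predict.
    """
    pairs = []
    for start in range(len(series) - n_in - n_out + 1):
        x = series[start:start + n_in]
        y = series[start + n_in:start + n_in + n_out]
        pairs.append((x, y))
    return pairs


# Hypothetical monthly volumes (hectoliters) for one agency-SKU pair
volumes = [1, 2, 3, 4, 5, 6]
pairs = make_windows(volumes, n_in=3, n_out=2)
```

In a fuller pipeline, the other inputs listed above (price, weather, events, and so on) would be windowed the same way and stacked alongside the historical volume.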
Satellite Imagery Feature Detection Using Deep Learning
Education
Master's Degree in Computer Science
Jawaharlal Nehru University - New Delhi, India
Bachelor's Degree in Mathematics
Delhi University - Delhi, India
Skills
Libraries/APIs
TensorFlow, TensorFlow Deep Learning Library (TFLearn), Matplotlib, Scikit-learn, Pandas, NumPy, XGBoost, CatBoost, Keras, Claude API, PyTorch, SciPy, Dask, LSTM, SpaCy, Natural Language Toolkit (NLTK), PySpark, OpenAI API, llama.cpp
Tools
Jupyter, GitHub, Seaborn, ChatGPT, Claude Code, Claude Agent SDK, Microsoft Copilot, Codex, Plotly, Git, Spyder, Gensim, Cluster, Tableau, JCL, Ab Initio, Amazon Elastic MapReduce (EMR)
Languages
Python, Python 3, SQL, R, CICS, COBOL, Java, XML, JavaScript, CSS
Frameworks
LightGBM, Apache Spark, Streamlit, LangGraph, Agentic Frameworks
Platforms
Docker, Visual Studio Code (VS Code), Amazon Web Services (AWS), Jupyter Notebook, Google Cloud Platform (GCP), WebSphere, Oracle, Tango
Storage
Data Pipelines, Google Cloud, IBM Db2, Virtual Storage Access Method (VSAM), MySQL, PostgreSQL
Paradigms
Agile Software Development, Linear Programming, Model Context Protocol (MCP)
Other
Data Analysis, Data Analytics, Data Scraping, Data Engineering, Quantitative Modeling, Quantitative Analysis, Mixed-integer Linear Programming, Deep Learning, Deep Neural Networks (DNNs), Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Long Short-term Memory (LSTM), Natural Language Processing (NLP), Image Processing, Time Series Analysis, Data Science, Artificial Intelligence (AI), Machine Learning, Modeling, Statistical Modeling, Statistical Methods, Statistical Learning, Analytics, Generative Pre-trained Transformers (GPT), ChatGPT API, ChatGPT Prompts, AI Agents, Agentic AI Systems, AI Agent Orchestration, AI Voice Agents, LLM Reasoning, LLM Integration, Video Chat, Computer Vision, Image Segmentation, Image Analysis, Conversational AI, Conversational Agent, AI Chatbots, Prompt Engineering, RAG Architecture, RAG Pipelines, Agentic RAG Systems, RAG Systems, Visual Studio Code, Chatbots, Statistics, Numba, Optimization, Reinforcement Learning, Deep Reinforcement Learning, Dash, GloVe, Regression, Association Rule Learning, Classification, Content Management, Scraping, Generative Artificial Intelligence (GenAI), Multimodal GenAI, LangChain, Retrieval-augmented Generation (RAG), Large Language Models (LLMs), Pinecone, FastAPI, LoRa, Agentic AI