
Rahul Singh Inda
Verified Expert in Engineering
Data Scientist and AI Developer
Bengaluru, Karnataka, India
Toptal member since November 24, 2021
Rahul is a data scientist with three years of professional experience and an engineer's degree focused on big data analytics. He has delivered several edtech solutions. His areas of expertise include NLP, particularly state-of-the-art attention-based models, and computer vision using both deep learning and classical machine learning techniques.
Portfolio
Experience
- Python 3 - 5 years
- Generative Pre-trained Transformers (GPT) - 5 years
- PyTorch - 5 years
- Natural Language Processing (NLP) - 5 years
- Large Language Models (LLMs) - 5 years
- OpenAI - 5 years
- Retrieval-augmented Generation (RAG) - 4 years
- LangChain - 2 years
Availability
Preferred Environment
Ubuntu, Visual Studio Code (VS Code), Python 3, PyTorch, Python, Data Science, Machine Learning, Deep Learning, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Computer Vision
The most amazing...
...thing I've achieved so far is ranking #241 among data scientists globally in Kaggle competitions.
Work Experience
NLP Engineer
Giotto.ai
- Leveraged LLMs such as Gemini to architect retrieval-augmented generation (RAG) systems, significantly advancing the capabilities of internal medical document question-answering systems.
- Created text classification models using semantic similarity to classify documents into 100+ label categories.
- Built question-answering (QA) models using BERT and deployed them on GPU using GCP.
- Managed models in production, including logging and error handling using Google Cloud, Docker, and Grafana.
Lead Data Scientist
Skuad
- Built a neural search engine to solve users' queries using deep learning and FAISS, improving the CTR by 8%.
- Deployed the search model to production on Google Cloud and optimized it to serve around 65,000 to 80,000 daily queries, improving query auto-solves by 25%.
- Implemented a pipeline to group user data based on topics and a deduplication pipeline for content and queries.
Data Scientist
Embibe
- Implemented metadata tagging for academic content with graph nodes for consumer consumption, saving hundreds of person-hours.
- Built an NLP model to tag 10,000+ concepts and derive learning entities, using vector-based inferencing to maximize GPU utilization and reduce response time.
- Developed an algorithm for knowledge tracing to model students' knowledge using graph embeddings. The goal is to accurately predict how students will perform in future interactions based on learning activities. The algorithm improved accuracy by 12%.
- Developed the process and led two junior employees in the launch of a doubt-resolution product for students. A vector-based search algorithm uses text and images to return the top-matched questions to users.
- Built a pipeline for concept tagging YouTube videos. It can download videos, fetch their transcripts using a speech-to-text API, and build a classification model.
- Built a distributed architecture on Google Cloud Run for the scalable deployment of deep learning models. Deployed the models with a logging and monitoring dashboard for real-time and batch inference modes.
Experience
Cornell Birdcall Identification
https://www.kaggle.com/rsinda/training-efficientnet-model
You can read the full description at https://www.kaggle.com/c/birdsong-recognition.
Product Classification API
https://github.com/rsinda/product-classification
Identify Placement of Tubes in Chest X-rays
https://www.kaggle.com/rsinda/38th-place-solution-0-972-single-model-5-fold
Education
Bachelor's Degree in Computer Science
U. V. Patel College of Engineering - Ahmedabad, Gujarat, India
Certifications
CutShort Certified Deep Learning - Advanced
cutshort
Convolutional Neural Networks
Coursera
Neural Networks and Deep Learning
Coursera
Skills
Libraries/APIs
PyTorch, NumPy, Pandas, Natural Language Toolkit (NLTK), Scikit-learn, PySpark
Tools
ChatGPT
Languages
Python 3, Python
Platforms
Ubuntu, Docker
Storage
MongoDB, Google Cloud
Other
Random Forests, Data Science, Machine Learning, Computer Vision, Retrieval-augmented Generation (RAG), Artificial Intelligence (AI), Vector Databases, Natural Language Processing (NLP), GPU Computing, Computer Vision Algorithms, Deep Learning, Neural Networks, Long Short-term Memory (LSTM), Recurrent Neural Networks (RNNs), FastAPI, APIs, Big Data, Generative Pre-trained Transformers (GPT), Large Language Models (LLMs), Gemini, OpenAI, LangChain, Prompt Engineering, Speech Recognition