Rahul Kumar, Developer in Gurugram, Haryana, India

Rahul Kumar

Bio

Rahul is a lead AI engineer with over seven years of experience building production-grade AI systems across Web3, fintech, HR tech, edtech, and identity intelligence startups. He has contributed to more than $150 million in startup growth, authored peer-reviewed research published by IEEE and Springer, and holds patents in routing intelligence and predictive modeling. Known for turning deep AI research into scalable, high-performance systems, Rahul delivers solutions that drive measurable business impact.

Portfolio

Ferret.ai
Large Language Models (LLMs), Machine Learning...
O.XYZ
Python 3, Machine Learning, Deep Learning, Open-source LLMs, AI Agents...
Cognavi India Pvt
Large Language Models (LLMs), AI Agents, Python 3, FastAPI...

Experience

  • Deep Learning - 7 years
  • Python 3 - 7 years
  • Machine Learning - 7 years
  • Data Science - 7 years
  • Large Language Models (LLMs) - 5 years
  • Retrieval-augmented Generation (RAG) - 3 years
  • Ray - 3 years
  • AI Agents - 3 years

Preferred Environment

PyCharm, Slack, MacOS

The most amazing...

...solution I've built was an AI engine that automated more than 10 million background checks with 95% accuracy and delivered real-time insights.

Work Experience

Lead AI Engineer

2025 - PRESENT
Ferret.ai
  • Developed Ferret AI's Identity Intelligence Engine, automating background checks across more than 10 million records with over 95% accuracy. Reduced manual review time by 70% and tripled API performance through a FastAPI migration.
  • Built ETL pipelines processing over 100 million profiles into sharded MongoDB and Neo4j databases, enabling graph-based risk and association scoring. Built modular AI agents with multi-LLM orchestration, cutting research time from hours to minutes.
  • Delivered interactive dashboards and dossier reports using Streamlit and a Notion-style interface, enabling investigators to access real-time risk scores and 360-degree profiles.
  • Partnered with the UAE government, Morgan Stanley, and global enterprises to automate background verification, compliance, and fraud detection through Ferret AI's Identity Intelligence Engine.
Technologies: Large Language Models (LLMs), Machine Learning, Large Language Model Operations (LLMOps), Neo4j, MongoDB, Elasticsearch, Machine Learning Operations (MLOps), Retrieval-augmented Generation (RAG), Artificial Intelligence (AI), Model Context Protocol (MCP), Python, OpenAI API, Claude, Generative Artificial Intelligence (GenAI), Claude API, OpenAI, Data Analysis, Agentic AI, LangGraph, REST APIs, Cloud Services, Docker, Jupyter Notebook, Containers, API Integration, APIs, Automation, Natural Language Processing (NLP), Agentic Frameworks, Web Scraping, Scraping, Data Pipelines, Data Collection, AutoGen, Architecture, Startups, Technical Strategy, AI Tool Assessment, Cross-functional Collaboration, R&D, Solution Architecture, Tech Research & Evaluation, Rapid Prototyping, Custom Automation, Sentiment Analysis, Prompt Engineering, ChatGPT, LangChain, Anthropic, Pattern Analysis, AI Chatbots, AI Programming, AI Design, Full-stack, Microservices, Workflow, Compliance, Azure, Insurance Technology (Insurtech), Statistical Modeling, Statistics, Amazon SageMaker, Amazon Elastic Container Service (ECS), Cybersecurity, Statistical Analysis, Technical Leadership, Back-end Development, Multi-agent Systems, Software Architecture, Product Delivery, Board Reporting, Research, System Architecture, Fractional CTO, Google Cloud Platform (GCP), LlamaIndex, Kubernetes, AI Automation, AWS Glue, Amazon EC2, Amazon EKS, Amazon S3 (AWS S3), ETL, NumPy, Pandas, Data Handling, Data Modeling, Python API, Gurobi, Scientific Data Analysis, Model Deployment, Decision Support, Decision Support Systems (DSS), GPU Computing, JavaScript, Testing, Reliability, RAG Pipelines, Agentic RAG Systems, Pydantic, Vector Databases, Gemini, Next.js, React, TypeScript, ETL Pipelines, Model Validation, A/B Testing, StatsModels, Reinforcement Learning from Human Feedback (RLHF), Data Privacy, Identity & Access Management (IAM), Personally Identifiable Information (PII), Data Scientist, Financial 
Engineering, AI Architecture, Asynchronous Programming, Cloud Platforms, RAG Architecture, Webhooks, Model Evaluation, Event-driven Architecture

Lead Machine Learning Engineer

2024 - 2025
O.XYZ
  • Built a routing intelligence model that outperformed Meta LLaMA-3.1-70B, Qwen2.5-72B, and all open-source routers. Led routing research that surpassed these baselines on the BBH, MMLU, MuSR, and GPQA benchmarks, powering Ocean AI with 20x faster performance than Perplexity.
  • Deployed distributed LLM infrastructure on Ray Serve with H100 clusters, scaling hosted models for production workloads. Designed AutoEval LLM Judge and Jailbreak Guard to improve compliance, enhance security, and reduce hallucinations.
  • Engineered optimized RAG pipelines delivering sub-second retrievals and knowledge grounding. Developed a marketing agent platform with on-chain block rewards and conducted foundation model research on Cerebras WSE and LiveKit Voice Agents.
  • Published ORI research at the HK Summit and was featured in Forbes for breakthroughs powering Ocean AI.
Technologies: Python 3, Machine Learning, Deep Learning, Open-source LLMs, AI Agents, AI Voice Agents, Hugging Face, Transformers, FastAPI, Prefect, Grafana, MLflow, Ray, Machine Learning Operations (MLOps), Retrieval-augmented Generation (RAG), Artificial Intelligence (AI), Python, OpenAI API, Claude, Generative Artificial Intelligence (GenAI), Claude API, OpenAI, Amazon Bedrock, Data Analysis, Agentic AI, LangGraph, REST APIs, Cloud Services, Docker, Jupyter Notebook, Containers, LiveKit, WebRTC, Audio, Asyncio, Videos, Django, SQL, Data Analytics, API Integration, APIs, Automation, Natural Language Processing (NLP), Agentic Frameworks, Scraping, Data Pipelines, Data Collection, AutoGen, CrewAI, Text-to-Speech (TTS), Architecture, TensorFlow, Technical Project Management, Startups, Technical Strategy, AI Tool Assessment, Cross-functional Collaboration, R&D, Solution Architecture, Tech Research & Evaluation, Rapid Prototyping, Custom Automation, Data Engineering, Sentiment Analysis, Prompt Engineering, ChatGPT, LangChain, Large Language Models (LLMs), Anthropic, Pattern Analysis, Image Generation, AI Chatbots, Conversational AI, AI Programming, AI Design, Full-stack, Microservices, Workflow, Compliance, Azure, Statistics, Trading, Risk Management, Algorithmic Trading, Quantitative Analysis, Amazon SageMaker, Multimodal GenAI, Amazon Elastic Container Service (ECS), Cybersecurity, Image Analysis, Attribution Modeling, Marketing Analytics, Front-end Development, Technical Leadership, Back-end Development, Multi-agent Systems, Software Architecture, CTO, Product Delivery, Board Reporting, Research, System Architecture, Google Cloud Platform (GCP), LlamaIndex, Ollama, Kubernetes, AI Automation, AWS Glue, Amazon EC2, Amazon EKS, Amazon Kinesis, Amazon S3 (AWS S3), ETL, NumPy, Pandas, Data Handling, Data Modeling, Python API, Gurobi, Model Deployment, Decision Support, Decision Support Systems (DSS), GPU Computing, IT Management, Workflow Automation, Design, Testing, 
RAG Pipelines, AI Assistants, Agentic RAG Systems, Pydantic, Vector Databases, Gemini, ETL Pipelines, Model Validation, A/B Testing, Spatial Analysis, Reinforcement Learning from Human Feedback (RLHF), Identity & Access Management (IAM), Data Scientist, Bitcoin, Bayesian Inference & Modeling, High-frequency Trading (HFT), Prediction Markets, Quantitative Finance, Cryptocurrency, AI Architecture, Asynchronous Programming, Cloud Platforms, RAG Architecture, Webhooks, Model Evaluation, Event-driven Architecture

Lead AI Engineer

2023 - 2024
Cognavi India Pvt
  • Implemented RAG pipelines integrated with Neo4j and Groq LLM inference, enabling sub-second contextual retrieval and reasoning. Built FAISS-based retrieval for over 8.4 million job posts, cutting query latency by 65% and improving retrieval accuracy.
  • Managed ETL pipelines processing 64+ million records with automated refresh cycles, reducing costs by 30% via SHA-based deduplication. Applied RLHF, PEFT, and LLM fine-tuning to boost model response quality by more than 20% on evaluation benchmarks.
  • Built a data pipeline for 150+ million LinkedIn job posts and crafted AI-based job matching with patented digital profile technology, improving match accuracy by 35% and doubling engagement. Created a GPT-based resume builder and screening assistant.
Technologies: Large Language Models (LLMs), AI Agents, Python 3, FastAPI, Amazon Web Services (AWS), MongoDB, Vector Data, LangChain, Machine Learning Operations (MLOps), Retrieval-augmented Generation (RAG), Artificial Intelligence (AI), Python, OpenAI API, Claude, Generative Artificial Intelligence (GenAI), OpenAI, Amazon Bedrock, Data Analysis, Agentic AI, LangGraph, REST APIs, Cloud Services, Docker, Jupyter Notebook, Containers, Asyncio, Django, SQL, Data Analytics, API Integration, APIs, Automation, Natural Language Processing (NLP), Agentic Frameworks, Web Scraping, Scraping, Data Pipelines, Data Collection, AutoGen, CrewAI, Architecture, PyTorch, TensorFlow, Technical Project Management, Startups, Technical Strategy, AI Tool Assessment, Cross-functional Collaboration, R&D, Solution Architecture, Tech Research & Evaluation, Rapid Prototyping, Custom Automation, Data Engineering, Sentiment Analysis, Prompt Engineering, ChatGPT, Anthropic, Pattern Analysis, Image Generation, AI Chatbots, Conversational AI, AI Programming, AI Design, Full-stack, Microservices, Workflow, Azure, Recommendation Systems, Statistical Modeling, Statistics, Amazon SageMaker, Amazon Elastic Container Service (ECS), Cybersecurity, Image Analysis, Attribution Modeling, Demand Forecasting, Marketing Analytics, Statistical Analysis, Predictive Analytics, Conversion, Funnel Marketing, Technical Leadership, Back-end Development, Multi-agent Systems, Software Architecture, CTO, Product Delivery, Board Reporting, Research, System Architecture, Google Cloud Platform (GCP), Databricks, LlamaIndex, Ollama, Kubernetes, AI Automation, AWS Glue, Amazon EC2, Amazon EKS, ETL, NumPy, Pandas, Data Handling, Data Modeling, Python API, Gurobi, Scientific Data Analysis, Model Deployment, Decision Support, Decision Support Systems (DSS), GPU Computing, IT Management, JavaScript, Workflow Automation, Design, Testing, Reliability, RAG Pipelines, AI Assistants, Agentic RAG Systems, Pydantic, Vector Databases, 
Looker, Gemini, Next.js, React, TypeScript, ETL Pipelines, Forecasting, Model Validation, A/B Testing, Reinforcement Learning from Human Feedback (RLHF), Identity & Access Management (IAM), Data Scientist, Marketing Mix Modeling, AI Architecture, Asynchronous Programming, Cloud Platforms, RAG Architecture, Webhooks, Model Evaluation, Event-driven Architecture
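
The SHA-based deduplication mentioned above can be illustrated with a minimal sketch. This is not the production pipeline: the field names, the normalization rules, and the helper names `record_fingerprint` and `deduplicate` are assumptions made for the example, and SHA-256 is chosen here only as one common member of the SHA family.

```python
import hashlib
import json


def record_fingerprint(record: dict, keys: tuple) -> str:
    """Return a SHA-256 hex digest over the selected, normalized fields.

    Normalization (strip + lowercase) is an assumption for this sketch;
    a real pipeline would choose rules that match its own data.
    """
    normalized = {k: str(record.get(k, "")).strip().lower() for k in keys}
    payload = json.dumps(normalized, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def deduplicate(records: list, keys: tuple) -> list:
    """Keep the first record seen for each fingerprint, preserving order."""
    seen = set()
    unique = []
    for record in records:
        fingerprint = record_fingerprint(record, keys)
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(record)
    return unique


# Hypothetical job posts; the second differs from the first only in case and spacing.
jobs = [
    {"title": "Data Engineer", "company": "Acme", "city": "Pune"},
    {"title": " data engineer ", "company": "ACME", "city": "Pune"},
    {"title": "ML Engineer", "company": "Acme", "city": "Pune"},
]
unique_jobs = deduplicate(jobs, keys=("title", "company", "city"))
print(len(unique_jobs))
```

Hashing a normalized subset of fields keeps the in-memory index small (one digest per record), which is one plausible way such a step could reduce storage and reprocessing cost on refresh cycles.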

Data Science Manager

2020 - 2022
Laytrip Inc
  • Spearheaded data pipelines from AWS databases to BigQuery, aggregating over 300 million rows from multiple travel APIs on GCP. Designed predictive models for fare prediction, demand forecasting, and arbitrage optimization.
  • Invented and patented an arbitrage model powering dynamic pricing. Implemented MLOps workflows for continuous training and deployment, and automated Slack and Telegram bots, reducing manual operations by 40%.
  • Created Cloud Function APIs for seamless partner integration. Collaborated with product, engineering, and operations teams, helping secure $300,000 seed funding from Airbus to scale the MVP and expand predictive analytics capabilities.
Technologies: Google BigQuery, Google Cloud Platform (GCP), Machine Learning, Deep Learning, XGBoost, AutoML, Data Science, PostgreSQL, Machine Learning Operations (MLOps), Artificial Intelligence (AI), Python, OpenAI API, Amazon Bedrock, Data Analysis, REST APIs, Cloud Services, Docker, Jupyter Notebook, Containers, Asyncio, Django, SQL, Data Analytics, API Integration, APIs, Automation, Natural Language Processing (NLP), Agentic Frameworks, Web Scraping, Scraping, Data Pipelines, Data Collection, AutoGen, Architecture, PyTorch, Vertex AI, TensorFlow, Technical Project Management, Startups, Technical Strategy, AI Tool Assessment, Cross-functional Collaboration, R&D, Solution Architecture, Tech Research & Evaluation, Rapid Prototyping, Custom Automation, Data Engineering, Pattern Analysis, AI Programming, AI Design, Full-stack, Financial Systems, Banking & Finance, Microservices, Workflow, Statistical Modeling, Statistics, Trading, Risk Management, Algorithmic Trading, Quantitative Analysis, Amazon SageMaker, Cybersecurity, Microsoft Power Automate, Attribution Modeling, Demand Forecasting, Marketing Analytics, Statistical Analysis, Markov Model, Predictive Analytics, Funnel Marketing, Technical Leadership, Back-end Development, Software Architecture, CTO, Product Delivery, Board Reporting, Research, Finance, System Architecture, Databricks, AWS Glue, Amazon EC2, Amazon Kinesis, Amazon S3 (AWS S3), ETL, NumPy, Pandas, Data Handling, Data Modeling, Python API, Gurobi, Scientific Data Analysis, Model Deployment, Decision Support, Decision Support Systems (DSS), GPU Computing, JavaScript, Workflow Automation, Design, Testing, Reliability, Pydantic, Looker, Next.js, React, TypeScript, ETL Pipelines, Forecasting, Model Validation, Spatial Analysis, BigQuery, StatsModels, ARIMA, Identity & Access Management (IAM), Data Scientist, Marketing Mix Modeling, Financial Engineering, Prediction Markets, Quantitative Finance, AI Architecture, Asynchronous Programming, Cloud 
Platforms, Webhooks, Model Evaluation, Event-driven Architecture

Founding Data Scientist

2019 - 2022
DataisGood
  • Designed and developed ML, DL, and CV projects and courses, training more than 450 aspiring data scientists. Created quarterly roadmaps to expand course offerings, increasing learner engagement and platform adoption.
  • Spearheaded career transition initiatives, affiliate programs, and university partnerships, expanding brand reach and credibility across global markets and driving measurable enrollment growth.
  • Applied advanced analytics and KPI tracking to optimize business performance. The platform was later acquired by Skill Arbitrage for $3 million, validating the scalability and impact of these contributions.
Technologies: Machine Learning, Deep Learning, Data Science, Artificial Intelligence (AI), Python, OpenAI API, Data Analysis, Jupyter Notebook, Containers, SQL, Data Analytics, API Integration, APIs, Automation, Natural Language Processing (NLP), Agentic Frameworks, Scraping, Data Pipelines, Data Collection, AutoGen, TensorFlow, Technical Project Management, Startups, Technical Strategy, AI Tool Assessment, Cross-functional Collaboration, R&D, Solution Architecture, Rapid Prototyping, Custom Automation, Sentiment Analysis, Pattern Analysis, AI Programming, AI Design, Full-stack, Workflow, Statistics, Image Analysis, Statistical Analysis, Markov Model, Predictive Analytics, Conversion, Funnel Marketing, Back-end Development, Amazon Kinesis, ETL, Pandas, Data Modeling, Python API, Gurobi, Model Deployment, Data Scientist, Marketing Mix Modeling

Experience

Ocean AI Platform

https://ocean.o.xyz
Ocean AI is a next-generation reasoning platform designed for ultra-fast inference and intelligent routing. Built with a focus on adaptability and performance, it leverages optimized routing intelligence (ORI) to dynamically select the best model for each query, outperforming static pipelines such as Perplexity.

Powered by Groq LPU acceleration and distributed Ray Serve clusters, Ocean AI delivers up to 20x faster inference, enabling real-time, multi-model orchestration across OpenAI, Anthropic, Mistral, and custom fine-tuned models.

Through continuous benchmarking on BBH, MMLU, and GPQA, ORI routes every request to the most capable engine, balancing speed, accuracy, and cost efficiency.

Laytrip Predictive Booking

Developed a predictive booking and airline price arbitrage engine that analyzed billions of fare records to forecast optimal booking windows and pricing fluctuations. Designed ML models leveraging time-series forecasting, demand elasticity modeling, and route-specific regression analysis to predict fare volatility with high accuracy. Deployed the pipeline as a scalable microservice connected to real-time airline APIs, powering automated fare monitoring and dynamic rebooking recommendations. This system formed the foundation of Laytrip’s patented arbitrage model and helped secure $300,000 in Airbus funding for the MVP.

O Routing Intelligence

https://arxiv.org/abs/2502.10051
Designed and deployed the ORI Routing Model, an intelligent routing engine that dynamically optimized model selection, prompt distribution, and response aggregation across multiple large language models. The system outperformed models such as Meta LLaMA-70B, DeepSeek-67B, and Qwen-72B in both cost and latency. Built with modular routing logic, adaptive token budgeting, and performance-aware dispatching, ORI selected the most efficient inference path in real time. It integrated with hybrid compute back ends (OpenAI, Groq, SambaNova, and Cerebras) and used continuous evaluation feedback to improve routing precision. The platform reduced average inference latency by over 60% and enabled 20× faster throughput in Ocean, the intelligent search engine built on top of ORI.
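
To make the idea of performance-aware dispatching concrete, here is a minimal sketch of a router that scores candidate models on accuracy, latency, and cost, and nudges its accuracy estimate from evaluation feedback. It is not the ORI algorithm described in the paper: the `ModelStats` fields, the linear scoring rule, the weights, and the model names are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class ModelStats:
    """Running estimates for one candidate model (all values hypothetical)."""
    name: str
    accuracy: float     # estimated answer quality, in [0, 1]
    latency_s: float    # typical response time in seconds
    cost_per_1k: float  # price per 1,000 tokens, arbitrary currency


def route(candidates, w_acc=1.0, w_lat=0.2, w_cost=0.5):
    """Pick the model with the best weighted trade-off.

    Higher accuracy raises the score; latency and cost lower it.
    """
    def score(m: ModelStats) -> float:
        return w_acc * m.accuracy - w_lat * m.latency_s - w_cost * m.cost_per_1k

    return max(candidates, key=score)


def update_accuracy(model: ModelStats, correct: bool, alpha: float = 0.1) -> None:
    """Exponential moving average update from one evaluation result."""
    observed = 1.0 if correct else 0.0
    model.accuracy = (1 - alpha) * model.accuracy + alpha * observed


models = [
    ModelStats("large-model", accuracy=0.90, latency_s=2.0, cost_per_1k=0.60),
    ModelStats("small-model", accuracy=0.80, latency_s=0.3, cost_per_1k=0.05),
]

# With the default weights the cheaper, faster model wins;
# with latency and cost ignored, the more accurate one does.
print(route(models).name)
print(route(models, w_lat=0.0, w_cost=0.0).name)
```

A real router would typically condition the choice on features of each query rather than on global averages; the fixed per-model statistics here only show the speed, accuracy, and cost trade-off in its simplest form.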

Education

2022 - 2023

Postgraduate Degree in Data Science

California Institute of Technology - Pasadena, CA, USA

2016 - 2020

Bachelor's Degree in Computer Science

Lovely Professional University (LPU) - Phagwara, India

Skills

Libraries/APIs

XGBoost, OpenAI API, Claude API, REST APIs, WebRTC, PyTorch, TensorFlow, NumPy, Pandas, Pydantic, Asyncio, Python API, React, Hugging Face Transformers

Tools

Claude, ChatGPT, Amazon SageMaker, Amazon Elastic Container Service (ECS), AWS Glue, Amazon EKS, Looker, BigQuery, PyCharm, Slack, Prefect, Grafana, Gurobi, StatsModels, ARIMA, AutoML

Languages

Python 3, Python, SQL, JavaScript, TypeScript

Frameworks

LangGraph, AutoGen, Django, Agentic Frameworks, LlamaIndex, Ray, Next.js

Paradigms

Rapid Prototyping, Microservices, ETL, Testing, Model Context Protocol (MCP), Automation, Asynchronous Programming, Event-driven Architecture

Platforms

Google Cloud Platform (GCP), Jupyter Notebook, LiveKit, CrewAI, Vertex AI, Amazon EC2, MacOS, Amazon Web Services (AWS), Docker, Azure, Databricks, Ollama, Kubernetes, Microsoft Power Automate

Storage

Data Pipelines, Amazon S3 (AWS S3), Neo4j, MongoDB, Elasticsearch, PostgreSQL

Industry Expertise

Banking & Finance, Cybersecurity, High-frequency Trading (HFT)

Other

Machine Learning, Deep Learning, Data Science, Large Language Models (LLMs), AI Voice Agents, LangChain, Google BigQuery, Machine Learning Operations (MLOps), Retrieval-augmented Generation (RAG), Artificial Intelligence (AI), Generative Artificial Intelligence (GenAI), OpenAI, Amazon Bedrock, Data Analysis, Agentic AI, Cloud Services, Data Analytics, API Integration, Natural Language Processing (NLP), Web Scraping, Scraping, Data Collection, Text-to-Speech (TTS), Architecture, Technical Project Management, Startups, Technical Strategy, AI Tool Assessment, Cross-functional Collaboration, R&D, Solution Architecture, Tech Research & Evaluation, Custom Automation, Data Engineering, Sentiment Analysis, Prompt Engineering, Anthropic, Pattern Analysis, Image Generation, AI Chatbots, Conversational AI, AI Programming, AI Design, Full-stack, Workflow, Statistical Modeling, Attribution Modeling, Demand Forecasting, Marketing Analytics, Statistical Analysis, Predictive Analytics, Conversion, Funnel Marketing, Technical Leadership, Back-end Development, Software Architecture, Research, System Architecture, AI Automation, Data Handling, Data Modeling, Model Deployment, GPU Computing, Workflow Automation, Design, Reliability, AI Assistants, Agentic RAG Systems, ETL Pipelines, Forecasting, A/B Testing, Cloud Platforms, AI Agents, Hugging Face, Transformers, MLflow, FastAPI, Vector Data, Containers, Audio, Videos, APIs, Financial Systems, Compliance, Insurance Technology (Insurtech), Recommendation Systems, Statistics, Trading, Risk Management, Algorithmic Trading, Quantitative Analysis, Multimodal GenAI, Image Analysis, Markov Model, Multi-agent Systems, CTO, Product Delivery, Board Reporting, Finance, Fractional CTO, Amazon Kinesis, Scientific Data Analysis, Decision Support, Decision Support Systems (DSS), IT Management, RAG Pipelines, Vector Databases, Gemini, Model Validation, Spatial Analysis, Reinforcement Learning from Human Feedback (RLHF), Data Privacy, Identity & 
Access Management (IAM), Personally Identifiable Information (PII), Data Scientist, Marketing Mix Modeling, Bitcoin, Bayesian Inference & Modeling, Financial Engineering, Prediction Markets, AI Architecture, RAG Architecture, Webhooks, Model Evaluation, Large Language Model Operations (LLMOps), Open-source LLMs, Front-end Development, Quantitative Finance, Cryptocurrency
