
Abhi Panchal
Verified Expert in Engineering
Software Developer
Ahmedabad, Gujarat, India
Toptal member since July 10, 2022
Abhi is an ML and Generative AI engineer with 5+ years of experience delivering production-grade AI solutions. He specializes in LLMs, RAG, diffusion models, transformer architectures, chain-of-thought prompting, fine-tuning, and prompt engineering. Abhi is skilled in deep learning, computer vision, NLP, embeddings, vector databases, and multi-modal systems. He designs scalable ML pipelines with GPU acceleration, low-latency inference, MLOps (CI/CD, monitoring), and containerized deployments.
Experience
- Python 3 - 3 years
Preferred Environment
Python 3, Large Language Models (LLMs), Chatbots, Agentic AI, Retrieval-augmented Generation (RAG), Machine Learning, Deep Learning, AWS IoT, FastAPI, REST
The most amazing...
...achievement was building a scalable conversational AI system with multi-agent LLM orchestration and voice integration to automate real-time customer support.
Work Experience
Senior AI/ML Engineer
Codiste Pvt. Ltd.
- Architected and led deployment of a production-grade LLM platform serving 50,000+ daily queries, achieving 99.5% uptime and reducing inference costs by 35%.
- Directed fine-tuning of transformer models (T5, GPT-NeoX) on domain-specific corpora, improving downstream task accuracy by 28% and cutting hallucinations by 40%.
- Pioneered a RAG pipeline integrating FAISS and custom embeddings, slashing average document retrieval time to 75 ms and boosting relevance scores by 32%.
- Championed chain-of-thought prompting and prompt-engineering frameworks, increasing answer coherence in multi-step reasoning tasks by 22%.
- Oversaw end-to-end GenAI MLOps, including CI/CD, model versioning, monitoring, and cost-optimized GPU autoscaling, cutting release cycles from weeks to days.
Machine Learning Engineer
Codiste Pvt. Ltd.
- Developed and deployed traditional ML pipelines (Random Forest, XGBoost) on financial transaction data, achieving 87% accuracy and reducing manual fraud review workload by 60%.
- Engineered a hybrid recommendation system combining collaborative and content‐based filtering, which lifted user click‐through rates by 25% and boosted average session duration by 15%.
- Designed and optimized computer vision models (YOLOv5, ResNet) for defect detection on a 30,000‐image manufacturing dataset, attaining 94% precision and quadrupling inspection throughput.
- Built a music‐classification workflow using MFCC feature extraction and CNNs to categorize 50,000+ tracks into genres with 90% accuracy, enhancing personalized playlist curation.
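The hybrid recommendation bullet above can be illustrated with a minimal sketch. It assumes a simple weighted blend of per-item scores, which is only one common way to combine collaborative and content-based filtering; the profile does not say which method was used. The function names, the `alpha` weight, and the sample scores are hypothetical.

```python
def hybrid_scores(collab: dict[str, float],
                  content: dict[str, float],
                  alpha: float = 0.7) -> dict[str, float]:
    """Blend two score tables; alpha weights the collaborative signal.

    Items missing from one table contribute 0.0 from that side
    (an assumption made for this sketch).
    """
    items = set(collab) | set(content)
    return {
        item: alpha * collab.get(item, 0.0) + (1 - alpha) * content.get(item, 0.0)
        for item in items
    }


def top_n(scores: dict[str, float], n: int) -> list[str]:
    """Return the n highest-scoring item ids, ties broken by item id."""
    return sorted(scores, key=lambda item: (-scores[item], item))[:n]


# Hypothetical scores for three items.
collab = {"a": 0.9, "b": 0.2}
content = {"b": 0.8, "c": 0.5}
ranked = top_n(hybrid_scores(collab, content, alpha=0.5), n=2)
```

A single blending weight keeps the trade-off between the two signals explicit and easy to tune against click-through data.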
Graduate Intern
Intel
- Created, characterized, and optimized deep learning workload proxies for healthcare on the Intel Xeon Server platform.
- Provided proof of concept to core architects to measure the performance of Xeon servers or Xeon variants using workload proxies.
- Identified performance bottlenecks in the existing platforms and performed required software optimizations.
Experience
AI-powered Image Editing App with LLM & RAG
The system uses large language models (LLMs) with prompt engineering and retrieval-augmented generation (RAG) to interpret user instructions and apply the requested image edits. The back end uses Docker, REST APIs, and Redis for performance optimization. Containerization and DevOps practices ensure a scalable, production-ready SaaS deployment.
AI-based Content Repurposing Platform with LLMOps and Vector Search
Core components include Vault (video storage), Index (vector search over FAISS, Pinecone, and Qdrant), and LLM-based modules for transcript generation, voiceover synthesis, and automated reel creation. RAG pipelines and LangChain orchestrate retrieval and generation tasks. DevOps and containerization streamline deployment, while LLMOps practices keep AI operations scalable.
LLM-powered Chatbot for Podcasts with Vector Search & RAG
It features automatic episode summarization via LLMs, prompt engineering, and RAG workflows powered by FAISS/Pinecone. An interactive chatbot allows users to query episodes with traceability. The back end also supports contextual ad placement with LangChain and vector data indexing. The platform is containerized using Docker with DevOps pipelines for deployment. Integrated analytics and security features ensure enterprise readiness.
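The retrieval-with-traceability idea described above might look roughly like the following sketch. It is not the project's implementation: a toy bag-of-words similarity stands in for learned embeddings and a FAISS/Pinecone index, and the `Chunk` fields, function names, and sample data are all hypothetical.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    episode_id: str  # hypothetical id, kept so answers can cite their source
    start_sec: int   # hypothetical offset of the chunk within the episode
    text: str


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a real system would use a learned embedding."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[Chunk], k: int = 2) -> list[tuple[float, Chunk]]:
    """Return the k chunks most similar to the query, best first."""
    q = embed(query)
    scored = [(cosine(q, embed(c.text)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_prompt(query: str, hits: list[tuple[float, Chunk]]) -> str:
    """Tag each retrieved chunk with its source so the answer stays traceable."""
    context = "\n".join(f"[{c.episode_id} @ {c.start_sec}s] {c.text}" for _, c in hits)
    return f"Answer using only the sources below and cite them.\n{context}\nQuestion: {query}"


chunks = [
    Chunk("ep-1", 120, "we talk about training marathon runners"),
    Chunk("ep-2", 45, "vector search makes retrieval fast"),
    Chunk("ep-3", 300, "search engines and advertising"),
]
hits = retrieve("how does vector search work", chunks, k=1)
prompt = build_prompt("how does vector search work", hits)
```

Carrying the episode id and timestamp alongside each chunk is what lets the chatbot point users back to the exact passage an answer came from.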
Conversational AI for Calls with Agentic AI and Speech Synthesis
The system supports dynamic LLM-driven dialogues, human-like voice interactions, and real-time vector search for context-aware conversations. Integrated RAG pipelines, LangChain, and LLMOps provide scalability. The platform uses Docker, Redis, REST APIs, and DevOps for enterprise-grade deployment. It also follows security best practices and combines speech synthesis with LLMs such as Claude (Anthropic), Gemini, and Llama.
Education
Master's Degree in Embedded Systems
Institute of Technology, Nirma University - Ahmedabad, India
Bachelor's Degree in Electronics and Communication
Gujarat Technological University - Gandhinagar, India
Certifications
Web Development Bootcamp 2022
Udemy
Crash Course on Python
Coursera
Skills
Libraries/APIs
REST APIs, PyTorch, TensorFlow, Node.js, React, OpenAI API
Tools
Claude, AI Prompts, Azure OpenAI Service, Git, MongoDB Atlas
Languages
Python 3, Python, Regex, JavaScript, SQL, Bash, HTML, CSS
Frameworks
Flask, Streamlit, Next.js
Paradigms
REST, DevOps
Platforms
AWS IoT, Docker, Blockchain, Ollama, Azure, Linux
Storage
Redis, MongoDB, NoSQL
Other
Large Language Models (LLMs), Chatbots, Agentic AI, Retrieval-augmented Generation (RAG), Machine Learning, Deep Learning, FastAPI, Llama, LLVM, Open-source LLMs, Vector Data, Qdrant, Pinecone, FAISS, Weaviate, Containerization, Large Language Model Operations (LLMOps), Computer Vision, OpenAI, Anthropic, Gemini, LangChain, Prompt Engineering, Artificial Intelligence (AI), Blockchain Development, Blockchain Design, ChromaDB, Security, Shell Scripting, Speech Synthesis, Computer Vision Algorithms, AI Chatbots, Speech Recognition