Abhi Panchal, Developer in Ahmedabad, Gujarat, India

Abhi Panchal

Verified Expert in Engineering

Bio

Abhi is an ML and Generative AI engineer with 5+ years of experience delivering production-grade AI solutions. He specializes in LLMs, RAG, diffusion models, transformer architectures, chain-of-thought prompting, fine-tuning, and prompt engineering. Abhi is skilled in deep learning, computer vision, NLP, embeddings, vector databases, and multi-modal systems. He designs scalable ML pipelines with GPU acceleration, low-latency inference, MLOps (CI/CD, monitoring), and containerized deployments.

Portfolio

Codiste Pvt. Ltd.
Python 3, REST APIs, Node.js, Bash, Git, Flask, SQL, NoSQL, Python, MongoDB...
Codiste Pvt. Ltd.
Python 3, REST APIs, Node.js, Bash, Git, Flask, SQL, NoSQL, Python, MongoDB...
Intel
Python 3, Bash, Shell Scripting, Python, Artificial Intelligence (AI)...

Experience

  • Python 3 - 3 years

Availability

Full-time

Preferred Environment

Python 3, Large Language Models (LLMs), Chatbots, Agentic AI, Retrieval-augmented Generation (RAG), Machine Learning, Deep Learning, AWS IoT, FastAPI, REST

The most amazing...

...achievement was building a scalable conversational AI system with multi-agent LLM orchestration and voice integration to automate real-time customer support.

Work Experience

Senior AI/ML Engineer

2021 - PRESENT
Codiste Pvt. Ltd.
  • Architected and led deployment of a production-grade LLM platform serving 50,000+ daily queries, achieving 99.5% uptime and reducing inference costs by 35%.
  • Directed fine-tuning of transformer models (T5, GPT-NeoX) on domain-specific corpora, improving downstream task accuracy by 28% and cutting hallucinations by 40%.
  • Pioneered a RAG pipeline integrating FAISS and custom embeddings, slashing average document retrieval time to 75 ms and boosting relevance scores by 32%.
  • Championed chain-of-thought prompting and prompt-engineering frameworks, increasing answer coherence in multi-step reasoning tasks by 22%.
  • Oversaw end-to-end GenAI MLOps, including CI/CD, model versioning, monitoring, and cost-optimized GPU autoscaling, cutting release cycles from weeks to days.
Technologies: Python 3, REST APIs, Node.js, Bash, Git, Flask, SQL, NoSQL, Python, MongoDB, Agentic AI, AWS IoT, Llama, LLVM, Open-source LLMs, Large Language Models (LLMs), Retrieval-augmented Generation (RAG), Vector Data, Qdrant, Pinecone, FAISS, Weaviate, PyTorch, TensorFlow, Docker, Containerization, DevOps, Large Language Model Operations (LLMOps), Computer Vision, OpenAI, Claude, Anthropic, Gemini, Chatbots, LangChain, Prompt Engineering, Redis, Artificial Intelligence (AI), Blockchain, Blockchain Development, Blockchain Design, ChromaDB, Ollama, Regex, Streamlit, AI Prompts, Azure, Azure OpenAI Service

Machine Learning Engineer

2020 - 2022
Codiste Pvt. Ltd.
  • Developed and deployed traditional ML pipelines (Random Forest, XGBoost) on financial transaction data, achieving 87% accuracy and reducing manual fraud review workload by 60%.
  • Engineered a hybrid recommendation system combining collaborative and content-based filtering, which lifted user click-through rates by 25% and boosted average session duration by 15%.
  • Designed and optimized computer vision models (YOLOv5, ResNet) for defect detection on a 30,000-image manufacturing dataset, attaining 94% precision and quadrupling inspection throughput.
  • Built a music-classification workflow using MFCC feature extraction and CNNs to categorize 50,000+ tracks into genres with 90% accuracy, enhancing personalized playlist curation.
Technologies: Python 3, REST APIs, Node.js, Bash, Git, Flask, SQL, NoSQL, Python, MongoDB, Agentic AI, AWS IoT, Llama, LLVM, Open-source LLMs, Large Language Models (LLMs), Retrieval-augmented Generation (RAG), Vector Data, Qdrant, Pinecone, FAISS, Weaviate, PyTorch, TensorFlow, Docker, Containerization, DevOps, Large Language Model Operations (LLMOps), Computer Vision, OpenAI, Claude, Anthropic, Gemini, Chatbots, LangChain, Prompt Engineering, Redis, Artificial Intelligence (AI), Blockchain, Blockchain Development, Blockchain Design, ChromaDB, Ollama, Regex, Streamlit, AI Prompts, Azure, Azure OpenAI Service

Graduate Intern

2018 - 2019
Intel
  • Created, characterized, and optimized deep learning workload proxies for healthcare on the Intel Xeon Server platform.
  • Provided proof of concept to core architects to measure the performance of Xeon servers or Xeon variants using workload proxies.
  • Identified performance bottlenecks in the existing platforms and performed required software optimizations.
Technologies: Python 3, Bash, Shell Scripting, Python, Artificial Intelligence (AI), Blockchain, Blockchain Development, Blockchain Design, Streamlit

Experience

AI-powered Image Editing App with LLM & RAG

This web application lets users edit images using natural language prompts through an AI-first interface. Built with Python 3, FastAPI, and the OpenAI API, it supports editing functions including cropping, scaling, resizing, color correction, and brightness and contrast adjustments.

The system leverages large language models (LLMs) with prompt engineering and retrieval-augmented generation (RAG) to understand user instructions and apply changes efficiently. The back end utilizes Docker, REST APIs, and Redis for performance optimization. Containerization and DevOps practices ensure a scalable, production-ready SaaS deployment.
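One way such a prompt-to-edit flow could work is sketched below. It assumes, hypothetically, that the LLM returns a JSON list of operations, which a small dispatcher validates and applies in order. The schema, the function names, and the grayscale nested-list image are illustrative assumptions, not the application's actual code.

```python
import json

# Hypothetical structured output an LLM might return for a prompt such as
# "crop to the top-left 2x2 area and brighten it a little".
# The JSON schema here is an assumption made for illustration.
LLM_RESPONSE = (
    '[{"op": "crop", "left": 0, "top": 0, "width": 2, "height": 2},'
    ' {"op": "brightness", "delta": 20}]'
)


def clamp(value):
    """Keep a pixel value inside the 0-255 range."""
    return max(0, min(255, int(round(value))))


def crop(image, left, top, width, height):
    """Return the sub-image starting at (left, top)."""
    return [row[left:left + width] for row in image[top:top + height]]


def brightness(image, delta):
    """Shift every pixel by delta."""
    return [[clamp(p + delta) for p in row] for row in image]


def contrast(image, factor):
    """Scale each pixel's distance from mid-gray (128) by factor."""
    return [[clamp((p - 128) * factor + 128) for p in row] for row in image]


# Only operations listed here can run, so unexpected LLM output is rejected.
OPERATIONS = {"crop": crop, "brightness": brightness, "contrast": contrast}


def apply_edits(image, llm_json):
    """Parse the LLM's operation list and apply each step in order."""
    for step in json.loads(llm_json):
        params = dict(step)
        name = params.pop("op")
        if name not in OPERATIONS:
            raise ValueError(f"unsupported operation: {name}")
        image = OPERATIONS[name](image, **params)
    return image


# A tiny grayscale image as nested lists (rows of pixel values).
image = [[10, 20, 30], [40, 50, 60], [250, 128, 0]]
edited = apply_edits(image, LLM_RESPONSE)
```

Restricting execution to a fixed table of operations keeps the model's output from triggering anything outside the supported edit set.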

AI-based Content Repurposing Platform with LLMOps and Vector Search

A versatile SaaS platform for enterprise content repurposing, built with Python 3, MongoDB, FastAPI, and Docker. It automates video management and enhancement using deep learning and computer vision.

Core components include Vault (video storage), Index (vector search with FAISS, Pinecone, and Qdrant), and LLM-based modules for transcript generation, voiceover synthesis, and automated reel creation. RAG pipelines and LangChain orchestrate retrieval and generation tasks. DevOps and containerization streamline deployment, while LLMOps ensures scalable AI operations.

LLM-powered Chatbot for Podcasts with Vector Search & RAG

An advanced AI-powered chatbot for podcasts designed to enhance listener engagement and provide actionable insights. It was built using Python 3, FastAPI, MongoDB, and Redis.

It features automatic episode summarization via LLMs, prompt engineering, and RAG workflows powered by FAISS/Pinecone. An interactive chatbot allows users to query episodes with traceability. The back end also supports contextual ad placement with LangChain and vector data indexing. The platform is containerized using Docker with DevOps pipelines for deployment. Integrated analytics and security features ensure enterprise readiness.
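Retrieval with traceability can be illustrated with a minimal sketch, assuming each transcript chunk is stored with its episode ID and timestamp so that answers can cite their source. The toy embeddings, chunk data, and function names are hypothetical. A production system would use a real embedding model and a vector index such as FAISS or Pinecone rather than this brute-force search.

```python
import math

# Toy index of transcript chunks: (episode_id, timestamp, text, embedding).
# All values are made up for illustration.
CHUNKS = [
    ("ep1", "00:02:10", "Intro to vector search", [1.0, 0.0, 0.0]),
    ("ep1", "00:15:42", "Sponsorship and ads", [0.0, 1.0, 0.0]),
    ("ep2", "00:05:00", "Scaling retrieval systems", [0.9, 0.1, 0.0]),
]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_embedding, k=2):
    """Return the k most similar chunks, each with its source reference."""
    scored = [
        (cosine(query_embedding, emb), ep, ts, text)
        for ep, ts, text, emb in CHUNKS
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [
        {"episode": ep, "timestamp": ts, "text": text, "score": score}
        for score, ep, ts, text in scored[:k]
    ]


# The query embedding would normally come from the same model as the chunks.
results = retrieve([1.0, 0.0, 0.0])
```

Each result carries the episode and timestamp it came from, so a generated answer can point listeners back to the exact passage.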

Conversational AI for Calls with Agentic AI and Speech Synthesis

This conversational AI platform automates inbound and outbound voice interactions for customer support and lead generation. It was built using agentic AI principles, Python 3, FastAPI, speech synthesis, and the OpenAI API.

The system supports dynamic LLM-driven dialogues, human-like voice interactions, and real-time vector search for context-aware conversations. Integrated RAG pipelines, LangChain, and LLMOps provide scalability. The platform uses Docker, Redis, REST APIs, and DevOps for enterprise-grade deployment. It also follows security best practices and pairs advanced speech synthesis with LLMs such as Anthropic's Claude, Gemini, and Llama.

Education

2017 - 2019

Master's Degree in Embedded Systems

Institute of Technology, Nirma University - Ahmedabad, India

2013 - 2017

Bachelor's Degree in Electronics and Communication

Gujarat Technological University - Gandhinagar, India

Certifications

JULY 2022 - PRESENT

Web Development Bootcamp 2022

Udemy

MARCH 2020 - PRESENT

Crash Course on Python

Coursera

Skills

Libraries/APIs

REST APIs, PyTorch, TensorFlow, Node.js, React, OpenAI API

Tools

Claude, AI Prompts, Azure OpenAI Service, Git, MongoDB Atlas

Languages

Python 3, Python, Regex, JavaScript, SQL, Bash, HTML, CSS

Frameworks

Flask, Streamlit, Next.js

Paradigms

REST, DevOps

Platforms

AWS IoT, Docker, Blockchain, Ollama, Azure, Linux

Storage

Redis, MongoDB, NoSQL

Other

Large Language Models (LLMs), Chatbots, Agentic AI, Retrieval-augmented Generation (RAG), Machine Learning, Deep Learning, FastAPI, Llama, LLVM, Open-source LLMs, Vector Data, Qdrant, Pinecone, FAISS, Weaviate, Containerization, Large Language Model Operations (LLMOps), Computer Vision, OpenAI, Anthropic, Gemini, LangChain, Prompt Engineering, Artificial Intelligence (AI), Blockchain Development, Blockchain Design, ChromaDB, Security, Shell Scripting, Speech Synthesis, Computer Vision Algorithms, AI Chatbots, Speech Recognition
