Kalpit Desai, PhD, Developer in Bengaluru, Karnataka, India

Kalpit Desai, PhD

Verified Expert in Engineering

Machine Learning Developer

Location
Bengaluru, Karnataka, India
Toptal Member Since
November 19, 2018

Kalpit is a developer with a PhD and over 16 years of experience working with large corporations and startups in machine learning and AI. He has a practiced hand with Python, R, and MATLAB and is known to devise the best data strategies to mine business value with deep learning technologies. Kalpit also specializes in computer vision, time-series analytics, dynamic system modeling, text mining, and industrial process optimization.

Portfolio

Datakalp
C++, C, Keras, TensorFlow, PyTorch, R, Python, Signal Processing, Deep Learning...
Clustr
R, Docker, Python, Data Science, Machine Learning, Statistics, Algorithms...
Bidgely
Machine Learning, Pattern Recognition, Signal Processing, R, Python, MATLAB...

Experience

Availability

Part-time

Preferred Environment

PyTorch, Keras, TensorFlow, PyCharm, Linux, Data Science, Computer Vision, Sensor Data Pattern Recognition, Python, Scikit-learn, NVIDIA CUDA

The most amazing...

...thing I built by using computer vision was a system to estimate the proportion of liquid, semisolid, and gas in a mixture flowing through a pipe in real time.

Work Experience

Chief Data Scientist

2018 - PRESENT
Datakalp
  • Built a news article classification and recommendation algorithm based on deep neural network (DNN) and vector-space models.
  • Designed and implemented a machine learning algorithm that takes a document as the input, and retrieves duplicate documents as well as related (but not duplicate) documents from a given corpus.
  • Developed image segmentation algorithms using Deep U-Net and ResNet blocks that identify areas of mineral deposits from a given seismographic radar image.
  • Designed the deep learning architecture for optimizing product yield without any hardware change for a manufacturing unit.
  • Wrote a Mask R-CNN-based machine learning algorithm for analyzing and detecting various attributes of people in a video.
  • Built a deep learning-based algorithm to extract information from scanned invoices where no templates were necessary, without using proprietary third-party tools. Used Python, TensorFlow, and OpenCV.
  • Developed an algorithm that tells if one should see a doctor based on a picture of one's skin taken from a commodity mobile phone camera using Python and TensorFlow.
  • Developed a system that attaches to a commodity camera or existing CCTV and detects in real time if a person enters a door without wearing proper PPE (face mask, helmet, safety goggles, gloves, or safety shoes).
  • Developed an algorithm for hand-gesture recognition under challenging lighting and contrast conditions.
  • Developed an algorithm to detect if glass bottles moving on a high-speed conveyor belt in a beverage manufacturing plant have a crack.
Technologies: C++, C, Keras, TensorFlow, PyTorch, R, Python, Signal Processing, Deep Learning, Data Science, Machine Learning, Computer Vision, Sensor Data Pattern Recognition, Predictive Analytics, Architecture, Data Modeling, Hugging Face, Artificial Intelligence (AI), Computer Vision Algorithms, OpenCV, Image Recognition, Research, Audits, Strategy, Product Consultant, Natural Language Processing (NLP), Scikit-learn, Speech Synthesis, Text to Speech (TTS), Recommendation Systems, LangChain, Large Language Models (LLMs), PostgreSQL, Recurrent Neural Networks (RNNs), Amazon Web Services (AWS), Neural Networks, Amazon SageMaker, Machine Learning Operations (MLOps), Startups, Health, GPT, Generative Pre-trained Transformers (GPT), Language Models, Data Analysis, Integration, Web Development, APIs, API Integration, Biometrics, Reinforcement Learning, Time Series, Deep Reinforcement Learning, MySQL, Statistical Analysis, Convolutional Neural Networks (CNN), Data Engineering, Google Ads, Image Processing, Image Analysis, BERT, Google Vision API, OpenAI API, NVIDIA CUDA, Jetson TX2, Predictive Modeling, Microsoft Power BI, Models, Writing & Editing, Generative Artificial Intelligence (GenAI), Audio Processing, Quotations, ChatGPT, Chatbots, Automation, WhatsApp, B2B, You Only Look Once (YOLO), K-nearest Neighbors (KNN)

Director of Data Science

2016 - 2018
Clustr
  • Architected a scalable data-mining system for building a product catalog for small businesses using text mining, named-entity-recognition, and deep-graph reasoning.
  • Planned and coordinated the work of a cross-functional team building a unique suite of machine-learning-based products for small businesses.
  • Developed algorithms to detect and auto-suggest corrections of typographical mistakes in manually entered data for various processes (billing, tax return filing, and so on).
  • Guided the development of a deep-learning model to automatically mine aspect-based sentiment from reviews. The aspects could be the ambiance, food taste, price, and so on for a restaurant, or the story, direction, and acting for a movie.
  • Prototyped a text-mining system to automatically categorize support requests coming from a wide variety of customers and either route them to the appropriate support personnel or send a canned response.
Technologies: R, Docker, Python, Data Science, Machine Learning, Statistics, Algorithms, Deep Learning, Predictive Analytics, Architecture, Data Modeling, Artificial Intelligence (AI), Computer Vision Algorithms, OpenCV, Research, Strategy, Product Consultant, Natural Language Processing (NLP), Scikit-learn, Text to Speech (TTS), Recommendation Systems, PostgreSQL, Recurrent Neural Networks (RNNs), Amazon Web Services (AWS), Neural Networks, Amazon SageMaker, Machine Learning Operations (MLOps), Startups, Language Models, Data Analysis, Integration, Web Development, APIs, API Integration, Reinforcement Learning, MySQL, Statistical Analysis, Convolutional Neural Networks (CNN), Data Engineering, Image Processing, Image Analysis, BERT, NVIDIA CUDA, Predictive Modeling, Microsoft Power BI, Models, Writing & Editing, Quotations, Chatbots, B2B, You Only Look Once (YOLO), K-nearest Neighbors (KNN)

Manager/Lead | Data Science

2014 - 2016
Bidgely
  • Guided and supervised the development of the company's proprietary technology of "disaggregation"—that is, to determine which appliance is using how much energy in a given home, purely through algorithms looking at the whole-house energy meter trace.
  • Invented algorithms for detecting events in a home based on whole-house energy consumption.
  • Guided the development of algorithms to estimate the solar-power generation for a given home based on weather data at the geographic location and satellite image of the particular home.
Technologies: Machine Learning, Pattern Recognition, Signal Processing, R, Python, MATLAB, Data Science, Sensor Data Pattern Recognition, Computer Vision, Predictive Analytics, Architecture, Data Modeling, Artificial Intelligence (AI), Optimization Algorithms, OpenCV, Image Recognition, Research, Scikit-learn, Recommendation Systems, Amazon Web Services (AWS), Neural Networks, Machine Learning Operations (MLOps), Startups, Data Analysis, Integration, APIs, API Integration, Time Series, MySQL, Statistical Analysis, Natural Language Processing (NLP), Image Processing, Image Analysis, Predictive Modeling, Microsoft Power BI, Models, Writing & Editing, Automation, B2B, K-nearest Neighbors (KNN)

Lead Scientist

2010 - 2014
GE Global Research
  • Invented a novel algorithm for vocabulary compression that increased the efficiency of the downstream processing of the text corpora by machine learning algorithms; also patented.
  • Led the development of a novel machine learning algorithm that makes alarms and beeps in an intensive care unit (ICU) more relevant by learning from past data; partially published and patented.
  • Built a suite of algorithms for dynamically optimizing the control logic of a wind-turbine farm so that the power generated is maximized, the wear and tear are minimized, and the noise produced is well within the regulated upper limit; partially patented.
  • Constructed algorithm prototypes to identify various objects in a CT scan and measure their shape and size. These prototypes were deployed in deep-learning-based image segmentation techniques.
  • Developed scalable algorithms for summarizing a large corpus of unstructured text documents with heavy domain-specific jargon. For example, one such corpus was a set of around a million email chains capturing emails between the customer, field engineer, subject matter experts, and so on during the maintenance, service, and repair events on a big steam turbine.
Technologies: Pattern Recognition, Machine Learning, Signal Processing, C++, C, MATLAB, Python, R, Data Science, Deep Learning, Sensor Data Pattern Recognition, Statistics, Predictive Analytics, Architecture, Artificial Intelligence (AI), Layout, Optimization Algorithms, Research, Natural Language Processing (NLP), Scikit-learn, Recommendation Systems, PostgreSQL, Recurrent Neural Networks (RNNs), Neural Networks, Health, Data Analysis, Biometrics, Reinforcement Learning, Time Series, MySQL, Statistical Analysis, Medical Diagnostics, Image Processing, Image Analysis, Predictive Modeling, Models, Writing & Editing, Automation, K-nearest Neighbors (KNN)

Research Scientist

2007 - 2010
Zargis Medical
  • Built the company's proprietary algorithms to detect heart sounds from a stethoscope (audio waveform) and aid doctors in their diagnoses.
  • Built Fourier-transform- and wavelet-transform-based algorithms for preprocessing the acoustic waveform from a stethoscope and transforming the signal into a representation suitable for machine learning.
  • Built machine learning algorithms to differentiate the subtle third heart sound (S3) from other murmur-related sounds.
  • Built a deep time-delay neural network toolkit for extracting events from multiple time-series data.
  • Translated core algorithms from MATLAB to production-grade C++.
Technologies: Machine Learning, Pattern Recognition, Signal Processing, Python, MATLAB, Data Science, Sensor Data Pattern Recognition, Predictive Analytics, Artificial Intelligence (AI), Research, Neural Networks, Startups, Health, Data Analysis, Medical Diagnostics, Image Analysis, Models, Writing & Editing, Audio Processing, Chatbots, K-nearest Neighbors (KNN)

Graduate Research Assistant

2002 - 2007
Computer Science Department of University of North Carolina at Chapel Hill
  • Assisted in the development of a new microscope for use by physicists. Used linear algebra, signal processing, mathematical modeling, and control theory to develop a novel high-resolution laser-interferometry-based tracking system.
  • Developed a mathematical model of a 3D magnetic force exertion system to determine the directions of strong forces and validated the model using actual data.
Technologies: C++, C, MATLAB, Sensor Data Pattern Recognition, Statistics, Research, Data Analysis, Biometrics, Time Series, Medical Diagnostics, Image Analysis

Toolkit for Building Time-delayed Neural Networks (Octave, MATLAB)

https://github.com/kvdesai/tdnn-toolkit
I wrote a complete, self-contained toolkit for constructing, configuring, training and applying time-delayed neural networks on time-series data.

Over the years, the toolkit has been a powerful resource for event detection from a variety of temporal signals, including phonocardiographs, seismographs, power consumption, and patient-monitoring devices in an ICU. I recently released the toolkit publicly on GitHub.
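
As a rough illustration of the idea behind a time-delayed network (not the toolkit's actual code, which targets Octave and MATLAB), the following Python sketch builds the delayed-input windows that such a network consumes and evaluates a single tanh unit on one window. The names `delay_embed` and `tdnn_unit` are hypothetical and exist only for this example.

```python
import math
from typing import List, Sequence


def delay_embed(signal: Sequence[float], n_delays: int) -> List[List[float]]:
    """Build delayed-input windows from a time series.

    Returns one row per time step t >= n_delays, holding
    [x[t], x[t-1], ..., x[t-n_delays]].
    """
    if n_delays < 0:
        raise ValueError("n_delays must be non-negative")
    rows = []
    for t in range(n_delays, len(signal)):
        rows.append([signal[t - d] for d in range(n_delays + 1)])
    return rows


def tdnn_unit(window: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """One time-delay unit: weighted sum over the delayed samples, then tanh."""
    if len(window) != len(weights):
        raise ValueError("window and weights must have the same length")
    activation = sum(w * x for w, x in zip(weights, window)) + bias
    return math.tanh(activation)
```

Calling `delay_embed(samples, 2)` on a series yields windows of three samples each (the current sample plus two delays). Because the same weights are applied to every window, the unit responds to a pattern regardless of where it occurs in time.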

Workshop on Topic Modeling

https://www.slideshare.net/slideshow/embed_code/key/4jFQiYAp8o9tZV
I conducted a full-day workshop on topic modeling from an arbitrary text corpus for attendees with five to ten years of experience in machine learning. The workshop covered the theory (available at the link) in detail, including the mathematical formulations, as well as practice, in which attendees applied the techniques to two distinct corpora.

Optimal Control of Wind Turbine Farms

https://patents.google.com/patent/US20160032894
I developed novel algorithms that optimize the control logic of various wind turbines operating in a wind turbine farm. The approach included building a mathematical model of the dynamics of wind due to wake effects and also modeling the behavior of each turbine by predicting power that would be generated by the turbine as a function of the wind speed, wind direction, air temperature, air humidity, blade angles, rotor speed, yaw, pitch, roll, and so on. The algorithms were also patented.

Hemodynamic Impact-based Prioritization of Ventricular Tachycardia Alarms

https://ieeexplore.ieee.org/document/6944366
Ventricular tachycardia (V-tach) is a very serious condition that occurs when the ventricles are driven at high rates. However, almost half of the V-tach alarms declared through the processing of patterns observed in electrocardiography are not clinically actionable. The focus of this project was to provide guidance on whether a technically correct V-tach alarm is clinically actionable by determining its “hemodynamic impact.” A predictive, supervised machine learning approach based on conditional inference trees was employed to determine the hemodynamic impact of a V-tach alarm.

Event Detection in a Home

https://patents.google.com/patent/US20180231603A1/
I built novel algorithms for detecting events in a home from energy profile data and energy waveforms for the home. The algorithms leverage pattern recognition, statistics, and machine learning. The algorithms are also patented.
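
To make the notion of an "event" concrete, here is a minimal Python sketch that flags large step changes in a whole-house power trace. It is a simple thresholding heuristic assumed for illustration, not the patented method. The function name `detect_step_events`, the sample trace, and the 1,000 W threshold are all hypothetical.

```python
from typing import List, Sequence, Tuple


def detect_step_events(
    power_watts: Sequence[float], threshold_watts: float
) -> List[Tuple[int, str, float]]:
    """Flag sample indices where power jumps by at least threshold_watts.

    Returns (index, "on" or "off", delta) for each detected step, where
    delta is the change from the previous sample.
    """
    events = []
    for i in range(1, len(power_watts)):
        delta = power_watts[i] - power_watts[i - 1]
        if abs(delta) >= threshold_watts:
            events.append((i, "on" if delta > 0 else "off", delta))
    return events


# Hypothetical trace: a 1,500 W appliance switches on at sample 2 and off at sample 5.
trace = [100, 100, 1600, 1600, 1600, 100]
events = detect_step_events(trace, threshold_watts=1000)
```

A real meter trace is noisy and contains overlapping appliances, so a production system would need smoothing and pattern recognition beyond this single-sample difference.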

Patented the Algorithm for Hemodynamic Impact-based Prioritization of Ventricular Tachycardia Alarms

https://patents.google.com/patent/US20150238151A1
The algorithm for hemodynamic impact-based prioritization of V-tach alarms is also patented; the patent is available at the link above.

The Design and Architecture of a Scalable Machine Learning Pipeline to Build a Product Catalog

The goal of this project was to build a system that can create and update a dynamic catalog of products based on information coming from a diverse variety of sources and in multiple languages (e.g., product masters coming from distributors, data from small business transactions, product reviews, and so on). The system heavily leveraged machine learning algorithms for deciphering a noisy textual mention of a product and anchoring it onto a known entity in the catalog. The system also leveraged a good dose of data engineering to ensure scalability and robustness.

Award-winning Algorithm for the Wikipedia Challenge

https://github.com/kvdesai/wikipedia-challenge
This was a real-world prediction problem posed as the IEEE International Conference on Data Mining (ICDM) 2011 contest. The goal was to predict how many edits a Wikipedia editor would make in the next six months, based on past data.

This ensemble model won us the honorable mention prize in the contest. The code is written in R and Python.

Here is the related paper explaining the mathematics behind the approach.
https://arxiv.org/ftp/arxiv/papers/1405/1405.7393.pdf

Video-based Defect Detection on Automobile Silencers

The client was an automobile silencer manufacturer. I built a novel algorithm to detect whether a newly manufactured silencer had any defect, based on a multi-camera video feed of the silencer. The algorithm was deployed on the NVIDIA Jetson Nano edge computing platform.

Technologies: Computer Vision, Object Detection, NVIDIA Jetson, TensorRT, Python, C++

Languages

Python, SQL, C, R, C++

Libraries/APIs

OpenCV, Scikit-learn, TensorFlow, Keras, Natural Language Toolkit (NLTK), Spark ML, PyTorch, Google Vision API

Tools

MATLAB, You Only Look Once (YOLO), NVIDIA Jetson, Jetson TX2, PyCharm, Amazon SageMaker, Microsoft Power BI

Paradigms

Data Science, Automation, B2B

Platforms

NVIDIA CUDA, Amazon Web Services (AWS), Google Cloud Platform (GCP), Docker, Linux, Raspberry Pi

Other

Predictive Modeling, Mathematics, Probability Theory, Probabilistic Graphical Models, Computer Vision, Artificial Intelligence (AI), Time Series Analysis, Pattern Recognition, Text Mining, Signal Processing, Machine Learning, Sensor Data Pattern Recognition, Natural Language Processing (NLP), Predictive Analytics, Architecture, Computer Vision Algorithms, Image Recognition, Research, Neural Networks, Startups, Health, Data Analysis, Integration, Time Series, Statistical Analysis, Convolutional Neural Networks (CNN), Image Analysis, Models, Writing & Editing, K-nearest Neighbors (KNN), Leadership, OCR, System Architecture, Algorithms, Bayesian Statistics, Statistics, Deep Neural Networks, Statistical Learning, Linear Algebra, Image Processing, Statistical Modeling, Data Modeling, Gated Recurrent Unit (GRU), Recurrent Neural Networks (RNNs), Dynamic Systems Modeling, Programming, Applied Mathematics, Deep Learning, Recommendation Systems, Natural Language Understanding (NLU), Natural Language Queries, Roadmaps, Technology, GPT, Generative Pre-trained Transformers (GPT), Optimization Algorithms, Audits, Strategy, Product Consultant, Speech Synthesis, Text to Speech (TTS), Machine Learning Operations (MLOps), Language Models, Web Development, APIs, API Integration, Medical Diagnostics, BERT, Generative Artificial Intelligence (GenAI), Audio Processing, Quotations, ChatGPT, Chatbots, PDF, Google BigQuery, University Teaching, Industrial Internet of Things (IIoT), Data Structures, GraphDB, Data Engineering, Reinforcement Learning, Genomics, Hugging Face, Layout, LangChain, Large Language Models (LLMs), Biometrics, Deep Reinforcement Learning, Electrical Design, Sensor Data, Microcontroller Programming, Google Ads, OpenAI API, WhatsApp

Storage

MySQL, Redis, MongoDB, Cassandra, PostgreSQL

2002 - 2007

PhD Degree in Biomedical Engineering and Computer Science

University of North Carolina at Chapel Hill - Chapel Hill, NC, USA

1998 - 2002

Bachelor's Degree in Electrical Engineering

Gujarat University, LD College of Engineering - Ahmedabad, Gujarat, India

JANUARY 2014 - PRESENT

Machine Learning

Stanford University via Coursera

JANUARY 2007 - JANUARY 2012

Certified Associate in Project Management

PMI | Project Management Institute
