Kalpit Desai, PhD
Verified Expert in Engineering
Machine Learning Developer
Kalpit is a developer with a PhD and over 16 years of experience working with large corporations and startups in machine learning and AI. He has a practiced hand with Python, R, and MATLAB and is known to devise the best data strategies to mine business value with deep learning technologies. Kalpit also specializes in computer vision, time-series analytics, dynamic system modeling, text mining, and industrial process optimization.
Preferred Environment
PyTorch, Keras, TensorFlow, PyCharm, Linux, Data Science, Computer Vision, Sensor Data Pattern Recognition, Python, Scikit-learn, NVIDIA CUDA
The most amazing...
...thing I built by using computer vision was a system to estimate the proportion of liquid, semisolid, and gas in a mixture flowing through a pipe in real time.
Work Experience
Chief Data Scientist
Datakalp
- Built a news article classification and recommendation algorithm based on deep neural network (DNN) and vector-space models.
- Designed and implemented a machine learning algorithm that takes a document as the input, and retrieves duplicate documents as well as related (but not duplicate) documents from a given corpus.
- Developed image segmentation algorithms using Deep U-Net and ResNet blocks that identify areas of mineral deposits from a given seismographic radar image.
- Designed the deep learning architecture for optimizing product yield without any hardware change for a manufacturing unit.
- Wrote a Mask R-CNN-based machine learning algorithm for analyzing and detecting various attributes of people in a video.
- Built a template-free, deep learning-based algorithm to extract information from scanned invoices without using proprietary third-party tools. Used Python, TensorFlow, and OpenCV.
- Developed an algorithm that tells if one should see a doctor based on a picture of one's skin taken from a commodity mobile phone camera using Python and TensorFlow.
- Developed a system that attaches to a commodity camera or existing CCTV and detects in real time if a person enters a door without wearing a proper PPE kit (face mask, helmet, safety goggles, gloves, or safety shoes).
- Developed an algorithm for hand-gesture recognition under challenging lighting and contrast conditions.
- Developed an algorithm to detect if glass bottles moving on a high-speed conveyor belt in a beverage manufacturing plant have a crack.
Director of Data Science
Clustr
- Architected a scalable data-mining system for building a product catalog for small businesses using text mining, named-entity-recognition, and deep-graph reasoning.
- Planned and coordinated the work of a cross-functional team building a unique suite of machine-learning-based products for small businesses.
- Developed algorithms to detect and auto-suggest corrections of typographical mistakes in manually entered data for various processes (billing, tax return filing, and so on).
- Guided the development of a deep-learning model to automatically mine aspect-based sentiment from reviews. The aspects could be the ambiance, food taste, price, and so on for a restaurant, or the story, direction, and acting for a movie.
- Prototyped a text-mining system to automatically categorize the support requests coming from a vast diversity of customers and either route them to appropriate support personnel or send a canned response.
Manager/Lead | Data Science
Bidgely
- Guided and supervised the development of the company's proprietary technology of "disaggregation"—that is, to determine which appliance is using how much energy in a given home, purely through algorithms looking at the whole-house energy meter trace.
- Invented algorithms for detecting events in a home based on whole-house energy consumption.
- Guided the development of algorithms to estimate the solar-power generation for a given home based on weather data at the geographic location and satellite image of the particular home.
Lead Scientist
GE Global Research
- Invented a novel algorithm for vocabulary compression that increased the efficiency of the downstream processing of the text corpora by machine learning algorithms; also patented.
- Led the development of a novel machine learning algorithm that makes alarms and beeps in an intensive care unit (ICU) more relevant by learning from past data; partially published and patented.
- Built a suite of algorithms for dynamically optimizing the control logic of a wind-turbine farm so that the power generated is maximized, the wear and tear are minimized, and the noise produced is well within the regulated upper limit; partially patented.
- Constructed algorithm prototypes to identify various objects in a CT scan and measure their shape and size. The prototypes employed deep-learning-based image segmentation techniques.
- Developed scalable algorithms for summarizing a large corpus of unstructured text documents with heavy domain-specific jargon. For example, one such corpus was a set of around a million email chains capturing emails between the customer, field engineer, subject matter experts, and so on during the maintenance, service, and repair events on a big steam turbine.
Research Scientist
Zargis Medical
- Built the company's proprietary algorithms to detect heart sounds from a stethoscope (audio waveform) and aid doctors in their diagnoses.
- Built Fourier transform- and wavelet transform-based algorithms for preprocessing the acoustic waveform from a stethoscope and transforming the signal into a representation that is suitable for machine learning.
- Built machine learning algorithms to differentiate the subtle third heart sound (S3) from other murmur-related sounds.
- Built a deep time-delay neural network toolkit for extracting events from multiple time-series data.
- Translated core algorithms from MATLAB to production-grade C++.
Graduate Research Assistant
Computer Science Department of University of North Carolina at Chapel Hill
- Assisted in the development of a new microscope for use by physicists. Used linear algebra, signal processing, mathematical modeling, and control theory to develop a novel high-resolution laser-interferometry-based tracking system.
- Developed a mathematical model of a 3D magnetic force exertion system to determine the directions of strong forces and validated the model using actual data.
Experience
Toolkit for Building Time-delayed Neural Networks (Octave, MATLAB)
https://github.com/kvdesai/tdnn-toolkit
Over the years, the toolkit has been a powerful resource for event detection from a variety of temporal signals: phonocardiograph, seismograph, power consumption, patient-monitoring devices in an ICU, and so on. I recently released the toolkit publicly on GitHub.
Workshop on Topic Modeling
https://www.slideshare.net/slideshow/embed_code/key/4jFQiYAp8o9tZV
Optimal Control of Wind Turbine Farms
https://patents.google.com/patent/US20160032894
Hemodynamic Impact-based Prioritization of Ventricular Tachycardia Alarms
https://ieeexplore.ieee.org/document/6944366
Event Detection in a Home
https://patents.google.com/patent/US20180231603A1/
Patented the Algorithm for Hemodynamic Impact-based Prioritization of Ventricular Tachycardia Alarms
https://patents.google.com/patent/US20150238151A1
The Design and Architecture of a Scalable Machine Learning Pipeline to Build a Product Catalog
Award-winning Algorithm for the Wikipedia Challenge
https://github.com/kvdesai/wikipedia-challenge
This ensemble model won us the honorable mention prize in the contest. The code is written in R and Python.
Here is the related paper explaining the mathematics behind the approach.
https://arxiv.org/ftp/arxiv/papers/1405/1405.7393.pdf
Related Paper to the Algorithm for the Wikipedia Challenge
https://arxiv.org/ftp/arxiv/papers/1405/1405.7393.pdf
Video-based Defect Detection on Automobile Silencers
Technologies: Computer Vision, Object Detection, Nvidia Jetson, TensorRT, Python, C++
Skills
Languages
Python, SQL, C, R, C++
Libraries/APIs
OpenCV, Scikit-learn, TensorFlow, Keras, Natural Language Toolkit (NLTK), Spark ML, PyTorch, Google Vision API
Tools
MATLAB, You Only Look Once (YOLO), NVIDIA Jetson, Jetson TX2, PyCharm, Amazon SageMaker, Microsoft Power BI
Paradigms
Data Science, Automation, B2B
Platforms
NVIDIA CUDA, Amazon Web Services (AWS), Google Cloud Platform (GCP), Docker, Linux, Raspberry Pi
Other
Predictive Modeling, Mathematics, Probability Theory, Probabilistic Graphical Models, Computer Vision, Artificial Intelligence (AI), Time Series Analysis, Pattern Recognition, Text Mining, Signal Processing, Machine Learning, Sensor Data Pattern Recognition, Natural Language Processing (NLP), Predictive Analytics, Architecture, Computer Vision Algorithms, Image Recognition, Research, Neural Networks, Startups, Health, Data Analysis, Integration, Time Series, Statistical Analysis, Convolutional Neural Networks (CNN), Image Analysis, Models, Writing & Editing, K-nearest Neighbors (KNN), Leadership, OCR, System Architecture, Algorithms, Bayesian Statistics, Statistics, Deep Neural Networks, Statistical Learning, Linear Algebra, Image Processing, Statistical Modeling, Data Modeling, Gated Recurrent Unit (GRU), Recurrent Neural Networks (RNNs), Dynamic Systems Modeling, Programming, Applied Mathematics, Deep Learning, Recommendation Systems, Natural Language Understanding (NLU), Natural Language Queries, Roadmaps, Technology, GPT, Generative Pre-trained Transformers (GPT), Optimization Algorithms, Audits, Strategy, Product Consultant, Speech Synthesis, Text to Speech (TTS), Machine Learning Operations (MLOps), Language Models, Web Development, APIs, API Integration, Medical Diagnostics, BERT, Generative Artificial Intelligence (GenAI), Audio Processing, Quotations, ChatGPT, Chatbots, PDF, Google BigQuery, University Teaching, Industrial Internet of Things (IIoT), Data Structures, GraphDB, Data Engineering, Reinforcement Learning, Genomics, Hugging Face, Layout, LangChain, Large Language Models (LLMs), Biometrics, Deep Reinforcement Learning, Electrical Design, Sensor Data, Microcontroller Programming, Google Ads, OpenAI API, WhatsApp
Storage
MySQL, Redis, MongoDB, Cassandra, PostgreSQL
Education
PhD Degree in Biomedical Engineering and Computer Science
University of North Carolina at Chapel Hill - Chapel Hill, NC, USA
Bachelor's Degree in Electrical Engineering
Gujarat University, LD College of Engineering - Ahmedabad, Gujarat, India
Certifications
Machine Learning
Stanford University via Coursera
Certified Associate in Project Management
PMI | Project Management Institute