Chief Data Scientist
2018 - PRESENT
Datakalp, LLP
Technologies: Deep Learning, Signal Processing, Python, R, PyTorch, TensorFlow, Keras, C, C++
- Built a news article classification and recommendation algorithm based on deep neural networks (DNNs) and vector-space models.
- Designed and implemented a machine learning algorithm that takes a document as input and retrieves both duplicate and related (but not duplicate) documents from a given corpus.
- Developed image segmentation algorithms using Deep U-Net and ResNet blocks that identify areas of mineral deposits from a given seismographic radar image.
- Designed a deep learning architecture that optimizes product yield for a manufacturing unit without any hardware changes.
- Wrote a Mask R-CNN-based machine learning algorithm for analyzing and detecting various attributes of people in a video.
- Built a deep learning-based algorithm, using Python, TensorFlow, and OpenCV, that extracts information from scanned invoices without templates or proprietary third-party tools.
- Developed an algorithm, using Python and TensorFlow, that indicates whether one should see a doctor based on a picture of one's skin taken with a commodity mobile phone camera.
Director of Data Science
2016 - 2018
Tally Analytics, Pvt Ltd | ClustrData
Technologies: Python, Docker, R
- Architected a scalable data-mining system for building a product catalog for small businesses using text mining, named-entity recognition, and deep-graph reasoning.
- Planned and coordinated the work of a cross-functional team building a unique suite of machine-learning-based products for small businesses.
- Developed algorithms to detect and auto-suggest corrections of typographical mistakes in manually entered data for various processes (billing, tax return filing, and so on).
- Guided the development of a deep-learning model to automatically mine aspect-based sentiment from reviews. For example, the aspects could be ambiance, food taste, price, and so on for a restaurant review, or story, direction, acting, and so on for a movie review.
- Prototyped a text-mining system to automatically categorize support requests from a highly diverse customer base and either route them to the appropriate support personnel or send a canned response.
Manager and Lead—Data Science
2014 - 2016
Bidgely Technologies, Pvt Ltd
Technologies: MATLAB, Python, R, Signal Processing, Pattern Recognition, Machine Learning
- Guided and supervised the development of the company's proprietary disaggregation technology. Disaggregation takes a whole-house energy consumption trace (with granularity ranging from one second to one hour) and, without appliance-level sub-meters, determines how much energy each appliance in a given home consumes.
- Invented algorithms for detecting events in a home based on whole-house energy consumption.
- Guided the development of algorithms to estimate solar power generation for a given home based on weather data at its geographic location and a satellite image of that home.
Lead Scientist
2010 - 2014
GE Global Research
Technologies: R, Python, MATLAB, C, C++, Statistical Signal Processing, Machine Learning, Pattern Recognition
- Invented a novel algorithm for vocabulary compression that increased the efficiency of the downstream processing of the text corpora by machine learning algorithms; also patented.
- Led the development of a novel machine learning algorithm that makes alarms and beeps in an intensive care unit (ICU) more relevant by learning from past data; partially published and patented.
- Built a suite of algorithms for dynamically optimizing the control logic of a wind-turbine farm so that the power generated is maximized, wear and tear is minimized, and the noise produced stays well within the regulated upper limit; partially patented.
- Constructed algorithm prototypes to identify various objects in a CT scan and measure their shape and size. These prototypes were deployed in deep-learning-based image segmentation techniques.
- Developed scalable algorithms for summarizing a large corpus of unstructured text documents with heavy domain-specific jargon. For example, one such corpus was a set of around a million email chains capturing emails between the customer, field engineer, subject matter experts, and so on during the maintenance, service, and repair events on a big steam turbine.
Research Scientist
2007 - 2010
Zargis Medical
Technologies: MATLAB, Python, Signal Processing, Pattern Recognition, Machine Learning
- Built the company's proprietary algorithms to detect heart sounds from a stethoscope (audio waveform) and aid doctors in their diagnoses.
- Built algorithms based on Fourier and wavelet transforms for preprocessing the acoustic waveform from a stethoscope and transforming the signal into a representation suitable for machine learning.
- Built machine learning algorithms to differentiate the subtle third heart sound (S3) from other murmur-related sounds.
- Built a deep time-delay neural network toolkit for extracting events from multiple time-series data.
- Translated core algorithms from MATLAB to production-grade C++.
Graduate Research Assistant
2002 - 2007
Computer Science Department of University of North Carolina at Chapel Hill
Technologies: MATLAB, C, C++
- Assisted in the development of a new microscope for use by physicists. Used linear algebra, signal processing, mathematical modeling, and control theory to develop a novel high-resolution laser-interferometry-based tracking system.
- Developed a mathematical model of a 3D magnetic force exertion system to determine the directions of strong forces and validated the model using actual data.