Gaurav Singh
Verified Expert in Engineering
Machine Learning Engineer and Developer
Gaurav is a talented machine learning and NLP scientist with a Ph.D. from University College London. Gaurav's research focused on information extraction from unstructured text using deep learning, mainly under scarce training data constraints. He sped up convergence in training deep neural networks, improving generalization and robustness to adversarial noise, and developed an automated approach for finding promising materials from the scientific literature for making energy devices.
Preferred Environment
Amazon Web Services (AWS), Deep Learning, NumPy, Git, Pandas, Scikit-learn, PyTorch, Python 3, Azure, SQL
The most amazing...
...thing I've developed as an ML/NLP scientist is an automated approach for finding promising materials from the scientific literature for making energy devices.
Work Experience
Deep Learning LLM Scientist and Deployment Specialist
OctoML
- Audited the platform that was built to allow people to use LLMs—both public and private—via easy-to-use and quick-to-set-up APIs.
- Tracked bugs and reported them to the company so they could be fixed.
- Tested various features of the Octoml.ai website.
SageMaker Expert
SimpliCapital LLC
- Analyzed the customer's problem and developed a strategy for solving it under the given constraints. Clarified the requirements and fixed major issues in the company's previous solution.
- Built a new state-of-the-art model after extensive experimentation that met the customer's expectations. Worked with the engineering team to fix AWS architectural issues so the model could run in production without significant delays.
- Prepared the results to be presented to the investors of the company.
Lead Data Scientist
Binance
- Built a machine learning-based system to extract information from users' uploaded ID images to perform cheaper and faster KYC.
- Developed a social media monitoring system that could detect upcoming trends, identify and summarize customer feedback, create alerts for customer complaints, and identify new coins gaining attention from users.
- Built a system for detecting fraudulent smart contracts based on the contract code and external factors such as the inflow and outflow of money, the website and promised return, and the founders' reputation on social media.
Data Scientist
Helium Billboard Partners, LLC
- Developed a machine learning model to identify where to deploy a given Helium node to maximize the payout under various resource constraints.
- Built a pipeline to extract useful information from blockchain and various other sources, such as Google Maps API for geolocation data.
- Presented the results in weekly meetings to management and others.
Applied Scientist
Amazon UK
- Worked on automated information extraction from structured and semi-structured web sources to populate Alexa's knowledge graph (KG).
- Built and published state-of-the-art approaches for superior information extraction from web tables and aligning them to our knowledge graph.
- Improved semantic question understanding and aggregate fact generation for Alexa.
Senior NLP Research Scientist
MediaTek Research UK
- Developed an approach for natural language understanding on a device with various constraints such as memory and power.
- Developed algorithms for generating artificial data for training deep learning models that would otherwise require expensive and time-consuming labeled data collection processes.
- Created tools and scripts to simplify model training, graph plotting, and the transfer of scripts to GPU servers.
Senior Research Associate
Cochrane
- Created a state-of-the-art approach for identifying, with high recall and precision, the biomedical scientific papers in a long candidate list that are useful for a systematic review.
- Built a state-of-the-art machine learning algorithm for tagging biomedical paper abstracts with labels denoting the PICO (population, intervention, comparison, outcome) characteristics of the trial described in the paper.
- Developed APIs in Flask and Python to let the SD teams at IoE-UCL and Cochrane use SOTA text classification models in their workflow.
Researcher
Yahoo! Labs
- Developed a new machine learning algorithm for user profile completion for inactive users with sparse user profiles using Yahoo News and Yahoo Videos.
- Improved news and video recommendation for cold-start users, i.e., users who have liked or disliked very few items, with state-of-the-art recommendation system algorithms.
- Developed an approach for zero-shot (unseen) text classification to apply never-before-seen tags to URLs for bookmarking based on the contents of the webpage hosted at the URL.
Software Engineer
vwo.com
- Served as a full-stack developer building the UI and backend of the WYSIWYG website editing tool.
- Implemented data mining techniques in Python to extract insights from user session data such as user-session clustering and pattern mining.
- Created a new knowledge base for the company to reduce customer support requirements. Performed customer support for clients.
Experience
Relation Extraction using Explicit Context Conditioning
https://arxiv.org/abs/1902.09271
Constructing Artificial Data for Fine-tuning for Low-resource Biomedical Text Tagging
https://arxiv.org/abs/1910.09255
Structured Multi-label Biomedical Text Tagging via Attentive Neural Tree Decoding
https://arxiv.org/abs/1810.01468
Skills
Languages
Python 3, Python, SQL, Python 2, JavaScript, C++, PHP, Snowflake
Libraries/APIs
PyTorch, Scikit-learn, Matplotlib, NumPy, PySpark, Pandas
Paradigms
Data Science, Agile, Management, Compiler Design, Object-oriented Programming (OOP)
Platforms
Linux, Jupyter Notebook, Amazon Web Services (AWS), Azure, Amazon EC2, Blockchain
Storage
Database Management Systems (DBMS), SQLite, Amazon S3 (AWS S3), Databases
Other
Deep Learning, Natural Language Understanding (NLU), Natural Language Processing (NLP), Scientific Data Analysis, Machine Learning, Statistics, Data Mining, Information Retrieval, Recommendation Systems, Artificial Intelligence (AI), Generative Pre-trained Transformers (GPT), Machine Learning Operations (MLOps), Data Modeling, Model Development, Amazon Machine Learning, Team Leadership, Predictive Modeling, Leadership, Software Development, Web Programming, Algorithms, Data Structures, Solution Architecture, Cloud, Computer Vision, CI/CD Pipelines, Statistical Modeling, Parallelization, Large Data Sets, Feature Engineering, Regression Modeling, Classification Algorithms, Big Data, Data Visualization, Data Analysis, Data Analytics, Geospatial Analytics
Frameworks
Flask, Hadoop, Spark
Tools
Git, MATLAB, Amazon SageMaker
Education
Ph.D. in Natural Language Processing
University College London - London, England
Master's Degree in Machine Learning
Pierre and Marie Curie University - Paris
Engineer's Degree in Computer Science
Delhi College of Engineering - Delhi
Certifications
Architecting on AWS
Amazon Web Services