Gaurav Singh, Machine Learning Developer in London, United Kingdom

Member since October 7, 2020
Gaurav is a talented machine learning and NLP scientist with a Ph.D. from University College London. His research focused on information extraction from unstructured text using deep learning, mainly under scarce training data constraints. He sped up convergence in training deep neural networks, improving generalization and robustness to adversarial noise, and developed an automated approach for finding promising materials for energy devices in the scientific literature.


  • Binance
    Amazon SageMaker, Amazon EC2, Deep Learning...
  • Amazon UK
    Python 3, PyTorch, Machine Learning, Natural Language Processing (NLP)...
  • MediaTek Research UK
    Deep Learning, Natural Language Processing (NLP), PyTorch, Python 3



London, United Kingdom



Preferred Environment

Amazon Web Services (AWS), Deep Learning, NumPy, Git, Matplotlib, Pandas, Scikit-learn, PyTorch, Python 3, Linux

The most amazing...

...thing I've developed as an ML/NLP scientist is an automated approach for finding promising materials for energy devices in the scientific literature.


  • Lead Data Scientist

    2022 - 2023
    Binance
    • Built a machine learning-based system that extracts information from users' uploaded ID images to make KYC cheaper and faster.
    • Developed a social media monitoring system that detects upcoming trends, identifies and summarizes customer feedback, raises alerts for customer complaints, and identifies new coins that are getting attention from users.
    • Built a system for detecting fraudulent smart contracts based on the contract code and external factors such as the inflow and outflow of money, the website and the promised return, and the reputation of the founders on social media.
    Technologies: Amazon SageMaker, Amazon EC2, Deep Learning, Natural Language Processing (NLP), Computer Vision, Predictive Modeling, Statistical Modeling
  • Applied Scientist

    2020 - 2022
    Amazon UK
    • Automated information extraction from structured and semi-structured web sources to populate Alexa's knowledge graph (KG).
    • Built and published state-of-the-art approaches for extracting information from web tables and aligning it to our knowledge graph.
    • Improved semantic question understanding and aggregate fact generation for Alexa.
    Technologies: Python 3, PyTorch, Machine Learning, Natural Language Processing (NLP), CI/CD Pipelines
  • Senior NLP Research Scientist

    2019 - 2020
    MediaTek Research UK
    • Developed an approach for on-device natural language understanding under constraints such as memory and power.
    • Developed algorithms for generating artificial data for training deep learning models that would otherwise require expensive and time-consuming labeled data collection processes.
    • Created tools and scripts for model training, graph plotting, and transferring scripts to GPU servers.
    Technologies: Deep Learning, Natural Language Processing (NLP), PyTorch, Python 3
  • Senior Research Associate

    2017 - 2019
    • Created a state-of-the-art approach for identifying, from a long list, the (biomedical) scientific papers that are useful for a systematic review, with high recall and precision.
    • Built a state-of-the-art machine learning algorithm for tagging biomedical paper abstracts with labels denoting the PICO (population, intervention, comparison, outcome) characteristics of the trial described in the paper.
    • Developed APIs in Flask and Python that enable the SD teams at IoE-UCL and Cochrane to use SOTA text classification models in their workflow.
    Technologies: Natural Language Processing (NLP), Deep Learning, PyTorch, Pandas, Flask, Python
  • Researcher

    2014 - 2015
    Yahoo! Labs
    • Developed a new machine learning algorithm for user profile completion for inactive users with sparse user profiles using yahoo-news and yahoo-videos.
    • Improved news and video recommendation for cold-start users, i.e., users who have liked or disliked very few items, using state-of-the-art recommendation system algorithms.
    • Developed an approach for zero-shot (unseen) text classification to apply never-before-seen tags to URLs for bookmarking based on the contents of the webpage hosted at the URL.
    Technologies: Recommendation Systems, Information Retrieval, MATLAB, Statistical Modeling
  • Software Engineer

    2011 - 2012
    • Served as a full-stack developer building the UI and backend of the WYSIWYG website editing tool.
    • Implemented data mining techniques in Python to extract insights from user session data such as user-session clustering and pattern mining.
    • Created a new knowledge base for the company to reduce customer support requirements. Performed customer support for clients.
    Technologies: Python, JavaScript, PHP


  • Relation Extraction using Explicit Context Conditioning

    Relation Extraction (RE) aims to label relations between groups of marked entities in raw text. Most current RE models learn context-aware representations of the target entities used to establish a relation between them. This works well for intra-sentence RE and we call them first-order relations. However, this methodology can sometimes fail to capture complex and long dependencies. To address this, we hypothesize that at times, two target entities can be explicitly connected via a context token. We refer to such indirect relations as second-order relations and describe an efficient implementation for computing them. These second-order relation scores are then combined with first-order relation scores. Our empirical results show that the proposed method leads to state-of-the-art performance over two biomedical datasets.
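
    A minimal sketch of how first-order and second-order scores might be combined, not the published implementation. It assumes a toy matrix of pairwise first-order scores, takes the second-order score of a (head, tail) pair to be the sum over context tokens k of score(head, k) * score(k, tail), and mixes the two with a hypothetical weight `alpha`. The function names and all values are illustrative only.

```python
def second_order_scores(first_order):
    """Score each (head, tail) pair through every intermediate context token k.

    Assumption: the indirect score is sum_k first_order[h][k] * first_order[k][t],
    which is the matrix product of the first-order score matrix with itself.
    """
    n = len(first_order)
    return [
        [sum(first_order[h][k] * first_order[k][t] for k in range(n)) for t in range(n)]
        for h in range(n)
    ]


def combine(first_order, second_order, alpha=0.5):
    """Weighted sum of direct and indirect scores (alpha is a hypothetical knob)."""
    n = len(first_order)
    return [
        [alpha * first_order[h][t] + (1 - alpha) * second_order[h][t] for t in range(n)]
        for h in range(n)
    ]


# Toy scores for three tokens: 0 = head entity, 1 = context token, 2 = tail entity.
# Head and tail are weakly related directly, but both relate strongly to token 1.
first = [
    [0.0, 0.9, 0.1],
    [0.0, 0.0, 0.8],
    [0.0, 0.0, 0.0],
]
second = second_order_scores(first)
combined = combine(first, second)
```

    In this toy case the direct head-tail score is low, while the path through the context token raises the combined score. Writing the aggregation as a matrix product is one way such scores can be computed for all pairs at once.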

  • Constructing Artificial Data for Fine-tuning for Low-resource Biomedical Text Tagging

    Biomedical text tagging systems are plagued by the dearth of labeled training data. There have been recent attempts at using pre-trained encoders to deal with this issue. A pre-trained encoder provides a representation of the input text, which is then fed to task-specific layers for classification. The entire network is fine-tuned on the labeled data from the target task. Unfortunately, a low-resource biomedical task often has too few labeled instances for satisfactory fine-tuning. Also, if the label space is large, it contains few or no labeled instances for the majority of labels. Most biomedical tagging systems treat labels as indexes, ignoring the fact that these labels are often concepts expressed in natural language, e.g., "Appearance of a lesion on brain imaging." To address these issues, we proposed constructing extra labeled instances using label-text (i.e., label's name) as input for the corresponding label-index (i.e., label's index). In fact, we proposed a number of strategies for manufacturing multiple artificial labeled instances from a single label.
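
    The idea of pairing a label's text with its index can be sketched as follows. This is an illustration under assumptions, not the published method: the function name is hypothetical, the "drop one word" variant is a made-up example of a strategy that yields several instances per label, and the label "Stroke" is invented for the demo.

```python
def make_artificial_instances(label_names):
    """Turn each label's name into extra (text, label_index) training pairs.

    Strategy 1: use the full label name as the input text.
    Strategy 2 (illustrative assumption): for names of three or more words,
    also emit every variant with exactly one word dropped.
    """
    instances = []
    for index, name in enumerate(label_names):
        instances.append((name, index))
        words = name.split()
        if len(words) >= 3:
            for skip in range(len(words)):
                variant = " ".join(w for i, w in enumerate(words) if i != skip)
                instances.append((variant, index))
    return instances


# The first label comes from the text above; the second is a hypothetical label.
labels = ["Appearance of a lesion on brain imaging", "Stroke"]
extra = make_artificial_instances(labels)
```

    Each pair in `extra` could then be appended to the real labeled data before fine-tuning, so that labels with few or no real instances still receive some training signal.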

  • Structured Multi-label Biomedical Text Tagging via Attentive Neural Tree Decoding

    We proposed a model for tagging unstructured texts with an arbitrary number of terms drawn from a tree-structured vocabulary (i.e., an ontology). We treated this as a special case of sequence-to-sequence learning. The decoder begins at the root node of an ontological tree and recursively elects to expand child nodes as a function of the input text, the current node, and the latent decoder state. In our experiments, the proposed method outperformed state-of-the-art approaches on the important task of automatically assigning MeSH terms to biomedical abstracts.
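
    The recursive expansion from the root can be sketched as below. This is a simplified illustration, not the proposed model: the scoring function is a keyword-match placeholder standing in for the neural decoder (it ignores the latent decoder state), the threshold is a hypothetical parameter, and the tiny ontology is invented rather than taken from MeSH.

```python
def decode(node, text, ontology, score, threshold=0.5):
    """Start at `node` and recursively expand children whose score clears the threshold."""
    selected = []
    for child in ontology.get(node, []):
        if score(text, child) >= threshold:
            selected.append(child)
            # Only descend into children that were themselves selected.
            selected.extend(decode(child, text, ontology, score, threshold))
    return selected


def keyword_score(text, term):
    """Placeholder scorer: 1.0 if the term occurs in the text, else 0.0."""
    return 1.0 if term.lower() in text.lower() else 0.0


# Hypothetical toy ontology mapping each node to its children.
ontology = {
    "ROOT": ["Diseases", "Chemicals"],
    "Diseases": ["Neoplasms", "Infections"],
    "Neoplasms": ["Carcinoma"],
}

abstract = "Diseases such as neoplasms, including carcinoma, were studied."
tags = decode("ROOT", abstract, ontology, keyword_score)
pruned = decode("ROOT", "Carcinoma only.", ontology, keyword_score)
```

    Because a subtree is explored only when its parent is selected, the number of scored nodes depends on the selected paths rather than on the size of the whole vocabulary. In the second call nothing is returned, since the parent nodes above "Carcinoma" are never expanded.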


  • Languages

    Python 3, Python, SQL, Python 2, JavaScript, C++, PHP, Snowflake
  • Libraries/APIs

    PyTorch, Scikit-learn, Matplotlib, NumPy, Pandas
  • Paradigms

    Data Science, Agile, Management, Compiler Design, Object-oriented Programming (OOP)
  • Platforms

    Linux, Jupyter Notebook, Amazon Web Services (AWS), Amazon EC2
  • Storage

    Database Management Systems (DBMS), SQLite, Databases
  • Other

    Deep Learning, Natural Language Understanding (NLU), Natural Language Processing (NLP), Scientific Data Analysis, Machine Learning, Statistics, Data Mining, Information Retrieval, Recommendation Systems, Artificial Intelligence (AI), Team Leadership, Predictive Modeling, Software Development, Web Programming, Algorithms, Data Structures, Solution Architecture, Cloud, Computer Vision, CI/CD Pipelines, Statistical Modeling
  • Frameworks

    Flask, Hadoop, Spark
  • Tools

    Git, MATLAB, Amazon SageMaker


  • Ph.D. in Natural Language Processing
    2015 - 2019
    University College London - London, England
  • Master's Degree in Machine Learning
    2012 - 2014
    Pierre and Marie Curie University - Paris
  • Engineer's Degree in Computer Science
    2007 - 2011
    Delhi College of Engineering - Delhi


  • Architecting on AWS
    Amazon Web Services
