Lead Data Scientist
2022 - 2023, Binance
- Built a machine learning-based system to extract information from users' uploaded ID images to perform cheaper and faster KYC.
- Developed a social media monitoring system that detects upcoming trends, identifies and summarizes customer feedback, raises alerts for customer complaints, and identifies new coins that are gaining attention from users.
- Built a system for detecting fraudulent smart contracts based on the contract code and external factors such as the inflow and outflow of money, the website and the promised return, and the reputation of the founders on social media.
Technologies: Amazon SageMaker, Amazon EC2, Deep Learning, Natural Language Processing (NLP), Computer Vision, Predictive Modeling, Statistical Modeling

Applied Scientist
2020 - 2022, Amazon UK
- Worked on automated information extraction from structured and semi-structured web sources to populate Alexa's knowledge graph (KG).
- Built and published state-of-the-art approaches for extracting information from web tables and aligning them to our knowledge graph.
- Improved semantic question understanding and aggregate fact generation for Alexa.
Technologies: Python 3, PyTorch, Machine Learning, Natural Language Processing (NLP), CI/CD Pipelines

Senior NLP Research Scientist
2019 - 2020, MediaTek Research UK
- Developed an approach for natural language understanding on a device with various constraints such as memory and power.
- Developed algorithms for generating artificial data for training deep learning models that would otherwise require expensive and time-consuming labeled data collection processes.
- Created tools and scripts to simplify model training, graph plotting, and the transfer of scripts to GPU servers.
Technologies: Deep Learning, Natural Language Processing (NLP), PyTorch, Python 3

Senior Research Associate
2017 - 2019, Cochrane
- Created a state-of-the-art approach for identifying, from a long list of candidates, the (biomedical) scientific papers that are useful for a systematic review, with high recall and precision.
- Built a state-of-the-art machine learning algorithm for tagging biomedical paper abstracts with labels denoting the PICO (population, intervention, comparison, outcome) characteristics of the trial described in the paper.
- Developed APIs in Flask and Python that enable the SD teams at IoE-UCL and Cochrane to use SOTA text classification models in their workflow.
Technologies: Natural Language Processing (NLP), Deep Learning, PyTorch, Pandas, Flask, Python

Researcher
2014 - 2015, Yahoo! Labs
- Developed a new machine learning algorithm for user profile completion for inactive users with sparse user profiles using yahoo-news and yahoo-videos.
- Improved news and video recommendation for cold-start users, i.e., users who have liked or disliked very few items, using state-of-the-art recommendation system algorithms.
- Developed an approach for zero-shot (unseen) text classification to apply never-before-seen tags to URLs for bookmarking based on the contents of the webpage hosted at the URL.
Technologies: Recommendation Systems, Information Retrieval, MATLAB, Statistical Modeling

Software Engineer
2011 - 2012, vwo.com
- Served as a full-stack developer, building the UI and backend of the WYSIWYG website editing tool.
- Implemented data mining techniques in Python, such as user-session clustering and pattern mining, to extract insights from user session data.
- Created a new knowledge base for the company to reduce customer support requirements. Performed customer support for clients.
Technologies: Python, JavaScript, PHP