Senior Data Scientist
2022 - 2023
Novo Nordisk
- Built time series forecasting models using SOTA deep learning algorithms like N-HiTS and N-BEATS, which outperformed traditional ARIMA and Holt-Winters ES models.
- Built a proprietary trial optimization algorithm to predict the end date of trials, which outperformed all the time series models.
- Built ensemble models for demand and sales forecasting.
Technologies: Python, Deep Learning, Time Series, Machine Learning, Azure Machine Learning, Databricks, Supply Chain Optimization

Machine Learning Developer
2022 - 2022
Zvoid
- Created a tweet listener that monitors tweets from a given list of authors and prepares the data for the decision engine.
- Built the automated trading capability using the Alpaca API.
- Developed the end-to-end analysis of a particular Twitter IPO hypothesis.
- Worked on the decision engine, which uses a random forest regressor that takes the tweet and the stock price as inputs and outputs a buy or sell recommendation for the stock.
Technologies: Machine Learning, Python, Quantitative Modeling, Quantitative Finance, Data Science

Senior Data Scientist
2020 - 2021
COGNIZER AI
- Developed a BERT-based conversational AI solution based on business requirements.
- Converted natural language queries into SQL queries using a BERT-based deep learning architecture.
- Contributed to significant parts of the back-end flow and took ownership of those flows.
- Extracted fields from contract PDFs using regex and deep learning models, and optimized the models with TensorRT to increase processing speed.
- Deployed the deep learning models to production using APIs and Docker, and used AWS and GCP to enable autoscaling.
Technologies: Natural Language Processing (NLP), Custom BERT, APIs, Python 3, Google Cloud Platform (GCP), Deep Learning, Amazon Web Services (AWS), Machine Learning Operations (MLOps), Flask, REST APIs, Docker, Autoscaling

Data Scientist | Researcher
2020 - 2020
Freelance
- Built data pipelines for data coming from multiple sources like the Quandl API and a SQL database.
- Performed an exploratory data analysis on the resulting dataset, derived insights, and presented them to the stakeholders using Jupyter Notebook and Tableau.
- Modeled the data using decision tree-based regression models.
Technologies: Amazon Web Services (AWS), Tableau, Jupyter Notebook, Redshift, NumPy, Pandas, Python, Data Science, Data Analytics, Statistical Analysis, Machine Learning, Git, Docker, Amazon EC2, APIs, Natural Language Processing (NLP), PostgreSQL, Jupyter, Python 3

CTO
2020 - 2020
WiseLike
- Competed at the IE Business School's startup lab and won the investors' choice award and the most innovative project award.
- Developed the whole machine learning pipeline from scratch, starting with a web scraper for pictures, extracting properties of a picture, and training the model using the data.
- Served the model using a REST API (Flask) on the website wiselike.pythonanywhere.com.
- Performed A/B and hypothesis testing to validate the model.
Technologies: Deep Learning, Computer Vision, NumPy, Pandas, Python, Machine Learning, Social Media Marketing (SMM), Websites, Scikit-learn, Flask

Quantitative Analyst
2013 - 2019
Futures First
- Performed an exploratory data analysis on large-scale financial datasets and derived insights that led to tradable strategies, using Python and visualizing data through dashboards in Tableau.
- Implemented a time series analysis (SARIMA and GARCH) of prices in commodity markets, considering CFTC reports and external factors like currency.
- Developed regression-based mean-reverting strategies in fixed-income markets of the US and Brazil.
- Deployed ETL and ML pipelines on GCP.
- Performed backtesting and forward testing of strategies by tracking their Sharpe ratios.
- Performed hypothesis testing and evaluated the risk for strategies based on Monte Carlo simulations and historical value at risk.
- Built natural language pipelines to track news sentiment.
Technologies: Google Cloud Platform (GCP), NumPy, Pandas, Python, Data Science, Data Analytics, Statistical Analysis, Machine Learning, Fixed-income Derivatives, Derivatives, Bloomberg API, Reuters Eikon, Git, Jupyter, Excel VBA

Research Intern
2012 - 2012
Next Sapiens
- Developed a novel 4D (four degrees of freedom) solution for the simultaneous localization and mapping of an unmanned aerial vehicle to reduce the computational cost, and published the research (ieeexplore.ieee.org/document/6461785).
- Combined location data from various sources like LIDAR, proximity sensors, inertial measurement units, and a camera using extended Kalman filters to update the state information of the robot.
- Developed a fuzzy logic-based PID controller for the unmanned aerial vehicle to maintain stability during flight.
Technologies: Embedded C, C++, MATLAB