Aditya Andra
Verified Expert in Engineering
Analyst and Developer
Hyderabad, Telangana, India
Toptal member since June 8, 2020
Aditya is a developer with experience building machine learning and statistical models with large-scale data sets on cloud platforms using the latest big data technologies. Thanks to master's degrees from the IE Business School and IIT (ISM) Dhanbad, Aditya has a solid understanding of data science in various business scenarios. He is also a former quantitative researcher specializing in time-series and machine learning-based strategies and risk models in financial markets.
Preferred Environment
Machine Learning, Python, Git, Jupyter, Data Science
The most amazing...
...project I've developed is a custom trial optimization model for a pharmaceutical company, which outperformed all existing ML models.
Work Experience
Senior Data Scientist
Novo Nordisk
- Built time series forecasting models using SOTA deep learning algorithms like N-HiTS and N-BEATS, which outperformed traditional ARIMA and Holt-Winters ES models.
- Built a proprietary trial optimization algorithm to predict the end date of trials, which outperformed all the time series models.
- Built ensemble models for demand and sales forecasting.
Machine Learning Developer
Zvoid
- Created a tweet listener that tracks tweets from a given list of authors and prepares the data for the decision engine.
- Built the automated trading capability using the Alpaca API.
- Developed the end-to-end analysis of a particular Twitter IPO hypothesis.
- Worked on the decision engine using a random forest regressor that accepts the tweet and the stock price and gives out a stock buying or selling recommendation.
Senior Data Scientist
COGNIZER AI
- Developed a BERT-based conversational AI solution based on business requirements.
- Converted natural language queries into SQL queries using BERT-based deep-learning architecture.
- Contributed to significant parts of the back-end flow and took ownership of those flows.
- Extracted various fields from contract PDFs using regex and deep learning models and optimized the models to increase processing speed using TensorRT.
- Put the DL models into production using APIs and Docker. Used AWS and GCP to enable autoscaling features.
Data Scientist | Researcher
Freelance
- Built data pipelines for data coming from multiple sources like the Quandl API and a SQL database.
- Performed an exploratory data analysis on the resulting dataset, derived insights, and presented them to stakeholders using Jupyter Notebook and Tableau.
- Modeled the data using decision tree-based regression models.
CTO
WiseLike
- Competed at the IE Business School's startup lab and won the investors' choice award and the most innovative project award.
- Developed the whole machine learning pipeline from scratch, starting with a web scraper for pictures, extracting properties of a picture, and training the model using the data.
- Served the model using a REST API (Flask) on the website wiselike.pythonanywhere.com.
- Performed A/B and hypothesis testing to test the validity of the model.
Quantitative Analyst
Futures First
- Performed an exploratory data analysis on large-scale financial datasets and derived insights that led to tradable strategies, using Python and visualizing data through dashboards in Tableau.
- Implemented a time series analysis (SARIMA and GARCH) of prices in commodity markets, considering CFTC reports and external factors like currency.
- Developed regression-based mean-reverting strategies in fixed-income markets of the US and Brazil.
- Deployed ETL pipelines and ML pipelines working on GCP.
- Performed backtesting and forward testing of strategies by tracking their Sharpe ratios.
- Performed hypothesis testing and evaluated the risk for strategies based on Monte Carlo simulations and historical value at risk.
- Built natural language pipelines to track news sentiment.
Research Intern
Next Sapiens
- Developed a novel 4D (degrees of freedom) solution for the simultaneous localization and mapping of an unmanned aerial vehicle to reduce the computation cost and published the research (ieeexplore.ieee.org/document/6461785).
- Combined location data from various sources like LIDAR, proximity sensors, inertial measurement units, and camera using extended Kalman filters to update the state information of the robot.
- Developed a fuzzy logic-based PID controller for the unmanned aerial vehicle to maintain stability during flight.
Experience
Churn Prediction for a Book Publisher
https://github.com/adia4/Churn-publisher/blob/master/datathon-final.ipynb
Stock Suggestions | Distributed System with PySpark
https://github.com/adia4/Financial-Analysis/blob/master/Spark-Financial_data_Analysis.ipynb
Word Recommendation System for Movie and Series Reviews
SQL Database for North American Oil and Gas and Visualization through Tableau
Machine Learning Model to Suggest Better Pictures for Social Media
Generating Insights in Stock Market Data
Predicting the Probability of a Default of a Company to Make Loan Decisions
https://github.com/MBD-RiskandFraud/fintech_platform_ie
Live Tweet Sentiment Tracking
Cancer Prediction Using VOC Data
https://github.com/adia4/voc_cancer_prediction
VOC database with labeled cancer data. The results are deployed using a Flask API which predicts the kind of cancer based on the VOC content.
Sales Forecast Model for FMCG, Taking the COVID Scenario Into Account
Time Series Forecasting
End-to-end NLP Model Deployment
Built APIs to allow its interaction with external modules.
Dockerized the whole application.
Connected it with AWS and GCP solutions like Lambda, container registry, etc., to achieve autoscaling of the API.
Education
Accelerated General Management Program in General Management
IIM Ahmedabad - Ahmedabad
Master's Degree in Business Analytics and Big Data
IE Business School - Madrid, Spain
Bachelor of Technology Degree in Electrical Engineering
Indian Institute of Technology (ISM), Dhanbad - Dhanbad, India
Certifications
Data Engineering, Big Data, and Machine Learning on GCP Specialization
Coursera
Certification in Quantitative Finance
Fitch Learning
Skills
Libraries/APIs
Pandas, NumPy, Scikit-learn, Keras, REST APIs, Spark ML, TensorFlow, Natural Language Toolkit (NLTK), Spark Streaming, Bloomberg API, PySpark
Tools
ARIMA, Tableau, DataViz, Spark SQL, Jupyter, Git, MATLAB, Bloomberg, Reuters Eikon, Azure Machine Learning
Languages
Python, SQL, R, C++, Python 3, SAS, Embedded C, Excel VBA
Paradigms
Functional Programming, Quantitative Research, Object-oriented Programming (OOP), ETL, Linear Programming
Frameworks
Flask, Spark, Apache Spark
Platforms
Linux, Amazon Web Services (AWS), Windows, Amazon EC2, Apache Kafka, Google Cloud Platform (GCP), Docker, Pentaho, Jupyter Notebook, Databricks
Storage
MongoDB, Redshift, NoSQL, PostgreSQL, Azure SQL Databases, MySQL, Databases
Industry Expertise
Social Media
Other
Quantitative Modeling, Statistics, Finance, Time Series, Mathematics, Natural Language Processing (NLP), Financial Modeling, Machine Learning, Time Series Analysis, Risk Modeling, Automated Trading Software, Statistical Analysis, Data Science, Quantitative Analysis, Predictive Analytics, Data Analysis, Data Analytics, Statistical Modeling, Regression Modeling, Hypothesis Testing, Data Visualization, Forecasting, Deep Learning, Generative Pre-trained Transformers (GPT), Recommendation Systems, Multivariate Statistical Modeling, Data Modeling, Machine Learning Operations (MLOps), Data Warehousing, Algorithms, Dashboards, Web Scraping, Neural Networks, Computer Vision, Data Warehouse Design, Websites, Social Media Marketing (SMM), A/B Testing, Trading, Data Engineering, Big Data, APIs, Derivatives, Decision Trees, Signal Analysis, Custom BERT, Quantitative Finance, General Management, Supply Chain Optimization, Autoscaling, Gunicorn