Ashutosh Tripathi
Verified Expert in Engineering
Data Scientist and Developer
Yokohama, Kanagawa Prefecture, Japan
Toptal member since September 14, 2021
Ashutosh is a data scientist with over four years of experience in artificial intelligence, data analytics, and software development, specializing in natural language processing, computer vision, and time-series analytics. He has worked in industries including advertising and marketing, insurance, IT, telecommunications, and software. Ashutosh has a demonstrated history of successfully developing intelligent solutions and deploying them in production.
Preferred Environment
Python 3, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), Computer Vision, Time Series, Deep Learning, Artificial Intelligence (AI), Algorithms
The most amazing...
...projects: building an NLP engine for sentiment analysis, topic classification, and risk assessment, and a solution to predict the network load in telecom cells.
Work Experience
Senior Machine Learning Engineer
Asurion
- Retrieved the FAQs most relevant to the input text (a FAQ-based chatbot), using a weighted combination of similarity metrics, such as Euclidean distance, cosine similarity, and Word Mover's Distance, plus some custom metrics, to score the similarity between the input and each FAQ.
- Detected fraudulent insurance claims from text and other inputs (fraud detection), employing a supervised deep learning architecture to classify claims as fraudulent or non-fraudulent.
- Detected the sentiment of posts using their images in addition to the more common text features, classifying each post as positive, negative, or neutral (sentiment analysis with added image information).
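As a rough illustration of the weighted similarity scoring described in the FAQ chatbot bullet, here is a minimal sketch. It assumes bag-of-words vectors and combines only cosine similarity and a Euclidean-based similarity; the weights, the tokenizer, and all function names are hypothetical, and Word Mover's Distance and the custom metrics from the actual project are not reproduced.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase whitespace tokenization into term counts (illustrative only)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def euclidean_similarity(a, b):
    """Map Euclidean distance into (0, 1] so that larger means more similar."""
    dist = math.sqrt(sum((a[t] - b[t]) ** 2 for t in a.keys() | b.keys()))
    return 1.0 / (1.0 + dist)


def combined_score(query, faq, weights=(0.7, 0.3)):
    """Weighted sum of the two similarities; the weights are assumed values."""
    q, f = bag_of_words(query), bag_of_words(faq)
    return weights[0] * cosine_similarity(q, f) + weights[1] * euclidean_similarity(q, f)


def best_faq(query, faqs, weights=(0.7, 0.3)):
    """Return the FAQ with the highest combined score for the query."""
    return max(faqs, key=lambda faq: combined_score(query, faq, weights))


if __name__ == "__main__":
    faqs = [
        "how do i reset my phone",
        "how do i file a claim",
        "where is my replacement device",
    ]
    print(best_faq("reset phone", faqs))
```

In practice the weights would be tuned on labeled query and FAQ pairs, and dense embeddings would likely replace the term counts.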
Senior Data Scientist
Relativ*
- Developed a brand index, using NLP and SNS data, to evaluate performance in customer satisfaction, growth, revenue, profit, competition, and more; used Python 3, TensorFlow, Keras, scikit-learn, spaCy, NLTK, MeCab, and more.
- Constructed a Python 3 multi-touch attribution model that evaluates the performance of various campaigns, attributes, and customer actions and generates a metric of their overall effectiveness.
- Implemented various statistical tests using Python 3 and several statistical libraries to prove or disprove various hypotheses relating to consumer behavior.
- Built a Python 3 tool to perform sentiment analyses, text classification, and risk evaluations on various data types, including SNS, store, customer support, and consumer's voice; used TensorFlow, Keras, scikit-learn, spaCy, NLTK, MeCab, and more.
- Used neural style transfer to design computer-generated posters for campaigns, using different themes; it was developed in Python 3, using TensorFlow and Keras.
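To make the multi-touch attribution idea above concrete, here is a minimal sketch of linear attribution, one of the simplest multi-touch schemes. The profile does not say which attribution scheme was used, so the equal-split rule, the data layout, and the function name are all assumptions.

```python
from collections import defaultdict


def linear_attribution(journeys):
    """Split each conversion value equally across the touchpoints that preceded it.

    journeys: list of (touchpoints, conversion_value) pairs, where touchpoints
    is the ordered list of channels a customer interacted with (assumed layout).
    """
    credit = defaultdict(float)
    for touchpoints, value in journeys:
        if not touchpoints:
            continue  # nothing to attribute the conversion to
        share = value / len(touchpoints)
        for channel in touchpoints:
            credit[channel] += share
    return dict(credit)


if __name__ == "__main__":
    journeys = [
        (["email", "search", "social"], 90.0),
        (["search"], 30.0),
    ]
    print(linear_attribution(journeys))
```

Other schemes, such as time-decay or position-based weighting, only change how `share` is computed per touchpoint.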
AI Architect
Rakuten Mobile
- Built a natural language engine that classifies and analyzes text data specifically by performing sentiment analysis, classifying it into predefined categories, evaluating risk, and highlighting important topics being discussed.
- Created a capacity planning tool that uses time series forecasting and regression analysis to predict loads on the telecom network; it predicts loads at a single cell cluster and network level and then recommends either adding new cells or removing extra ones.
- Developed a classification module to predict whether customer churn will occur.
- Built a real-time anomaly detection module based on time series; it detects anomalous behavior in thousands of KPIs in real time and alerts stakeholders when an anomaly is detected.
- Developed a data science platform to perform data analytics, build AI solutions, and productize AI applications.
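The real-time anomaly detection bullet above can be illustrated with a minimal streaming sketch. It assumes a rolling z-score rule, which is a common baseline and not necessarily the method used in the project; the class name, window size, and threshold are hypothetical.

```python
import statistics
from collections import deque


class RollingZScoreDetector:
    """Flag a KPI value that deviates strongly from its recent history."""

    def __init__(self, window=30, threshold=3.0):
        self.values = deque(maxlen=window)  # most recent observations only
        self.threshold = threshold

    def update(self, value):
        """Return True if `value` is anomalous relative to the current window."""
        is_anomaly = False
        if len(self.values) >= 2:
            mean = statistics.fmean(self.values)
            std = statistics.pstdev(self.values)
            if std > 0 and abs(value - mean) / std > self.threshold:
                is_anomaly = True
        self.values.append(value)
        return is_anomaly


if __name__ == "__main__":
    detector = RollingZScoreDetector(window=10, threshold=3.0)
    for reading in [10, 11, 10, 9, 10, 11, 10, 11, 50, 10]:
        if detector.update(reading):
            print(f"alarm: {reading}")
```

Monitoring thousands of KPIs would mean keeping one detector per KPI, for example in a dictionary keyed by KPI name.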
Software Developer
Vital
- Designed and developed the complete back end using Django, Flask, PostgreSQL, and MongoDB.
- Developed a robust payment solution (a payment API) in Python 3 using Flask; it supports one-time payments, recurring subscription payments, notifications, and reminders.
- Integrated various third-party APIs, including payment gateways, insurance providers, fitness brands, pharmaceutical companies, and credit providers.
Data Scientist
Samsung Research
- Engineered a machine learning model, based on natural language processing and built using Python 3, to automatically assign new issues raised by the QA team to the right engineering team.
- Developed an Android application to parse logs from mobile devices; it was built using Android Studio and Java and based on 3GPP references.
- Made an Android application that can read and write the content of a USIM; it was built using Android Studio and Java and based on 3GPP references and the Android Telephony API.
Experience
Natural Language Engine (NLP)
The model is based on a deep recurrent neural network (bidirectional LSTM) with an attention mechanism. It performs three tasks. First, it assigns a sentiment to the text: negative, neutral, or positive. Second, it assigns a predefined category to the text (I cannot mention the categories due to a privacy agreement). Third, it assigns an associated risk level: low, medium, or high.
The model is then exposed as a REST API using Django and Flask. It is also connected to a PostgreSQL database and a dashboard. The app is packaged as a Docker container and deployed to a Kubernetes cluster.
Capacity Forecasting Engine
The goal was to better prepare the network for future demand. I was the project owner and worked on the project alone.
The model is a combination of a regression model and a time series model. KPIs like data traffic and number of users are forecast using a combination of models such as ARIMA, Holt-Winters, Facebook Prophet, and deep learning-based models. This is done for each individual cell.
I also built a regression model on recent data to predict KPIs like PRB utilization and user throughput from the KPIs used for forecasting. Combining the forecasting and regression models, I calculated capacity. The model is then converted into an application using a REST API, Docker, and Kubernetes.
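The two-stage approach described above (forecast a demand KPI, then map it to a utilization KPI with a regression) can be sketched as follows. This is a simplified illustration under stated assumptions: Holt's linear-trend smoothing stands in for the ARIMA, Holt-Winters, and Prophet models, a single-variable least-squares line stands in for the regression model, and the 80% PRB limit, smoothing parameters, and function names are all hypothetical.

```python
def holt_forecast(series, horizon, alpha=0.5, beta=0.3):
    """Holt's linear-trend exponential smoothing; returns `horizon` future values."""
    level, trend = series[0], series[1] - series[0]
    for value in series[1:]:
        previous_level = level
        level = alpha * value + (1 - alpha) * (level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return [level + (step + 1) * trend for step in range(horizon)]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


def needs_more_capacity(traffic_history, prb_history, horizon=4, prb_limit=80.0):
    """Forecast traffic, map it to PRB utilization, and compare with a limit."""
    forecast = holt_forecast(traffic_history, horizon)
    slope, intercept = fit_line(traffic_history, prb_history)
    predicted_prb = [slope * t + intercept for t in forecast]
    return max(predicted_prb) > prb_limit, predicted_prb


if __name__ == "__main__":
    traffic = [100, 110, 120, 130, 140]  # made-up data traffic for one cell
    prb = [40, 44, 48, 52, 56]           # made-up PRB utilization in percent
    print(needs_more_capacity(traffic, prb, horizon=8))
```

Running the same check per cell and aggregating the results would give the cluster-level and network-level views.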
Book Review Sentiment Analysis
I built two models. The first dealt with reviews shorter than 250 words; the second handled reviews longer than 250 words. The first model was based on the XLNet transformer and was built and fine-tuned with the Hugging Face Transformers library. It achieved an accuracy of 91%.
The second model was a Bi-LSTM built with TensorFlow and Keras. Each review was first split into sentences, and Universal Sentence Encoder was used to get a pre-trained embedding for each sentence. The sentence embeddings were then fed to the LSTM as a sequence, one sentence per time step. This model achieved an accuracy of 90%.
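The routing between the two models could look roughly like the sketch below. It only covers the length check and the sentence splitting; the model names are placeholders, the regex splitter is a naive stand-in for whatever tokenizer was used, and sending a review of exactly 250 words to the long-review model is an assumption, since the text does not cover that case.

```python
import re

WORD_LIMIT = 250  # threshold stated in the project description


def split_sentences(review):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", review.strip())
    return [part.strip() for part in parts if part.strip()]


def route_review(review):
    """Choose a model by word count and prepare the input it expects."""
    if len(review.split()) < WORD_LIMIT:
        # Short reviews go to the transformer model as a single text.
        return "short_model", review
    # Long reviews become a sequence of sentences, one per LSTM time step.
    return "long_model", split_sentences(review)


if __name__ == "__main__":
    print(route_review("Great book. Loved it!"))
```

Each sentence returned for a long review would then be embedded before being passed to the Bi-LSTM.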
Machine Translation
The model uses an encoder and a decoder, both based on bidirectional LSTMs. It also uses a global attention mechanism to better learn long-range dependencies and focus on the most relevant parts of the input sentence. The model was built using TensorFlow and Keras.
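For readers unfamiliar with global attention, here is a minimal pure-Python sketch of one decoding step. It assumes dot-product scoring between the decoder state and every encoder state; the description does not say which scoring function the project used, and the function names and toy vectors are illustrative only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def global_attention(decoder_state, encoder_states):
    """Attend over all encoder states; return (weights, context vector)."""
    # Score every source position against the current decoder state.
    scores = [
        sum(d * e for d, e in zip(decoder_state, state)) for state in encoder_states
    ]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    # The context vector is the attention-weighted sum of the encoder states.
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


if __name__ == "__main__":
    encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    weights, context = global_attention([2.0, 0.0], encoder_states)
    print(weights, context)
```

Because every source position receives a weight at each step, the decoder can draw on distant words directly instead of relying only on the final encoder state.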
LSIL Detection
The goal is to detect whether a cell is LSIL or normal, which is a binary classification task on image data. The dataset contained 10,000 labeled samples in total, of which 50% were LSIL and 50% were normal. The model was built using Keras and TensorFlow, and a VGG16 model was used for fine-tuning. The overall accuracy was 97%.
Education
Postgraduate Program in Artificial Intelligence and Machine Learning (AI-ML)
McCombs School of Business, University of Texas at Austin - Remote
Bachelor's Degree in Computer Science
Indian Institute of Technology, Patna - Patna, India
High School Diploma in Physics, Chemistry, and Math
Bethany Convent School - Allahabad, India
Certifications
AI-ML Post Graduate Certification
McCombs School of Business, University of Texas at Austin
Mastering OCR using Deep Learning and OpenCV-Python
Udemy
The Introduction to Quantum Computing
Saint Petersburg State University
Cutting-edge AI: Deep Reinforcement Learning in Python
Udemy
Deep Learning: Advanced Computer Vision
Udemy
Docker and Kubernetes: The Complete Guide
Udemy
Tensorflow 2.0: Deep Learning and Artificial Intelligence
Udemy
Taming Big Data with Apache Spark and Python — Hands On!
Udemy
Python for Time Series Data Analysis
Udemy
Skills
Libraries/APIs
REST APIs, TensorFlow, Keras, scikit-learn, OpenCV, LSTM, spaCy, PyTorch
Tools
ARIMA
Languages
Python 3, Python, SQL, Java, C++, C
Platforms
Docker, Kubernetes, Android, Amazon Web Services (AWS)
Frameworks
Spark, Django, Flask
Storage
PostgreSQL, Databases, MongoDB
Other
Natural Language Processing (NLP), Computer Vision, Machine Learning, Time Series Analysis, Algorithms, Artificial Neural Networks (ANN), Generative Pre-trained Transformers (GPT), Statistics, Convolutional Neural Networks (CNN), Deep Learning, Time Series, Artificial Intelligence (AI), Long-term Evolution (LTE), 5G, Recommendation Systems, Quantum Computing, Operating Systems, Big Data, Deep Reinforcement Learning, Forecasting, Regression, Mathematics, Physics, Chemistry, XLNet, Sentiment Analysis, Machine Translation