Ashutosh Tripathi, Developer in Yokohama, Kanagawa Prefecture, Japan

Ashutosh Tripathi

Verified Expert in Engineering

Data Scientist and Developer

Yokohama, Kanagawa Prefecture, Japan
Toptal Member Since
September 14, 2021

Ashutosh is a data scientist with over four years of experience in artificial intelligence, data analytics, and software development, specializing in natural language processing, computer vision, and time-series analytics. He has worked in industries including advertising and marketing, insurance, IT, telecommunications, and software. Ashutosh has a demonstrated history of developing intelligent solutions and deploying them in production.


Preferred Environment

Python 3, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), Computer Vision, Time Series, Deep Learning, Artificial Intelligence (AI), Algorithms

The most amazing...

...projects: building an NLP engine for sentiment analysis, topic classification, and risk assessment and a solution to predict the network load in telecom cells.

Work Experience

Senior Machine Learning Engineer

2021 - PRESENT
  • Built an FAQ-based chatbot that retrieves the FAQs most relevant to the input text, using a weighted combination of similarity metrics (Euclidean distance, cosine similarity, Word Mover's Distance, and some custom metrics) to score the similarity between the input and each FAQ.
  • Used text and other inputs to detect fraudulent insurance claims (fraud detection), employing a supervised deep learning architecture to classify each claim as fraudulent or non-fraudulent.
  • Detected the sentiment of posts using their images in addition to the more common text features, classifying sentiment as positive, negative, or neutral (sentiment analysis with added image information).
Technologies: Artificial Intelligence (AI), Machine Learning, Computer Vision, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), TensorFlow, Python, PyTorch, Amazon Web Services (AWS)
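
The weighted combination of similarity metrics used for FAQ retrieval can be sketched roughly as follows. This is a minimal illustration, not the production system: it assumes the input and each FAQ have already been turned into fixed-length vectors, it combines only cosine similarity and a Euclidean-distance-based similarity (Word Mover's Distance and the custom metrics are left out), and the weights and the names `combined_score` and `rank_faqs` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def euclidean_similarity(a, b):
    """Map Euclidean distance in [0, inf) to a similarity in (0, 1]."""
    return 1.0 / (1.0 + math.dist(a, b))


def combined_score(query_vec, faq_vec, w_cos=0.7, w_euc=0.3):
    """Weighted sum of the two similarities (weights are illustrative)."""
    return (w_cos * cosine_similarity(query_vec, faq_vec)
            + w_euc * euclidean_similarity(query_vec, faq_vec))


def rank_faqs(query_vec, faq_vecs):
    """Return FAQ keys ordered from most to least similar to the query."""
    scored = [(combined_score(query_vec, vec), key) for key, vec in faq_vecs.items()]
    return [key for _, key in sorted(scored, reverse=True)]


# Toy vectors standing in for real text embeddings.
faq_vecs = {
    "billing": [1.0, 0.0, 0.0],
    "refund": [0.9, 0.1, 0.0],
    "shipping": [0.0, 1.0, 0.0],
}
ranking = rank_faqs([1.0, 0.0, 0.0], faq_vecs)
```

Converting each distance to a bounded similarity before weighting keeps the metrics on comparable scales, so no single metric dominates the combined score.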

Senior Data Scientist

2020 - 2021
  • Developed a brand index, using NLP and SNS data, to evaluate performance across customer satisfaction, growth, revenue, profit, competitors, and so on; used Python 3, TensorFlow, Keras, scikit-learn, spaCy, NLTK, MeCab, and more.
  • Constructed a Python 3 multi-touch attribution model that evaluates the performance of various campaigns, attributes, and customer actions and generates a metric for their overall effectiveness.
  • Implemented various statistical tests using Python 3 and several statistical libraries to prove or disprove various hypotheses relating to consumer behavior.
  • Built a Python 3 tool to perform sentiment analysis, text classification, and risk evaluation on various data types, including SNS, store, customer support, and consumer voice data; used TensorFlow, Keras, scikit-learn, spaCy, NLTK, MeCab, and more.
  • Used neural style transfer to design computer-generated campaign posters in different themes; developed in Python 3 with TensorFlow and Keras.
Technologies: Python 3, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), Computer Vision, Time Series Analysis, Docker, Databases, Statistics, TensorFlow, Keras, Scikit-learn, SpaCy, Python, SQL, Convolutional Neural Networks (CNN)
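
To illustrate what a multi-touch attribution model computes, here is a minimal sketch of linear (equal-credit) attribution. The profile does not say which attribution rule the actual model used, so the linear rule, the function name `linear_attribution`, and the sample journeys are all assumptions chosen for illustration.

```python
from collections import defaultdict


def linear_attribution(journeys):
    """Split each conversion's value equally across the touchpoints before it.

    journeys: list of (touchpoints, conversion_value) pairs, where
    touchpoints is the ordered list of channels a customer interacted with.
    Returns a dict mapping each channel to its total attributed value.
    """
    credit = defaultdict(float)
    for touchpoints, value in journeys:
        if not touchpoints:
            continue  # nothing to attribute the conversion to
        share = value / len(touchpoints)
        for channel in touchpoints:
            credit[channel] += share
    return dict(credit)


# Hypothetical customer journeys.
journeys = [
    (["email", "search", "social"], 90.0),
    (["search"], 30.0),
]
credit = linear_attribution(journeys)
```

Other rules (first-touch, last-touch, time-decay, or data-driven weights) change only how `share` is distributed across the touchpoints.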

AI Architect

2019 - 2020
Rakuten Mobile
  • Built a natural language engine that classifies and analyzes text data specifically by performing sentiment analysis, classifying it into predefined categories, evaluating risk, and highlighting important topics being discussed.
  • Created a capacity planning tool that uses time series forecasting and regression analysis to predict loads on the telecom network; it predicts loads at the single-cell, cluster, and network levels and then recommends either adding new cells or removing surplus ones.
  • Developed a classification module to predict whether customer churn will occur.
  • Built a real-time anomaly detection module based on time series; it detects anomalous behavior in thousands of KPIs in real time and alerts stakeholders when an anomaly is detected.
  • Developed a data science platform to perform data analytics, build AI solutions, and productize AI applications.
Technologies: Python 3, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Computer Vision, Docker, Kubernetes, TensorFlow, Deep Learning, Time Series, Statistics, Django, Flask, Recommendation Systems, Python, SQL, Convolutional Neural Networks (CNN)
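
One common way to flag anomalies in a streaming KPI is a rolling z-score, sketched below. The profile does not describe the detection method actually used, so this technique, the class name `RollingZScoreDetector`, and the window and threshold values are assumptions for illustration only.

```python
import statistics
from collections import deque


class RollingZScoreDetector:
    """Flags a value that lies far from the mean of the recent window."""

    def __init__(self, window=30, threshold=3.0):
        self.values = deque(maxlen=window)  # most recent observations
        self.threshold = threshold          # z-score above which we alert

    def update(self, value):
        """Score the new value against the window, then add it to the window."""
        is_anomaly = False
        if len(self.values) >= 2:
            mean = statistics.fmean(self.values)
            std = statistics.pstdev(self.values)
            if std > 0 and abs(value - mean) / std > self.threshold:
                is_anomaly = True
        # Every value is stored, including anomalous ones; a production
        # system might exclude them to avoid skewing the window.
        self.values.append(value)
        return is_anomaly


detector = RollingZScoreDetector(window=10, threshold=3.0)
stream = [10, 11, 10, 9, 10, 11, 10, 9, 10, 50, 10]
flags = [detector.update(v) for v in stream]
```

Running one detector instance per KPI keeps the per-update cost proportional to the window size, which matters when thousands of KPIs are monitored.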

Software Developer

2018 - 2019
  • Designed and developed the complete back end using Django, Flask, PostgreSQL, and MongoDB.
  • Developed a robust payment solution (a payment API) in Python 3 using Flask; it supports one-time payments, recurring subscription payments, notifications, and reminders.
  • Integrated various third-party APIs, including payment gateways, insurance providers, fitness brands, pharmaceutical companies, and credit providers.
Technologies: Python 3, Django, PostgreSQL, MongoDB, Docker, Kubernetes, Amazon Web Services (AWS), Flask, Python, SQL

Data Scientist

2017 - 2019
Samsung Research
  • Engineered a machine learning model, based on natural language processing and built using Python 3, to automatically assign new issues raised by the QA team to the right engineering team.
  • Developed an Android application to parse logs from mobile devices; it was built using Android Studio and Java and based on 3GPP references.
  • Made an Android application that can read and write the content of a USIM; it was built using Android Studio and Java and based on 3GPP references and the Android Telephony API.
Technologies: Python 3, Machine Learning, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), Statistics, Java, Long-term Evolution (LTE), 5G, TensorFlow, Scikit-learn, C++, Python, SQL

Natural Language Engine (NLP)

An NLP engine that performs various tasks, including sentiment analysis, text classification, risk analysis, and real-time dashboards. I built this artificial intelligence model using Python 3 and TensorFlow. I was the project owner and worked on the project alone.

The model is based on a deep recurrent neural network (bidirectional LSTM) with an attention mechanism. The model performs three tasks. First, it assigns a sentiment to the text, which can be negative, neutral, or positive. Second, it assigns a predefined category to the text (I cannot mention the categories due to a privacy agreement). Third, it assigns an associated risk level: low, medium, or high.

The model is then exposed as a REST API using Django and Flask. The model is also connected to a PostgreSQL database and a dashboard. The app is then containerized with Docker and deployed to a Kubernetes cluster.

Capacity Forecasting Engine

A time series forecasting engine for predicting the capacity of telecom radio cells. Forecasting future capacity is vital because when the load on telecom radio cells increases, the user experience degrades correspondingly, with slow data rates, dropped calls, and so on.

The goal was to better prepare for the future. I was the project owner and worked on the project alone.

The model is a combination of a regression model and a time series model. KPIs like data traffic and number of users are forecasted using a combination of models such as ARIMA, Holt-Winters, Facebook Prophet, and deep learning-based models. This is done for each individual cell.

I also built a regression model using recent data to predict KPIs like PRB utilization and user throughput from the KPIs used for forecasting. Capacity was calculated by combining the forecasting and regression models. The model is then converted into an application using a REST API, Docker, and Kubernetes.
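
The way a forecast and a regression combine into a capacity estimate can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual engine: Holt's linear-trend smoothing stands in for the ARIMA, Holt-Winters, Prophet, and deep learning models, a one-variable least-squares line stands in for the regression model, and the function names, smoothing parameters, utilization limit, and sample data are all hypothetical.

```python
def holt_forecast(series, horizon, alpha=0.5, beta=0.3):
    """Forecast `horizon` future points with Holt's linear-trend smoothing."""
    level, trend = series[0], series[1] - series[0]
    for value in series[1:]:
        prev_level = level
        level = alpha * value + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + (h + 1) * trend for h in range(horizon)]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def periods_until_overload(traffic, prb_util, prb_limit, horizon=12):
    """First forecast period whose predicted PRB utilization exceeds the limit."""
    slope, intercept = fit_line(traffic, prb_util)
    forecast = holt_forecast(traffic, horizon)
    for period, value in enumerate(forecast, start=1):
        if slope * value + intercept > prb_limit:
            return period
    return None  # no overload predicted within the horizon


# Hypothetical weekly history for one cell.
traffic = [100, 110, 120, 130, 140]   # data traffic
prb_util = [50, 55, 60, 65, 70]       # PRB utilization, in percent
overload_in = periods_until_overload(traffic, prb_util, prb_limit=82)
```

Separating the two steps means the forecasting model can be swapped per KPI while the regression that maps forecast KPIs to utilization stays unchanged.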

Book Review Sentiment Analysis

A sentiment analysis engine for predicting positive, negative, and neutral sentiments on book review data. I was the project owner and worked alone on the project.

I built two models. The first dealt with reviews of fewer than 250 words. The second was for reviews of more than 250 words. The first model was based on the XLNet transformer. The Hugging Face Transformers library was used to build and fine-tune the model. The model achieved an accuracy of 91%.

The second model was a Bi-LSTM built with TensorFlow and Keras. Each review was first tokenized into sentences, with each sentence fed to the LSTM as one time step. I used the Universal Sentence Encoder to obtain pre-trained sentence embeddings. This model achieved an accuracy of 90%.
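
Routing a review to one of the two models by length can be sketched as below. The function name `route_review` and the stub models are hypothetical, word counting by whitespace is an assumption, and since the description does not say which model handles a review of exactly 250 words, this sketch sends it to the long-review model.

```python
def route_review(text, short_model, long_model, limit=250):
    """Send the review to the short- or long-review model by word count.

    short_model and long_model are callables taking the review text and
    returning a sentiment label. A review of exactly `limit` words goes
    to long_model (an assumption; the boundary case is unspecified).
    """
    n_words = len(text.split())
    model = short_model if n_words < limit else long_model
    return model(text)


# Stub models standing in for the fine-tuned XLNet and Bi-LSTM models.
def short_stub(text):
    return "short-model"


def long_stub(text):
    return "long-model"


short_result = route_review("A lovely, quick read.", short_stub, long_stub)
long_result = route_review("word " * 300, short_stub, long_stub)
```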

Machine Translation

A natural language machine translation app designed to translate Hindi to English and vice versa. This was my final-year project in college. It is based on a sequence-to-sequence architecture.

It uses an encoder and decoder. The encoder and decoder are based on bi-directional LSTM. It also uses a global attention mechanism to better learn long-range dependencies and focus on more relevant inputs from sentences. The model was built using TensorFlow and Keras.
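
The core computation of a global attention step can be sketched in plain Python. This is a minimal illustration that assumes dot-product scoring between the current decoder state and every encoder state; the scoring function used in the actual project is not stated, and the function names and toy vectors are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def global_attention(decoder_state, encoder_states):
    """Attend over all encoder states for one decoder step.

    Returns (weights, context): one weight per source position, and the
    context vector formed as the weighted sum of the encoder states.
    """
    # Dot-product score between the decoder state and each encoder state.
    scores = [sum(d * e for d, e in zip(decoder_state, h)) for h in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [sum(w * h[i] for w, h in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context


# Toy example: three source positions with two-dimensional states.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = global_attention([2.0, 0.0], encoder_states)
```

Because the weights are computed over every source position, the decoder can draw on any part of the input sentence at each step, which is what helps with long-range dependencies.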

LSIL Detection

This is a biological cell image classification project. Low-grade squamous intraepithelial lesion (LSIL) is a common abnormal result on a Pap test. It's also known as mild dysplasia. LSIL means that your cervical cells show mild abnormalities.

The goal is to detect whether a cell is LSIL or normal. It's a binary classification task on image data. The dataset contained 10,000 labeled images, split evenly between LSIL and normal. The model was built using Keras and TensorFlow by fine-tuning a VGG16 model. The overall accuracy was 97%.

2020 - 2021

Post Graduation Program in AI-ML in Artificial Intelligence

McCombs School of Business, University of Texas at Austin - Remote

2013 - 2017

Bachelor's Degree in Computer Science

Indian Institute of Technology, Patna - Patna, India

2011 - 2013

High School Diploma in Physics, Chemistry, and Math

Bethany Convent School - Allahabad, India


AI-ML Post Graduate Certification.

McCombs School of Business, University of Texas at Austin


Mastering OCR using Deep Learning and OpenCV-Python



The Introduction to Quantum Computing

Saint Petersburg State University


Cutting-edge AI: Deep Reinforcement Learning in Python



Deep Learning: Advanced Computer Vision



Docker and Kubernetes: The Complete Guide



Tensorflow 2.0: Deep Learning and Artificial Intelligence



Taming Big Data with Apache Spark and Python — Hands On!



Python for Time Series Data Analysis



REST APIs, TensorFlow, Keras, Scikit-learn, OpenCV, LSTM, SpaCy, PyTorch


Python 3, Python, SQL, Java, C++, C


Docker, Kubernetes, Android, Amazon Web Services (AWS)


Spark, Django, Flask


PostgreSQL, Databases, MongoDB


Natural Language Processing (NLP), Computer Vision, Machine Learning, Time Series Analysis, Algorithms, Artificial Neural Networks (ANN), Generative Pre-trained Transformers (GPT), Statistics, Convolutional Neural Networks (CNN), Deep Learning, Time Series, Artificial Intelligence (AI), Long-term Evolution (LTE), 5G, Recommendation Systems, Quantum Computing, Operating Systems, Big Data, Deep Reinforcement Learning, Forecasting, ARIMA, Regression, Mathematics, Physics, Chemistry, XLNet, Sentiment Analysis, Machine Translation
