Shashwat Khanna, Developer in Delhi, India

Shashwat Khanna

Verified Expert in Engineering

Data Scientist and Software Developer

Location
Delhi, India
Toptal Member Since
May 19, 2022

Shashwat is a seasoned professional with almost ten years of work experience in core data science. He has rich experience designing, developing, and deploying machine learning models for clients across the banking, financial services, insurance, retail, eCommerce, and healthcare sectors. Shashwat is currently in charge of end-to-end product analytics for Shopify's recently launched product.

Portfolio

Freelancer
Python, Spark, OpenAI GPT-3 API, OpenAI GPT-4 API...
Shopify
SQL, Python 3, PySpark, Dimensional Modeling, Kimball Methodology, Data Science...
Clara Analytics
Predictive Analytics, GPT, Generative Pre-trained Transformers (GPT)...

Availability

Part-time

Preferred Environment

Amazon Web Services (AWS), Spark, Windows, Ubuntu, R, Python 3

The most amazing thing I've developed is a large-scale advertising keyword income prediction model that helped an aggregator achieve profit lifts of around 20%.

Work Experience

Data Scientist (Freelance)

2022 - PRESENT
Freelancer
  • Worked on several ML and BI engagements across sectors. Created complex models using classical ML, advanced LLMs, and NLP embeddings. Handled several clients in parallel.
  • Created several POCs. Worked with both paid and open-source LLMs. Currently exploring the use of LLMs for different use cases.
  • Successfully managed and improved large-scale deployed models. Doubled as both a data engineer and a data scientist.
  • Designed and implemented a large-scale pipeline for handling years of mobility data (around 500 GB per day) and derived inferences using geospatial intelligence methods.
Technologies: Python, Spark, OpenAI GPT-3 API, OpenAI GPT-4 API, Natural Language Processing (NLP), Google AdWords, Web Marketing, Generative Pre-trained Transformer 3 (GPT-3), Regular Expressions, Language Models, Version Control, Git, OpenAI, Custom Models, Front-end, Data Scraping, Recommendation Systems, Web Scraping, ChatGPT, Architecture, Integration, SQL, Team Mentoring, Supervised Learning, Deep Learning, Unsupervised Learning, Prompt Engineering

Senior Data Scientist

2021 - PRESENT
Shopify
  • Handled product analytics for the newly launched product and was in charge of data events instrumentation, data models, analytics dashboards for internal users, and user-facing analytics.
  • Used a combination of PySpark, DBT/SQL, and data visualization tools. Worked closely with multidisciplinary teams such as product managers, UI/UX experts, developers, and senior leadership to develop a data roadmap.
  • Oversaw the beta and GTM launch for the product, providing internal stakeholders with key insights into product usage and adoption and driving product roadmap and prioritization.
  • Defined the experiments, KPIs, and guardrail metrics for a newly launched Shopify plan.
Technologies: SQL, Python 3, PySpark, Dimensional Modeling, Kimball Methodology, Data Science, Data Reporting, ETL Tools, Google BigQuery, A/B Testing, Product Analytics, Key Performance Metrics, Dashboards, Streamlit, Machine Learning, Key Performance Indicators (KPIs), Exploratory Data Analysis, Python, Pandas, Data Analysis, Data Visualization, Reports, Data Analytics, Data Mining, Data Modeling, Google Analytics, REST APIs, Big Data, Scikit-learn, Google Sheets, Office 365, APIs, API Integration, Project Management, eCommerce, Analytics, Business Analysis, Marketplaces, Statistics, Predictive Learning, Google Cloud Platform (GCP), MySQL, Product Development, ETL, Data Pipelines, Data Cleaning, Large Data Sets, Artificial Intelligence (AI), Data Engineering, BigQuery, Business Intelligence (BI), Regular Expressions, Version Control, Git, Data Scraping, Team Mentoring, Supervised Learning, Unsupervised Learning

Senior Data Scientist

2019 - 2021
Clara Analytics
  • Led a team of data scientists and engineers to develop products with a strong focus on NLP, using rule-based methods and deep learning with neural networks (RNNs, LSTMs, and autoencoders) built in Keras.
  • Managed the architecting and development of the organization-level NLP stack using Spark and Spark NLP.
  • Played a key role in creating methodologies for value quantification for clients.
  • Created a document processing pipeline that extracted key information to help insurance adjusters analyze medical histories.
Technologies: Predictive Analytics, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), GPT, LSTM Networks, RStudio Shiny, Spark, R, Python 3, Data Science, Data Reporting, Key Performance Indicators (KPIs), PySpark, Amazon Web Services (AWS), Spark ML, Machine Learning, Key Performance Metrics, Product Analytics, ETL Tools, Flask-RESTful, Logistic Regression, Linear Regression, Exploratory Data Analysis, Forecasting, Python, Pandas, RStudio, Data Analysis, Data Visualization, Reports, Data Analytics, Data Mining, REST APIs, Big Data, Scikit-learn, Google Sheets, Office 365, APIs, API Integration, Project Management, Data Extraction, Analytics, Business Analysis, Statistics, Predictive Learning, Data Pipelines, Data Cleaning, OCR, Artificial Intelligence (AI), Data Engineering, Business Intelligence (BI), Regular Expressions, Language Models, Version Control, Git, Data Scraping, Architecture, Integration, SQL, Team Mentoring, Supervised Learning, Deep Learning, Unsupervised Learning

Senior Data Scientist

2013 - 2019
64 Squares Private Limited
  • Led and completed 20+ ML and analytics assignments for clients across sectors such as banking, insurance, retail, and eCommerce, and across geographies including the US, UK, and Australia.
  • Worked closely with techniques ranging from traditional statistical models, like linear and logistic regression, to advanced predictive models, such as random forests and gradient boosting.
  • Gained extensive experience productionizing ML models using RESTful APIs, batch processes, and database integration.
Technologies: Forecasting, Exploratory Data Analysis, Predictive Analytics, Chatbots, Linear Regression, Logistic Regression, Random Forests, Gradient Boosting, XGBoost, R, Python 3, Data Science, Machine Learning, Data Reporting, Flask-RESTful, Predictive Modeling, Python, Pandas, RStudio, Data Analysis, Data Visualization, Reports, Data Analytics, Data Mining, Data Modeling, REST APIs, Big Data, Scikit-learn, Google Sheets, Office 365, APIs, API Integration, Time Series Analysis, Project Management, Data Extraction, eCommerce, Analytics, Business Analysis, Statistics, Predictive Learning, Docker, MySQL, ETL, Data Pipelines, Data Cleaning, Large Data Sets, Artificial Intelligence (AI), Data Engineering, BigQuery, Business Intelligence (BI), Google Data Studio, Regular Expressions, nbdev, Version Control, Git, Data Scraping, Recommendation Systems, Architecture, Integration, SQL, Team Mentoring, Supervised Learning, Deep Learning, Unsupervised Learning

Senior Consultant

2012 - 2012
Deloitte
  • Worked in the strategy and operations group with a focus on the healthcare and life sciences sector.
  • Worked on growth opportunity/expansion, business plans, impact evaluations, and feasibility studies for various clients.
  • Worked with various clients, including state and central governments, hospital chains, large business conglomerates, and bilateral funding and donor agencies.
Technologies: Consulting, Microsoft PowerPoint, Financial Modeling, Strategy, Public Health, Public Policy, Office 365

Large-scale Keyword Income Prediction Model

I worked on a large-scale real-time keyword income prediction model that acted as a key input for the client's Google AdWords bidding system. I was responsible for the entire system's design, development, and implementation, and the project lasted for around one year.

Towards the end of the project, the client realized profit lifts of around 20% compared with the existing models.

Large-scale Medical Records Processing Engine

I designed and developed a large-scale engine for processing medical records and extracting information and insights from them. This information was presented to insurance adjusters, reducing their review time by 50%. The engine was developed using Spark NLP.

Generic Prediction Model for Brick and Mortar Retail Stores

I created a generic model for predicting sales of products sold in brick and mortar retail stores. The model was built and tested on data from two stores but was scalable and generic enough to be trained on thousands of stores spanning various retail chains.

Chatbot for a Large B2B Aggregator

I created a buyer-seller messaging assistant and chatbot that recommended the next message to send, increasing engagement rates on the platform by around 10%. The latency of the RESTful API was less than 100 milliseconds.

Languages

SQL, Python, Python 3, R

Libraries/APIs

Pandas, REST APIs, Scikit-learn, XGBoost, Spark ML, Google AdWords, PySpark, Flask-RESTful, TensorFlow, Keras

Tools

Git, Google Sheets, BigQuery, Microsoft PowerPoint, Google Analytics

Paradigms

Data Science, Key Performance Metrics, Business Intelligence (BI), ETL, Dimensional Modeling, Kimball Methodology, Distributed Computing

Platforms

RStudio, Jupyter Notebook, Amazon Web Services (AWS), Linux, Ubuntu, Windows, Google Cloud Platform (GCP), Docker

Industry Expertise

Project Management

Storage

MySQL, Data Pipelines

Other

Natural Language Processing (NLP), Predictive Analytics, Exploratory Data Analysis, Forecasting, Data Reporting, Google BigQuery, Product Analytics, Dashboards, Key Performance Indicators (KPIs), Machine Learning, Unstructured Data Analysis, Predictive Modeling, eCommerce, Analytics, Data Analysis, Data Visualization, Reports, Data Analytics, Data Mining, Office 365, APIs, API Integration, Data Extraction, Business Analysis, Statistics, Predictive Learning, Classification, Regression, Data Cleaning, Artificial Intelligence (AI), OpenAI GPT-3 API, OpenAI GPT-4 API, Language Models, nbdev, Version Control, ChatGPT, Team Mentoring, Supervised Learning, Unsupervised Learning, Econometrics, Finance, Gradient Boosting, Random Forests, Chatbots, ETL Tools, A/B Testing, Data Modeling, Big Data, Time Series Analysis, Deep Learning, Product Development, GPT, Generative Pre-trained Transformers (GPT), Large Data Sets, Generative Pre-trained Transformer 3 (GPT-3), Regular Expressions, OpenAI, Data Scraping, Architecture, Integration, Prompt Engineering, Time Series, LSTM Networks, Logistic Regression, Linear Regression, Feature Engineering, Clustering, Pipelines, OCR, Monitoring, Consulting, Financial Modeling, Strategy, Public Health, Public Policy, Marketplaces, Neural Networks, Modeling, Data Engineering, Google Data Studio, Web Marketing, Custom Models, Front-end, Recommendation Systems, Web Scraping

Frameworks

Spark, RStudio Shiny, Streamlit, Flask

Education

2009 - 2011

Master's Degree in Economics

Indira Gandhi Institute of Development Research - Mumbai, India

2004 - 2007

Bachelor's Degree in Physics

University of Delhi - Delhi, India
