Sanket Gupta
Verified Expert in Engineering
Data Scientist and Developer
Sanket has worked on several high-impact data science and machine learning projects, keeping business impact and product needs in mind. He is also a thought leader with a popular blog and a podcast. He is skilled in Python, Pandas, SQL, AWS, Keras, building APIs, and deploying machine learning models to the cloud. He is also proficient in statistical thinking and A/B testing techniques. He is a Certified AWS Developer.
Preferred Environment
Jupyter, PyCharm, GitHub, MacOS
The most amazing...
...project I've built helped large fashion retailers build machine learning models that give accurate size recommendations to customers shopping online.
Work Experience
Data Scientist
Y Combinator Company
- Created a spelling correction system for the search experience, which resulted in $800k in incremental annual revenue.
- Implemented a machine learning system to automatically categorize 3M+ products based on their descriptions.
- Built a system that uses statistical methods to detect anomalies in pricing data.
- Used spelling correction, aggregations, filtering, matching, and other advanced search capabilities of Elasticsearch, such as edge n-grams. Helped get the most out of the system and suggested search improvements.
- Developed category intent prediction algorithm.
- Cultivated an A/B testing practice, including the use of hypothesis testing.
- Analyzed user activity data including search logs and click data to build analytics tooling.
- Developed a recommendation engine to find alternate products that are cheaper and better.
- Built a Python Flask web app to collect training data for search relevance. This system helped improve product ranking and click-through rates.
- Built systems for marketing analytics including cohort analysis.
Design Engineer
Marvell Technology Group
- Created multiple data analysis tools that analyzed data from circuits and systems.
- Built systems to predict performance of different circuits and systems.
- Used statistical thinking to analyze system failures and implement ideas on how to fix them.
- Presented design ideas to large audiences.
- Developed product-thinking skills and learned to consider the needs of a large user base.
Financial Data Analyst Intern
Credit Suisse
- Analyzed large financial datasets on stock markets and dividend performance.
- Supported some of Credit Suisse's large customers with their portfolio performance reporting.
Marketing Data Analyst Intern
Exxon Mobil
- Analyzed marketing data for an Exxon Mobil division and supported the team in making business decisions.
- Presented to management on how Exxon Mobil can direct and target customers.
Experience
Large Data Analysis Projects
Host of The Data Life Podcast
https://podcasts.apple.com/us/podcast/the-data-life-podcast/id1453716761
Calorie Tracker Single Page Responsive Web App
Mining Twitter Data for Sentiment Analysis of Events
Machine Learning Language Classifier from Written Scripts
https://github.com/sanketg10/language-identifier-nlp
It uses a cascade of a linear classifier followed by a neural network. The first stage detects whether the language has a Roman script or not, which determines the character n-grams that the system builds for training. The neural network that follows can detect the exact script from a written language based on features from the first stage.
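The routing idea described above can be illustrated with a minimal sketch. This is not the code from the linked repository: a simple Unicode-name heuristic stands in for the trained first-stage linear classifier, the neural network stage is omitted, and the function names, the 0.5 threshold, and the n-gram sizes (trigrams for Roman scripts, single characters otherwise) are all assumptions made for illustration.

```python
from collections import Counter
from typing import Tuple
import unicodedata


def is_roman_script(text: str, threshold: float = 0.5) -> bool:
    """Stage-1 stand-in: True if most letters in `text` are Latin-script.

    A heuristic replaces the trained linear classifier here (assumption).
    """
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return False
    latin = sum(1 for ch in letters if "LATIN" in unicodedata.name(ch, ""))
    return latin / len(letters) >= threshold


def char_ngrams(text: str, n: int) -> Counter:
    """Count overlapping character n-grams of length `n`."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def extract_features(text: str) -> Tuple[bool, Counter]:
    """Route the text by script, then build n-gram features for stage 2."""
    roman = is_roman_script(text)
    # Assumed sizes: trigrams for Roman scripts, single characters otherwise.
    n = 3 if roman else 1
    return roman, char_ngrams(text.lower(), n)


if __name__ == "__main__":
    for sample in ["Hello world", "नमस्ते"]:
        roman, features = extract_features(sample)
        print(sample, roman, features.most_common(3))
```

In a full system, the n-gram counts would be vectorized and passed to the second-stage neural network. The script decision is returned alongside the features so that the second stage can use it.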
Recommendation Engine Using Collaborative Filtering and SVD
Image Classification System using CNNs
Video Course on Fundamentals of Data Science
Skills
Languages
Python, SQL, Go, GraphQL, Visual Basic for Applications (VBA), C++
Frameworks
Flask
Libraries/APIs
Pandas, Keras, Natural Language Toolkit (NLTK), REST APIs, Scikit-learn, Vue, NumPy, TensorFlow, React
Tools
PyCharm, Amazon SageMaker, ELK (Elastic Stack), GitHub, Jupyter, GoLand, Microsoft Excel, Amazon Elastic Container Service (Amazon ECS), Amazon Elastic Container Registry (ECR)
Paradigms
Data Science, Object-oriented Programming (OOP)
Platforms
Jupyter Notebook, MacOS, Amazon Web Services (AWS), Linux, Amazon EC2, AWS Lambda
Storage
PostgreSQL, SQLite, MySQL, Amazon S3 (AWS S3), Elasticsearch
Other
Statistics, Machine Learning, Deep Learning, Recurrent Neural Networks (RNNs), Data Analysis, Data Mining, Artificial Intelligence (AI), Convolutional Neural Networks (CNN), Amazon Comprehend, Amazon API Gateway
Education
Master's Degree in Engineering
Columbia University - New York
Bachelor's Degree in Engineering
Nanyang Technological University - Singapore
Certifications
AWS Certified Cloud Practitioner
Amazon
Natural Language Processing Certificate
Udacity
Deep Learning Certificate
Coursera