
Claudio S. De Mutiis
Verified Expert in Engineering
Data Science and Machine Learning Developer
London, United Kingdom
Toptal member since January 21, 2021
Claudio is a senior data scientist with experience in stakeholder management, recruitment, and line management. He is proficient in supervised, unsupervised, and reinforcement learning, including deep learning and neural networks. He has worked on several applications of machine learning and AI, including NLP and computer vision. Claudio has extensive industry experience, including retail/eCommerce, media, high-tech, startups, insurance, and healthcare.
Experience
- Computer Vision - 9 years
- Data Science - 9 years
- SQL - 8 years
- Python - 8 years
- Machine Learning - 8 years
- Natural Language Processing (NLP) - 7 years
- Keras - 5 years
- Scikit-learn - 5 years
Preferred Environment
Amazon Web Services (AWS), Python, MATLAB, Jupyter Notebook, Snowflake, Keras, TensorFlow, Scikit-learn, Pandas, SQL
The most amazing...
...thing I've built and trained is a convolutional neural network for end-to-end driving in a simulator using Keras.
Work Experience
Senior Data Scientist
MINORO LTD.
- Spearheaded a sales forecasting project for a large client in the finance space.
- Trained and evaluated overall sales forecasting models by region.
- Trained and evaluated sales forecasting models tailored to specific business partners.
Machine Learning Engineer
Bilt Technologies Inc
- Contributed to fine-tuning a GPT-4 model for customer benefit personalization.
- Built a vector store POC to be used as part of a retrieval-augmented generation (RAG) system using LlamaIndex and ChromaDB.
- Built a vector store in Google BigQuery, which allows querying vector embeddings and location metadata, including latitude and longitude, at the same time.
- Performed a merchant and benefit data analysis and built a dashboard with Evidence to identify weaknesses and opportunities for improvements.
Data Science Lead
Carrier Global
- Managed a team of three data scientists and one data engineer. Greatly improved the data science team's workflow, internal communication, and knowledge sharing.
- Spearheaded a large-scale data migration from Amazon S3 files and other data sources to Snowflake, which made data investigations more efficient, increased data transparency, and empowered data science and analytics initiatives.
- Oversaw the delivery of a POC for detecting energy outliers by comparing each user with other users who have similar characteristics. A regularly running service delivered the output to production so the consumer app team could pick it up and use it.
- Led the delivery of a POC to forecast user energy consumption based on various features, including weather forecast, temperature setpoints, and hardware characteristics.
Data Scientist
Sema Technologies, Inc
- Brainstormed with another data scientist and wrote documentation on how to distinguish code written entirely by AI from AI-blended code (i.e., partly human and partly AI).
- Wrote Python parsers and scripts to gather source code datasets from GitHub and process them into a format suitable for our model. Those datasets were split into human-written and generative AI code for binary classification.
- Researched various sources of model overfitting and reported them to the data science team and my manager.
Lead Data Scientist
Wilmington plc
- Researched and developed forecasting models for the gross written premiums of several insurance lines across the globe (Axco Insurance).
- Researched and developed predictive models for patient numbers for different therapy areas and drugs across the UK (Wilmington Healthcare).
- Contributed to a proof of concept involving the use of large language models (LLMs) for grading and student feedback (Wilmington Training and Education).
- Collaborated with the innovation committee at Wilmington.
Lead Data Scientist
Winnow
- Managed and mentored a team of data scientists. Led strategic projects/insights.
- Supported the development of computer vision models.
- Improved and automated the workflow and processes.
- Improved coding practice and reviewed existing projects' code.
- Managed the Agile way of working and sprint planning within the data science team. Also managed annotation and data quality.
Senior Data Scientist
Sky UK
- Developed and implemented customer churn models for the business.
- Collaborated on a real-time machine learning proof of concept involving anomaly detection on hub telemetry data.
- Interviewed and recruited data scientist candidates and fulfilled line management responsibilities.
Senior Data Scientist
Integral Solutions, Inc.
- Investigated the pros and cons of using different NBA APIs.
- Wrote scripts to retrieve NBA data, process it, and store it on S3.
- Performed feature engineering using team stats and other handcrafted features coming from historical NBA matches.
- Designed, validated, and tested a deep neural network model to predict NBA winners and losers as well as the winning probabilities.
- Collaborated with a software engineer to put the NBA prediction model in production for the first MVP of the project.
- Achieved market-leading accuracy for predicting NBA match outcomes.
- Wrote documentation, introduced unit tests, and suggested future developments for the project and actions that could further improve the existing model.
Senior Data Scientist
Notonthehighstreet Enterprises Ltd
- Managed a topic classification NLP project using convolutional neural networks and word embeddings to be used by the partners and operations/customer service team.
- Led a deep neural network recommender system project that produced valuable customer segmentation insights for the product and curation team.
- Collaborated with the digital marketing team to increase the effectiveness of marketing and advertising campaigns as well as SEO.
- Managed a competitor analysis project that provided insights to the executive team.
- Improved data science workflow and coding practices.
- Redesigned the data science recruitment from scratch.
- Managed, guided, and mentored a mid-level data scientist.
- Built a product bundles graph to visualize insights on products frequently bought together.
- Documented data science projects on a data team wiki.
- Managed a multi-touch digital marketing attribution project using a Markov chain.
Data Scientist
Notonthehighstreet Enterprises Ltd
- Worked on an NLP semantic search project using word embeddings in collaboration with tech and other product stakeholders.
- Built predictive models to evaluate our business partners' success, providing actionable insights to the partners and operations team.
- Engaged and built relationships with senior stakeholders throughout the business.
- Worked with the product and curation team on a project to rank external trends and social media influencers and posts, which led to the development of a web app that made the team's job easier.
- Contributed to creating and introducing a data team learning and development culture.
- Placed an NLP project in production to detect a set of specific items in messages that business partners sent to customers. The results were used as actionable insights by the partners and operations team and included in a weekly report.
- Documented data science projects on a data team wiki.
Data Scientist
Mindi Technologies Ltd
- Wrote Python scripts to analyze 36 features of DigitalOcean server data, such as droplets_cpu_stime, droplets_cpu_utime, droplets_network_rxbytes, and droplets_network_txbytes.
- Worked with server droplets of nine different sizes (512 MB, 1 GB, 2 GB, 4 GB, 8 GB, 16 GB, 32 GB, 48 GB, and 64 GB).
- Tried to infer server and droplet power usage from the datasets provided by DigitalOcean.
AI Researcher
King's College London
- Worked on a project that was part of a collaboration between researchers in artificial intelligence, telecommunications, and environmental sciences. The project was carried out in partnership with Transport for London (TfL) and Ericsson.
- Used artificial intelligence planning to contribute to the design of the next generation of intelligent urban traffic controls (i.e., AI-controlled traffic lights, speed limits, and route planning).
- Visited the TfL operational center and learned about the SCOOT system. Learned about traffic systems used in other major cities around the world.
- Studied the papers written by some of the world's most prominent research groups on traffic optimization.
- Used traffic simulation tools such as SUMO (Simulation of Urban MObility) and PTV Vissim to simulate congestion scenarios in London.
- Wrote Python scripts that were part of the framework used to interface the DINO AI planner and SUMO.
- Attended the 26th International Conference on Automated Planning and Scheduling (ICAPS 2016) in London.
- Guided and mentored a couple of students in the Master of Science degree program.
Experience
Advanced Lane Finding
Vehicle Detection and Tracking
Using Deep Learning to Clone Driving Behavior
Traffic Sign Classification
Local Odometry Techniques for a Differential Wheeled Robot
• Wrote a library of high-level odometry functions (i.e., Java methods) to allow MIRTO to perform actions such as rotating, translating, and moving towards a specific point in space while avoiding all obstacles on the way.
• Developed a navigation algorithm that uses only MIRTO's wheel encoders and bumper sensors.
• Used MIRTO to test the newly developed navigation algorithm.
• Discussed the results and observations in my MSc project's dissertation.
Predicting Boston Housing Prices
Education
Master of Science Degree in Robotics
King’s College London - London, United Kingdom
Bachelor of Arts Degree in Physics
Boston University - Boston, MA, USA
Bachelor of Arts Degree in Economics
Boston University - Boston, MA, USA
Certifications
Natural Language Processing with Probabilistic Models
DeepLearning.AI | via Coursera
Natural Language Processing with Classification and Vector Spaces
Coursera
Practical Time Series Analysis
The State University of New York | via Coursera
Machine Learning: Clustering & Retrieval
Coursera
Machine Learning
Stanford University via Coursera
Machine Learning Specialization
University of Washington via Coursera
Machine Learning: Classification
Coursera
Machine Learning: Regression
Coursera
Skills
Libraries/APIs
Pandas, Scikit-learn, TensorFlow, Keras, OpenCV, Matplotlib, NumPy, OpenAI API, Camera API
Tools
MATLAB, LaTeX, Slack, Jira, GitHub, Jupyter, Git, BigQuery, ARIMAX, Google Analytics, Tableau, Ansible, Jenkins, Mesos, Confluence, ChatGPT, ARIMA, Prefect
Languages
SQL, Snowflake, Java, Python, Python 3, C++, C, Fortran, R
Paradigms
Real-time Systems, Anomaly Detection, Agile, MapReduce, Continuous Integration (CI)
Platforms
Jupyter Notebook, Amazon Web Services (AWS), Docker, Amazon EC2, Google Cloud Platform (GCP), Citrix, SharePoint, Kubernetes, Kubeflow
Storage
Relational Databases, MySQL, Amazon S3 (AWS S3), Google Cloud Storage
Industry Expertise
Banking & Finance, Retail & Wholesale, Marketing
Frameworks
LlamaIndex, Streamlit
Other
Linear Algebra, Calculus, Differential Equations, Computational Physics, Quantum Physics, EM Waves, Mechanics, Physics, Microeconomics, Macroeconomics, Game Theory, Labor, Complex Variables, Theoretical Physics, Computer Vision, Artificial Intelligence (AI), Multi-agent Systems, Biotechnology, Sensors & Actuators, Pattern Recognition, Regression, Classification, Information Retrieval, Expectation-Maximization (EM), Natural Language Processing (NLP), Word Embedding, Reinforcement Learning, Self-driving Cars, Color Grading, SVMs, Convolutional Neural Networks (CNNs), Robotics, Navigation, Visual Odometry, Data Science, Machine Learning, Conversion Rate, Data Analysis, Data Reporting, Algorithms, Recommendation Systems, Document Processing, Neural Networks, Unsupervised Learning, Dimensionality Reduction, Data Analytics, eCommerce, Supervised Learning, Data, Big Data, Writing & Editing, Documentation, Data Modeling, Statistics, Deep Learning, Image Recognition, Object Tracking, APIs, Data Preprocessing, Text Analytics, Classification Algorithms, Deep Neural Networks (DNNs), Predictive Modeling, Data Processing, Minimum Viable Product (MVP), Feature Engineering, Churn Analysis, Real-time Data, Supervised Machine Learning, Autoencoders, Google BigQuery, Sprint Planning, Monetary Policy, Clustering, Environment, Mathematics, Generative Pre-trained Transformers (GPT), Forecasting, Data Visualization, Frameworks, Data Scientist, Regression Modeling, Quantitative Analysis, OpenAI GPT-3 API, Language Models, Search Engines, Data Wrangling, Statistical Data Analysis, Computer Vision Algorithms, Data Cleansing, Leadership, OpenAI, Datasets, Calibration, Google SEO, Optical Character Recognition (OCR), Chatbots, Large Language Models (LLMs), Generative Artificial Intelligence (GenAI), CHRONOS, Statistical Analysis, Mode Analytics, Economics, Planning, Time Series, Time Series Analysis, Machine Translation, Locality-Sensitive Hashing, Sentiment Analysis, Vector Space Models, 
Linear Regression, Ridge Regression, Lasso Regression, Logistic Regression, Decision Trees, K-means Clustering, K-D Tree, Word2Vec, Parts-of-Speech Tagging, POS, N-gram Language Models, Continuous Bag of Words (CBOW), Autocomplete, Clustering Algorithms, Energy Management, Data Migration, K-nearest Neighbors (KNN), Exploratory Data Analysis, LangChain, ChromaDB, OpenAI GPT-4 API, LLM Fine-tuning, Evidence, Vector Stores, Retrieval-augmented Generation (RAG), Random Forest