Guillaume Ferry, Developer in Paris, France

Guillaume Ferry

Data Scientist and Developer

Paris, France

Toptal member since August 22, 2017

Bio

As an experienced machine learning practitioner (Kaggle expert), data engineer, and architect, Guillaume builds AI applications and systems for clients. He develops statistical models, ETL jobs, APIs, and basic front ends, and researches new solutions and approaches for algorithm scaling.

Portfolio

Gigafactory in France
Azure DevOps, Computer Vision, Retrieval-augmented Generation (RAG)...
Startup for Beverages Management in France
PostgreSQL, SQLAlchemy, Amplitude, AWS Lambda, Flask, Amazon EC2, Elasticsearch...
BCG
Amazon SageMaker, Docker, Data Build Tool (dbt), Snowflake, TensorFlow...

Experience

  • SQL - 10 years
  • Scikit-learn - 10 years
  • Python - 10 years
  • Amazon Web Services (AWS) - 5 years
  • Flask - 5 years
  • TensorFlow - 5 years
  • Spark - 5 years
  • Tableau - 2 years

Preferred Environment

Amazon Web Services (AWS), Jupyter, SQL, Python, Linux, Flask

The most amazing...

...thing I've done was implement a vector database POC with search capabilities in 2018.

Work Experience

Data Project Manager

2024 - 2025
Gigafactory in France
  • Served as a project manager/product owner on AI and computer vision projects.
  • Led a team of 3-7 research data scientists in an Azure/Databricks environment.
  • Developed a POC for data compression and prepared a preliminary dataset for an LLM RAG application.
Technologies: Azure DevOps, Computer Vision, Retrieval-augmented Generation (RAG), Anomaly Detection, Data Science, Large Language Models (LLMs)

Senior Data Engineer

2024 - 2024
Startup for Beverages Management in France
  • Eliminated all memory overflows in the application's back end.
  • Improved the application's search engine (Elasticsearch).
  • Implemented Slack alerts based on AWS logs and developed with PostgreSQL/SQLAlchemy.
Technologies: PostgreSQL, SQLAlchemy, Amplitude, AWS Lambda, Flask, Amazon EC2, Elasticsearch, Pandas

Data Scientist, Global Marketing

2019 - 2021
BCG
  • Deployed two fully customized recommender system APIs for BCG.com.
  • Implemented A/B testing and benchmarked against Amazon Personalize.
  • Contributed to the website data migration and marketing Tableau dashboards.
Technologies: Amazon SageMaker, Docker, Data Build Tool (dbt), Snowflake, TensorFlow, Scikit-learn, Pandas, Data Science, Amazon API Gateway, Amazon DynamoDB

Freelance Data Engineer and Architect

2018 - 2019
Johnson & Johnson
  • Set up the data lake architecture to host source data.
  • Connected the data lake to an IoT platform through an API.
  • Wrote transformation and loading scripts for feeding an application database.
Technologies: Spark, AWS Lambda, AWS Glue

Freelance Data Lab Data Scientist | Architect

2018 - 2018
Johnson & Johnson
  • Set up the back-end architecture to turn a machine learning POC into an MVP.
  • Developed the ETL for retrieving and updating public data sources.
  • Created a search engine API based on the Word2Vec model (state of the art in NLP at the time).
Technologies: Amazon Web Services (AWS), PostgreSQL, Flask, Scikit-learn, Gensim, Word2Vec, Python, Machine Learning, Natural Language Processing (NLP)

Freelance Senior Data Scientist

2017 - 2018
Credit Insurance Company (via Toptal)
  • Conducted data exploration with Python and Jupyter.
  • Reviewed the company strategy with the founders after extracting insights from the data.
  • Set up starter machine learning models with Jupyter.
Technologies: Scikit-learn, Jupyter, Python

Analytics Consultant

2017 - 2017
Reinsurance Company in Paris
  • Designed the data architecture of the business service.
  • Developed POC/MVP applications in Qt/R, including an ETL tool.
  • Promoted data science and machine learning approaches.
Technologies: Microsoft SQL Server, R, Qt

Manager | Data Scientist

2015 - 2016
A Global Management Consulting and Professional Services Company
  • Developed an optimization prototype for a clearing bank in Python/CPLEX, in partnership with MIT.
  • Architected and modeled data for a new business service.
  • Developed a mobile payment application, which involved coordinating data lake specifications, defining the data strategy, and prototyping reports.
  • Worked on the data science commercial offer for the banking sector.
Technologies: Tableau, SQL, Scikit-learn, CPLEX, Python

RTB Data Scientist

2015 - 2015
Twenga
  • Built predictive models.
  • Worked with machine learning.
  • Conducted A/B testing in Python.
Technologies: Scikit-learn, SQL, Python

Developer of Real-time Prediction of Train Delays

2014 - 2014
SNCF (French Railways) | R&D, Statistics, Econometrics, and Data-mining
  • Developed a Python prototype for real-time prediction of train delays, covering everything from the IT architecture to coding and integration.
Technologies: Scikit-learn, MySQL, Python

Developer (Prototype Pushing Personalized Press Articles)

2013 - 2014
Ownpage | Full-year Scholar Project
  • Coded a multithreaded Java crawler, Lucene indexer, and retriever.
  • Installed and maintained the production environment.
Technologies: Apache, MySQL, Ruby on Rails (RoR), JRuby, Ubuntu, Apache Lucene, Multithreading, Java

Transformation Program Analyst for Euler Hermès

2011 - 2013
Business Consulting Company
  • Acted as the roll-out coordinator in Germany for one year, covering project management, training, and technical/functional advising.
  • Designed the functional architecture and consulted on IT for a subsidiary roll-out (business needs, functional specifications, parameter setup).
  • Reworked the business processes of contract management departments of the subsidiaries (Hamburg, London, Rome, Brussels, and Paris) with a Lean-Six Sigma approach.
Technologies: Microsoft PowerPoint, Microsoft Excel

Technical Business Analyst

2007 - 2011
BNP Paribas Cardif
  • Mapped back-office processes and implemented improvements.
  • Developed dashboards and Excel, Access, and VBA tools.
  • Mapped processes and implemented indicators and controls for subsidiaries to cover critical IT risks.
Technologies: SQL, Microsoft Excel, Microsoft Access, Visual Basic for Applications (VBA)

Auditor

2007 - 2007
CA CIB | Back-office
  • Audited for quality and regulatory compliance.
  • Developed an Access database for cross-analysis of risks vs controls.
Technologies: Visual Basic for Applications (VBA), Microsoft Access

Experience

Kaggle Competition Expert

https://www.kaggle.com/guillaumeferry/competitions
Kaggle is a data science competition platform and online community for data scientists and machine learning practitioners, owned by Google LLC.

ACHIEVEMENTS
• Facebook Challenge (IV): Reached the top 7%.
• Wikipedia Web Traffic Time Series Forecasting: Reached the top 8%.

Recommender System APIs for BCG.com

Deployed two fully customized recommender system APIs for BCG.com.
One recommended articles based on a user's history; the other was a "similar article" recommender based purely on content.
Implemented A/B testing and benchmarked against Amazon Personalize.

Education

2013 - 2014

Master of Science Degree in Machine Learning, Big Data, and Statistics

Télécom Paris - Paris, France

2005 - 2006

Participated as an Exchange Student in Industrial Engineering

Concordia University - Montreal, QC, Canada

2002 - 2006

Engineer's Degree in Industrial Engineering

École des Mines de Douai - Douai, France

Certifications

DECEMBER 2013 - PRESENT

Machine Learning

Stanford University via Coursera

Skills

Libraries/APIs

Scikit-learn, TensorFlow, Apache Lucene, Keras, SQLAlchemy, Pandas, NumPy

Tools

Jupyter, Microsoft Access, Microsoft Excel, Microsoft PowerPoint, CPLEX, Apache, Gensim, AWS Glue, Tableau, Amazon SageMaker

Languages

Python, SQL, R, Visual Basic for Applications (VBA), JRuby, Java, Snowflake

Frameworks

Flask, Spark, Ruby on Rails (RoR), Qt, Hadoop

Platforms

Amazon Web Services (AWS), Linux, Ubuntu, Amazon EC2, AWS Lambda, Docker

Paradigms

Azure DevOps, Anomaly Detection

Storage

MySQL, Microsoft SQL Server, PostgreSQL, Elasticsearch, Amazon DynamoDB

Other

Data Science, Multithreading, Word2Vec, Data Build Tool (dbt), Amplitude, Computer Vision, Retrieval-augmented Generation (RAG), Amazon API Gateway, Large Language Models (LLMs), Machine Learning, Natural Language Processing (NLP), Data Engineering
