Christophe Huguet, Developer in Toulouse, France

Christophe Huguet

Verified Expert in Engineering

Data Engineering Developer

Location
Toulouse, France
Toptal Member Since
April 23, 2021

Christophe is an AWS-certified data engineer with extensive experience building enterprise data platforms. He has strong skills in designing and building cloud-native data pipelines, data lakes, and data warehouse solutions. Team spirit, kindness, and curiosity are essential values that drive Christophe.

Portfolio

Dashlane
Apache Airflow, Redshift, Data Build Tool (dbt), AWS Glue, Amazon Athena, Spark...
Continental
Amazon Web Services (AWS), Amazon S3 (AWS S3), Amazon Elastic MapReduce (EMR)...
Airbus
Hadoop, Spark, Apache Hive, Oozie, Apache Sqoop, Scala, Python, Talend...

Experience

Availability

Part-time

Preferred Environment

Amazon Web Services (AWS), Spark, SQL, Data Pipelines, ETL, Data Lakes, Terraform, Apache Airflow, Data Build Tool (dbt), Python

The most amazing...

...professional challenge I've participated in is an international ML competition for Continental (Carino), where I ranked first in France.

Work Experience

Senior Data Engineer

2021 - PRESENT
Dashlane
  • Acted as a key contributor to the design and implementation of a new cloud-native data platform, replacing a legacy data stack.
  • Built numerous AWS-based data pipelines to acquire data from various sources, prepare them, and load them into a data warehouse.
  • Implemented a monitoring and alerting stack based on CloudWatch, SNS, Lambda, and Slack to ensure the quality of our data pipelines. This stack included data quality checks, pipeline status reporting, and Slack notifications.
  • Built data models with dbt on Redshift to extract business value from our data and serve it via Tableau dashboards to management and marketing teams.
  • Migrated data from relational databases (SQL Server and MySQL) to a Redshift data warehouse.
  • Built complex streaming pipelines receiving over 100 million daily events.
Technologies: Apache Airflow, Redshift, Data Build Tool (dbt), AWS Glue, Amazon Athena, Spark, Amazon Kinesis, Terraform, Amazon S3 (AWS S3), Amazon CloudWatch, Amazon Elastic Container Service (Amazon ECS), SQL Server 2016, Python 3, AWS Database Migration Service (DMS), Airbyte, Data Modeling, Data Engineering, Continuous Integration (CI), Databases, Amazon Web Services (AWS), Agile, Data Migration, Data Lakes, Data Architecture, Database Schema Design, Microsoft SQL Server, SQL DML, Python, PySpark, Data Pipelines, Pipelines

Senior Data Engineer

2018 - 2021
Continental
  • Contributed to designing and developing a new data platform used by all internal teams to share, search, and explore datasets.
  • Designed a fleet management and vehicle tracking solution. Cloud-native and event-driven, the solution scales for millions of vehicles.
  • Served as an enterprise expert for AWS architectures and data pipeline design. I was in charge of auditing and advising teams.
  • Built highly scalable data pipelines based on Kafka, Spark Streaming, Spark, No-SQL, and S3 components.
  • Designed and deployed the security stack, based on OAuth2 and mTLS, for authentication and authorization of users and third-party clients.
  • Performed complex data modeling for database and data warehouse systems.
Technologies: Amazon Web Services (AWS), Amazon S3 (AWS S3), Amazon Elastic MapReduce (EMR), Amazon Athena, Spark, Apache Kafka, SQL, PostgreSQL, MongoDB, NoSQL, MQTT, Kubernetes, Docker, Data Engineering, Continuous Integration (CI), Amazon CloudWatch, Amazon Elastic Container Service (Amazon ECS), Databases, Data Pipelines, Pipelines

Senior Data Engineer

2016 - 2018
Airbus
  • Served as the tech lead of the data ingestion team on Airbus's main data platform, Skywise.
  • Built ETL pipelines that ingested and processed several terabytes of data every day.
  • Industrialized and automated the build and deployment of new data ingestion pipelines, cutting the development time of new pipelines by a factor of ten.
  • Provided best practices and guidance to other teams working on the data platform as a part of the central architecture team.
  • Developed an NLP application to detect and filter personal information in the ingested data.
Technologies: Hadoop, Spark, Apache Hive, Oozie, Apache Sqoop, Scala, Python, Talend, Elasticsearch, Machine Learning, Scikit-learn, Amazon Web Services (AWS), Databases, Modeling, Data Modeling, Pipelines

Big Data Architect

2015 - 2016
Airbus
  • Acted as a technical architect on a pilot project for the Airbus Hadoop platform and was in charge of the project's architecture dossier. The project received an award from Airbus.
  • Designed and participated in the development of a solution for the anticipation of problems on Airbus manufacturing plants.
  • Evaluated the benefits of big data technologies for six Airbus projects. Held presentations and workshops with business owners and technical teams.
  • Developed several prototypes with Spark and Hadoop. Created a prototype to detect abnormal flight paths of aircraft.
Technologies: Hadoop, Amazon Web Services (AWS), Spark, Python, API Architecture, Security, Apache Hive

Technical Lead

2007 - 2014
Capgemini
  • Served as a tech lead on critical Java and Jakarta EE projects for several clients, such as SFR, Orange, ACOSS, Ministry of Defense, and Snecma.
  • Discovered many varied technical environments and gained important experience in software development and solution design.
  • Adapted to various technical environments and organizations, including two months of working with an Indian offshore team as an expat.
  • Provided expertise on major technical crises and worked on critical projects.
Technologies: Java, JEE, Java Application Servers, SQL, PostgreSQL, Oracle, PKI, Jenkins

New Data Platform for Dashlane

http://www.dashlane.com
As a senior data engineer on a five-person data engineering team, I designed and built a modern data platform, replacing the legacy one.
The new data platform is based on AWS serverless services and modern technologies like Airflow, dbt, and Airbyte.
The new platform acquires over 100 million events and 1 TB of data daily.
It comes with continuous deployment (Terraform and GitLab CI), real-time monitoring and alerting (CloudWatch and Slack), and low operational costs (pay-per-use pricing of serverless services).
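
As an illustration of the CloudWatch-and-Slack alerting described above, here is a minimal sketch of a Lambda function that forwards a CloudWatch alarm, delivered through SNS, to a Slack incoming webhook. It is not the platform's actual code: the function names, the message wording, and the `SLACK_WEBHOOK_URL` environment variable are assumptions made for the example.

```python
import json
import os
import urllib.request


def build_slack_payload(event: dict) -> dict:
    """Turn an SNS-delivered CloudWatch alarm notification into a Slack message."""
    # SNS wraps the alarm as a JSON string under Records[].Sns.Message.
    alarm = json.loads(event["Records"][0]["Sns"]["Message"])
    state = alarm["NewStateValue"]  # "ALARM", "OK", or "INSUFFICIENT_DATA"
    prefix = "[FAILING]" if state == "ALARM" else "[INFO]"
    text = f"{prefix} {alarm['AlarmName']} is {state}: {alarm['NewStateReason']}"
    return {"text": text}


def handler(event: dict, context: object) -> None:
    """Lambda entry point: post the formatted message to a Slack incoming webhook."""
    # SLACK_WEBHOOK_URL is a hypothetical environment variable name.
    request = urllib.request.Request(
        os.environ["SLACK_WEBHOOK_URL"],
        data=json.dumps(build_slack_payload(event)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        response.read()
```

Keeping the formatting in a pure function separate from the network call makes the message logic testable without AWS or Slack access.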

Real-time Data Pipeline to Compute Tolling for Millions of Vehicles

At Continental, we built a tolling solution in charge of collecting locations from millions of vehicles and computing tolling charges based on geographic pricing rules.

I designed and contributed to the development of the main tolling pipeline, which is based on Kafka, Spark Streaming, Spark SQL, Go microservices, PostgreSQL, and S3. The pipeline is fault-tolerant and horizontally scalable. To achieve a high level of quality, we invested heavily in CI/CD, including full infrastructure as code, automated unit tests, and continuous integration tests.
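
To make the idea of computing tolling charges from geographic pricing rules concrete, here is a small plain-Python sketch. It is only an illustration under simplifying assumptions, not the Continental implementation: the `Zone` bounding boxes, the per-kilometer rates, and the rule of pricing each segment by the zone of its starting point are all hypothetical, and the real pipeline ran on Kafka and Spark rather than in a single process.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    """Hypothetical pricing rule: a bounding box with a per-kilometer rate."""
    min_lat: float
    max_lat: float
    min_lon: float
    max_lon: float
    rate_per_km: float

    def contains(self, lat: float, lon: float) -> bool:
        return (self.min_lat <= lat <= self.max_lat
                and self.min_lon <= lon <= self.max_lon)


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometers between two points given in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def toll_for_trip(points: list, zones: list) -> float:
    """Sum the charge of each segment, priced by the zone of its start point."""
    total = 0.0
    for (lat1, lon1), (lat2, lon2) in zip(points, points[1:]):
        # First matching zone wins; segments outside every zone are free.
        zone = next((z for z in zones if z.contains(lat1, lon1)), None)
        if zone is not None:
            total += haversine_km(lat1, lon1, lat2, lon2) * zone.rate_per_km
    return total
```

For example, `toll_for_trip([(43.60, 1.44), (43.61, 1.45)], [Zone(43.0, 44.0, 1.0, 2.0, 0.10)])` charges the single segment at the hypothetical rate of 0.10 per kilometer.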

Data Acquisition for Airbus Data Platform (Skywise)

https://aircraft.airbus.com/en/services/enhance/skywise
I contributed to the early stages of the construction of Airbus's main data platform. As technical leader of the data acquisition team, I built a solution that enabled our team to configure and deploy new data acquisition pipelines in a single day.

We handled various data sources (ERP, aircraft, partners, HR, finance, production plants, etc.) and ingested over 5 TB of data daily.

Languages

Scala, SQL, Python, Python 3, SQL DML, Java, T-SQL (Transact-SQL)

Frameworks

Spark, Hadoop

Tools

Terraform, Amazon Athena, Apache Airflow, AWS Glue, Amazon CloudWatch, Amazon Elastic MapReduce (EMR), MQTT, Amazon Elastic Container Service (Amazon ECS), Oozie, Java Application Servers, Jenkins, Apache Sqoop

Paradigms

ETL, Database Design, OLAP, API Architecture, Agile, Continuous Integration (CI), Data Science

Platforms

Amazon Web Services (AWS), Jupyter Notebook, Apache Kafka, Kubernetes, Docker, JEE, Talend, Oracle, Airbyte, AWS Lambda

Storage

Data Pipelines, Data Lakes, Amazon S3 (AWS S3), PostgreSQL, Databases, Redshift, JSON, MongoDB, NoSQL, Apache Hive, Elasticsearch, Microsoft SQL Server, SQL Server 2016

Other

Data Engineering, Data Migration, Amazon Kinesis, Data Architecture, Big Data Architecture, Database Schema Design, Schemas, Pipelines, Data Build Tool (dbt), Security, Enterprise Architecture, AWS Database Migration Service (DMS), Data Modeling, Modeling, Integration, Data Analytics, Amazon RDS, Artificial Intelligence (AI), Machine Learning, PKI, Metabase

Libraries/APIs

PySpark, Scikit-learn

Education

2005 - 2007

Master's Degree in Computer Engineering

Georgia Institute of Technology, University of Atlanta - Atlanta, GA, USA

2003 - 2006

Engineer's Degree in Computer Science

CentraleSupélec, Grande École of Engineering - Paris, France

Certifications

FEBRUARY 2021 - PRESENT

Certified Kubernetes Application Developer (CKAD)

The Linux Foundation

DECEMBER 2019 - PRESENT

AWS Certified Solutions Architect - Professional

Amazon Web Services

NOVEMBER 2019 - PRESENT

AWS Certified Big Data - Specialty

Amazon Web Services

AUGUST 2017 - PRESENT

Spark and Hadoop Developer Certification (CCA175)

Cloudera

SEPTEMBER 2014 - PRESENT

IAF Certified Architect

Capgemini
