Christophe Huguet
Verified Expert in Engineering
Data Engineering Developer
Toulouse, France
Toptal member since April 23, 2021
Christophe is an AWS-certified data engineer with extensive experience building enterprise data platforms. He has strong skills in designing and building cloud-native data pipelines, data lakes, and data warehouse solutions. Team spirit, kindness, and curiosity are essential values that drive Christophe.
Preferred Environment
Amazon Web Services (AWS), Spark, SQL, Data Pipelines, ETL, Data Lakes, Terraform, Apache Airflow, Data Build Tool (dbt), Python
The most amazing...
...professional challenge I've participated in is an international ML competition for Continental (Carino), where I ranked first in France.
Work Experience
Senior Data Engineer
Dashlane
- Acted as a key contributor to the design and implementation of a new cloud-native data platform, replacing a legacy data stack.
- Built numerous AWS-based data pipelines to acquire data from various sources, prepare them, and load them into a data warehouse.
- Implemented a monitoring and alerting stack based on CloudWatch, SNS, Lambda, and Slack to ensure the quality of our data pipelines. This stack included data quality checks, pipeline status reporting, and Slack notifications.
- Built data models with dbt on Redshift to extract business value from our data and serve it via Tableau dashboards to management and marketing teams.
- Migrated data from relational databases (SQL Server and MySQL) to a Redshift data warehouse.
- Built complex streaming pipelines receiving over 100 million daily events.
Senior Data Engineer
Continental
- Contributed to designing and developing a new data platform used by all internal teams to share, search, and explore datasets.
- Designed a fleet management and vehicle tracking solution. Cloud-native and event-driven, the solution scales for millions of vehicles.
- Served as an enterprise expert for AWS architectures and data pipeline design. I was in charge of auditing and advising teams.
- Built highly scalable data pipelines based on Kafka, Spark Streaming, Spark, NoSQL, and S3 components.
- Designed and deployed the security stack, based on OAuth2 and mTLS, for authentication and authorization of users and third-party clients.
- Performed complex data modeling for database and data warehouse systems.
Senior Data Engineer
Airbus
- Served as the tech lead of the data ingestion team on Airbus's main data platform, Skywise.
- Built ETL pipelines that ingested and processed several terabytes of data every day.
- Industrialized and automated the build and deployment of new data ingestion pipelines, cutting the development time of new pipelines by a factor of ten.
- Provided best practices and guidance to other teams working on the data platform as a part of the central architecture team.
- Developed an NLP application to detect and filter personal information in the ingested data.
Big Data Architect
Airbus
- Acted as a technical architect on a pilot project for the Airbus Hadoop platform and was in charge of the project's architecture dossier. Airbus awarded the project.
- Designed and participated in the development of a solution for anticipating problems in Airbus manufacturing plants.
- Evaluated the benefits of big data technologies for six Airbus projects. Held presentations and workshops with business owners and technical teams.
- Developed several prototypes with Spark and Hadoop. Created a prototype to detect abnormal flight paths of aircraft.
Technical Lead
Capgemini
- Served as a tech lead on critical Java and Jakarta EE projects for several clients, such as SFR, Orange, ACOSS, Ministry of Defense, and Snecma.
- Discovered many varied technical environments and gained important experience in software development and solution design.
- Adapted to various technical environments and organizations, including two months of working with an Indian offshore team as an expat.
- Provided expertise on major technical crises and worked on critical projects.
Experience
New Data Platform for Dashlane
http://www.dashlane.com
The new data platform is based on AWS serverless services and modern technologies like Airflow, dbt, and Airbyte.
The new platform acquires over 100 million events and 1TB of data daily.
It comes with continuous deployment (Terraform and GitLab CI), real-time monitoring and alerting (CloudWatch and Slack), and low operational costs (pay-per-use pricing of serverless services).
Real-time Data Pipeline to Compute Tolling for Millions of Vehicles
I designed and contributed to the development of the main pipeline in charge of the tolling. It is based on Kafka, Spark Streaming, Spark SQL, Go microservices, PostgreSQL, and S3. We managed to build a fault-tolerant and horizontally scalable pipeline. To achieve a high level of quality, we made a significant effort on CI/CD, including full infrastructure as code, automated unit tests, and continuous integration tests.
Data Acquisition for Airbus Data Platform (Skywise)
https://aircraft.airbus.com/en/services/enhance/skywise
We handled various data sources (ERP, Aircraft, partners, HR, Finance, production plants, etc.) and ingested over 5TB of data daily.
Education
Master's Degree in Computer Engineering
Georgia Institute of Technology, University of Atlanta - Atlanta, GA, USA
Engineer's Degree in Computer Science
CentraleSupélec, Grande École of Engineering - Paris, France
Certifications
Certified Kubernetes Application Developer (CKAD)
The Linux Foundation
AWS Certified Solutions Architect - Professional
Amazon Web Services
AWS Certified Big Data - Specialty
Amazon Web Services
Spark and Hadoop Developer Certification (CCA175)
Cloudera
IAF Certified Architect
Capgemini
Skills
Libraries/APIs
PySpark, Scikit-Learn
Tools
Terraform, AWS, Apache Airflow, AWS Glue, Amazon EMR, MQTT, Amazon Elastic Container Service (ECS), Hadoop, Java, Jenkins
Languages
Scala, SQL, Python, SQL DML, Java, T-SQL
Frameworks
Spark, Hadoop
Paradigms
ETL, Database Design, OLAP, API, Agile Development, Continuous Integration (CI)
Platforms
AWS, Jupyter Notebook, Apache Kafka, Kubernetes, Docker, Java EE, Talend, Oracle Development, Airbyte, AWS Lambda
Storage
Database, Data Lakes, Amazon S3, PostgreSQL, Redshift, JSON, MongoDB, NoSQL, Hadoop, Elasticsearch, SQL Server
Other
Data Engineering, Data Migration, AWS Kinesis, Data Architecture, Big Data Architecture, Database Schema Design, Schemas, Pipelines, Data Build Tool (dbt), Security, Enterprise Architecture, AWS, Data Modeling, Modeling, Integration, Data Science, AWS RDS, Artificial Intelligence, Machine Learning, System Security, Metabase